Resources

Technologies Developed

Datasets Released

miRPipe: A Unified Computational Framework for a Robust, Reliable, and Reproducible Identification of Novel miRNAs from RNA Sequencing Data (2022) new

What is it used for?
In this study, we have designed miRPipe, an end-to-end computational framework for the identification, characterization, and expression estimation of small RNAs, including known and novel miRNAs and previously annotated piRNAs, from small-RNA sequencing profiles. The miRPipe workflow detects unique novel miRNAs by incorporating the sequence information of seed and non-seed regions, together with clustering analysis. We have benchmarked miRPipe against seven state-of-the-art bioinformatics pipelines on three cancer types (CLL, lung cancer, and breast cancer). We have developed a free dockerized version of miRPipe for deployment in clinical setups that is easy to use for both beginner and expert bioinformaticians.
Citations Policy
Please cite the below paper if you use our miRPipe tool in your research work:
Tool Citation:
  1. Vivek Ruhela, Anubha Gupta, K. Sriram, Gaurav Ahuja, Gurvinder Kaur and Ritu Gupta, "A Unified Computational Framework for a Robust, Reliable, and Reproducible Identification of Novel miRNAs From the RNA Sequencing Data," Frontiers in Bioinformatics, Frontiers, vol. 2, no. 842051, pp. 01-20, July 08, 2022, https://doi.org/10.3389/fbinf.2022.842051.
Access to miRPipe Tool
Please visit the following:
  1. https://github.com/vivekruhela/miRPipe.
  2. miRPipe DockerHub Link: https://hub.docker.com/r/vivekruhela/mirpipe
Please download the tool using the Docker Command:
  1. docker pull docker.io/vivekruhela/mirpipe

miRSim: Seed-based Synthetic Small Non-Coding RNA Sequence Simulator (2022) new

What is it used for?
We have designed a synthetic sequence simulator for next-generation sequencing-based RNA-Seq data, named miRSim. The miRSim tool utilizes the sequence information (seed and xseed regions, and the remaining part of the miRNA sequence) and provides functionality for assessing the performance of next-generation sequencing-based RNA-Seq data processing tools. This sequence information is very useful for accurate detection of miRNAs, which includes the identification of known miRNAs, novel miRNAs, and functionally similar miRNAs (paralogues). We have used the synthetic data generated by miRSim to benchmark the miRPipe workflow against seven state-of-the-art bioinformatics pipelines for the identification of miRNAs.
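The idea of treating the seed region separately from the rest of the sequence can be illustrated with a minimal Python sketch. This is not miRSim's actual algorithm or interface: the function name `mutate_outside_seed`, the choice of nucleotides 2 to 8 as the seed (a common convention, assumed here), and the example sequence are all illustrative assumptions.

```python
import random

# Assumption: the seed spans nucleotides 2..8 (1-based), i.e. the slice [1:8].
SEED_START, SEED_END = 1, 8
BASES = "ACGU"


def mutate_outside_seed(mirna, n_subs, rng):
    """Return a copy of `mirna` with `n_subs` substitutions, none inside the seed."""
    positions = [i for i in range(len(mirna)) if not SEED_START <= i < SEED_END]
    if n_subs > len(positions):
        raise ValueError("more substitutions requested than non-seed positions")
    seq = list(mirna)
    # Distinct positions, and each replacement differs from the original base.
    for i in rng.sample(positions, n_subs):
        seq[i] = rng.choice([b for b in BASES if b != seq[i]])
    return "".join(seq)


rng = random.Random(0)                    # fixed seed for reproducibility
reference = "UGAGGUAGUAGGUUGUAUAGUU"      # example 22-nt miRNA-like sequence
variant = mutate_outside_seed(reference, 2, rng)
print(reference)
print(variant)
```

A synthetic read generated this way shares the seed with its reference, so it is the kind of input that could be used to check whether a pipeline groups seed-sharing sequences together.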
Citations Policy
Please cite the below paper if you use our miRSim tool in your research work:
Tool Citation:
  1. Vivek Ruhela, Anubha Gupta, K. Sriram, Gaurav Ahuja, Gurvinder Kaur and Ritu Gupta, "A Unified Computational Framework for a Robust, Reliable, and Reproducible Identification of Novel miRNAs From the RNA Sequencing Data," Frontiers in Bioinformatics, Frontiers, vol. 2, no. 842051, pp. 01-20, July 08, 2022, https://doi.org/10.3389/fbinf.2022.842051.
  2. Vivek Ruhela, Ritu Gupta, Sriram K., Gaurav Ahuja, & Anubha Gupta. (2021). miRSim: Seed-based Synthetic Small Non-coding RNA Sequence Simulator (V1.0.0). Zenodo. https://doi.org/10.5281/zenodo.6546356
Access to miRSim Tool
Please visit the following:
  1. https://github.com/vivekruhela/miRSim.

ECGdetect: Explainable AI Decision Model for ECG Data of Cardiac Disorders (2022) new

What is it used for?
In this study, we have designed a new deep learning based AI model, namely ST-CNN-GAP-5, using the publicly available PTB-XL dataset of ECG signals for the detection of cardiac disorders. We have analyzed the ECG data using SHapley Additive exPlanations (SHAP) on the trained ST-CNN-GAP-5 to assess the explainability or interpretability of the decisions of this deep convolutional network model. The model highlights relevant alterations of the ECG waves as required by clinicians, making it explainable for diagnostic purposes. Deployment of such models can help ease the burden on medical infrastructure in populous low- and middle-income countries.
Citations Policy
Please cite the below paper if you use our ECG Detect Tool:
Data Citation
  1. Atul Anand, Tushar Kadian, Manu Kumar Shetty, and Anubha Gupta, “Explainable AI Decision Model for ECG Data of Cardiac Disorders,” Biomedical Signal Processing and Control, 2022. (2022 IF: 3.880)
Access the ECG Detect Tool
Please visit: https://ecgdetect.sbilab.iiitd.edu.in/.

MERC: Novel Machine Learning Extra Tree Model for Risk Prediction After ST-Elevation Myocardial Infarction (2021)

What is it used for?
In this project, we have designed and developed an extra-tree based AI model (MERC) to predict all-cause 30-day mortality among patients suffering from ST-Segment Elevation Myocardial Infarction (STEMI). MERC utilizes the STEMI data of the North India ST-Elevation Myocardial Infarction (NORIN-STEMI) registry, which included 3,635 patients presenting with STEMI at two tertiary care hospitals in New Delhi, India.
Citations Policy
Please cite the below paper if you use MERC predictor in your research work:
Data Citation
  1. Manu Kumar Shetty, ..., Anubha Gupta, and Mohit Gupta, "Machine learning based model for risk prediction after ST-Elevation myocardial infarction: Insights from the North India ST elevation myocardial infarction (NORIN-STEMI) Registry," International Journal of Cardiology, Elsevier, vol. 362, pp. 6-13, September, 2022. DOI: https://doi.org/10.1016/j.ijcard.2022.05.023.
  2. Arora S, Qamar A, Gupta P, Vaduganathan M, Chauhan I, Tripathi AK, et al. Design and rationale of the North Indian ST-Segment Elevation Myocardial Infarction Registry: A prospective cohort study. Clin Cardiol. 2019; 42:1140-46. DOI: 10.1002/clc.23278.
Access the MERC Predictor
Please visit: http://merc.sbilab.iiitd.edu.in/.

Consensus based Risk Staging System (CRSS) calculator for newly diagnosed multiple myeloma patients (2021)

What is it used for?
CRSS is an efficient and robust AI-enabled risk-staging system for multiple myeloma (MM) patients that utilizes ethnicity-specific cutoffs of key prognostic parameters. Its utility has been validated for newly diagnosed multiple myeloma (NDMM) patients. It is a reliable tool for upfront risk stratification of MM patients and can help clinicians in designing and providing effective therapy to MM patients.
How does it work?
It predicts the risk stage of a patient from the values of seven parameters: age, albumin, β2-microglobulin (β2m), hemoglobin, calcium, eGFR, and high-risk cytogenetic abnormalities [del 17p; t(4;14); t(14;16)]. The risk stage is predicted using the classification rules obtained from a decision tree classifier. A screenshot of the CRSS calculator is shown below.
Citations Policy
Please acknowledge the related publication by including the following citation in your work:
Data Citation
  1. Akanksha Farswan, Anubha Gupta, Sriram K., Atul Sharma, Lalit Kumar, Ritu Gupta, “Does ethnicity matter in multiple myeloma risk prediction in the era of genomics and novel agents? Evidence from real-world data,” accepted in Frontiers in Oncology, October 2021. (2021 IF: 6.244) [Calculator]
Access to the calculator
Please visit: sbilab.iiitd.edu.in/pub_files/CRRScalculator_edit.html.

MRS: An AI-supported risk-staging system for newly diagnosed multiple myeloma patients (2021)

What is it used for?
The Modified Risk Staging (MRS) calculator is a robust AI-enabled risk-staging system for upfront risk stratification of newly diagnosed multiple myeloma (NDMM) patients. It is an efficient and readily employable risk prognostication method that is beneficial in settings where genomic tests cannot be performed owing to geographical or economic constraints.
How does it work?
It predicts the risk stage of a patient from the values of the key prognostic parameters: age, albumin, β2-microglobulin (β2m), hemoglobin, calcium, and estimated glomerular filtration rate (eGFR). The risk stage is predicted using the classification rules obtained from a decision tree classifier. A screenshot of the MRS calculator, along with the decision tree classification rules, is shown below.
Citations Policy
Please acknowledge the related publication by including the following citation in your work:
Data Citation
  1. Akanksha Farswan, Anubha Gupta, Ritu Gupta, Saswati Hazra, Sadaf Khan, Lalit Kumar, and Atul Sharma, “AI-Supported Modified Risk Staging for Multiple Myeloma Cancer useful in real-world scenario,” Translational Oncology, Elsevier, vol. 14, no. 9, article no. 101157, pp. 1-9, September 2021, doi: 10.1016/j.tranon.2021.101157. (2021 IF: 4.243) [PDF] [Calculator]
Access to the calculator
Please visit: sbilab.iiitd.edu.in/pub_files/MRScalculator_DecisionTree.html.

SegPC-2021: Segmentation of Multiple Myeloma Plasma Cells in Microscopic Images Dataset (2021)

Data Description
The data were collected from subjects suffering from Multiple Myeloma (MM) who came to AIIMS, New Delhi, India, with symptoms of cancer for diagnosis and/or who are under treatment there. Microscopic images were captured from bone marrow aspirate slides of patients diagnosed with MM. MM is a type of white blood cancer in which the plasma cells of blood are involved. Slides were stained using Jenner-Giemsa stain, and the plasma cells are the objects to be segmented. Images were captured in raw BMP format using two cameras with:
  1. size of 2040x1536 pixels using cellSens software Version 2.1 (Olympus) attached to the microscope
  2. size of 1920x2560 pixels from a Nikon camera attached to the microscope.
A total of 775 images were stain color normalized using our in-house methodology. These are divided into 1) a training set of 298 images, 2) a validation set of 200 images, and 3) a test set of 277 images. The dataset was used in the IEEE ISBI 2021 medical image challenge. The leaderboard of the challenge is active. The ground truth (GT) of the training and validation sets is provided, while the GT of the test set will not be shared. Researchers can check their performance on the test set by uploading results to the leaderboard at https://segpc-2021.grand-challenge.org/evaluation/final-test-phase/leaderboard/.
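As a quick sanity check on the split sizes quoted above, the following Python sketch sums the three subsets and reports the share of each. The helper name `split_fractions` is an illustrative assumption and is not part of any released dataset tooling.

```python
def split_fractions(splits):
    """Return the total image count and each split's share of that total."""
    total = sum(splits.values())
    return total, {name: count / total for name, count in splits.items()}


# Split sizes as stated in the dataset description.
segpc_splits = {"training": 298, "validation": 200, "test": 277}

total, fractions = split_fractions(segpc_splits)
print(f"total images: {total}")
for name, share in fractions.items():
    print(f"{name:<10} {segpc_splits[name]:>4} images  {share:.1%}")
```

The three subsets add up to the stated 775 images, with the training set being the largest share.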
Access Data Here
The dataset is available at the IEEE Dataport repository. Please visit https://ieee-dataport.org/open-access/segpc-2021-segmentation-multiple-myeloma-plasma-cells-microscopic-images to download the dataset.
Citations & Data Usage Policy
Please acknowledge both this dataset and related publications by including the following citations in your work:
Data Citation
  1. Anubha Gupta, Ritu Gupta, Shiv Gehlot, and Shubham Goswami, "SegPC-2021: Segmentation of Multiple Myeloma Plasma Cells in Microscopic Images", IEEE Dataport, DOI: https://dx.doi.org/10.21227/7np1-2q42.
Publication Citation
  1. Anubha Gupta, Rahul Duggal, Shiv Gehlot, Ritu Gupta, Anvit Mangal, Lalit Kumar, Nisarg Thakkar, and Devprakash Satpathy, "GCTI-SN: Geometry-Inspired Chemical and Tissue Invariant Stain Normalization of Microscopic Medical Images," Medical Image Analysis, vol. 65, Oct 2020. DOI: https://doi.org/10.1016/j.media.2020.101788. (2020 IF: 11.148)
  2. Shiv Gehlot, Anubha Gupta and Ritu Gupta, "EDNFC-Net: Convolutional Neural Network with Nested Feature Concatenation for Nuclei-Instance Segmentation," ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , Barcelona, Spain, 2020, pp. 1389-1393.
  3. Anubha Gupta, Pramit Mallick, Ojaswa Sharma, Ritu Gupta, and Rahul Duggal, "PCSeg: Color model driven probabilistic multiphase level set based tool for plasma cell segmentation in multiple myeloma," PLoS ONE 13(12): e0207908, Dec 2018. DOI: 10.1371/journal.pone.0207908.

IDD and TDC EEG Dataset (2020)

Description
  1. Dataset: The dataset contains EEG recordings of 7 participants with Intellectual Developmental Disorder (IDD) and 7 Typically Developing Controls (TDC) under resting state and under music stimuli. The EEG data were acquired using the EMOTIV EPOC+ device, a 14-channel dry electrode device. The IDD participants were recruited from Nai Disha School, Tamana NGO, New Delhi, India, and were identified under the IDD category as per the standard ICD-11 guidelines. Additional information on the IDD group, such as age, clinical scores (IQ and SQ), and behavioural data (music perception and facial expression), was also collected. A detailed description of the dataset is given in the Elsevier Data in Brief article cited below.
  2. Pre-processing pipeline: The dataset comes with a robust and fully automated pipeline for EEG data pre-processing. The pipeline can be easily used for pre-processing raw EEG data in different file formats (for example, .csv, .mat, or .edf). It can automatically identify the significant artifacts based on the spatial and temporal features extracted from the raw EEG data. The pipeline is built for the Emotiv EPOC+ device; however, it can be used for other devices as well, provided a channel location file in a standard format (i.e., .ced) is supplied. A detailed description of the pre-processing pipeline and guidelines for its use are given in the Elsevier Data in Brief article and the Mendeley repository cited below.
Access Data Here
The dataset and pipeline are publicly available at the Mendeley data repository. Please visit https://data.mendeley.com/datasets/fshy54ypyh/2 to download the dataset.
Citations & Data Usage Policy
Please acknowledge both this dataset and related publications by including the following citations in your work:
Data Citation
  1. Ekansh Sareen, Lakshya Singh, Anubha Gupta, Blessin Varkey, and Krishnaveni Achary, “EEG dataset of individuals with intellectual and developmental disorder and healthy controls while observing rest and musical stimulus [dataset],” Mendeley Data, vol. v1, 2020, DOI: http://dx.doi.org/10.17632/fshy54ypyh.2
Publication Citation
  1. E. Sareen, L. Singh, A. Gupta, R. Verma, G. Krishnaveni Achary and B. Varkey, "Functional Brain Connectivity Analysis in Intellectual Developmental Disorder during Music Perception," in IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2020; DOI: https://doi.org/10.1109/TNSRE.2020.3024937.
  2. Ekansh Sareen, Lakshya Singh, Blessin Varkey, Krishnaveni Achary, Anubha Gupta, "EEG dataset of individuals with intellectual and developmental disorder and healthy controls under rest and music stimuli," Data in Brief, Elsevier, Vol. 30, 2020, 105488, ISSN 2352-3409, DOI: https://doi.org/10.1016/j.dib.2020.105488.
  3. Ekansh Sareen, Anubha Gupta, Rohit Verma, Krishnaveni Achary, Blessin Varkey (2019). Studying functional brain networks from dry electrode EEG set during music and resting states in neurodevelopment disorder, bioRxiv 759738 [Preprint], https://www.biorxiv.org/content/10.1101/759738v1

C-NMC 2019 Dataset (ISBI 2019 Challenge Data)

Data Description
The dataset contains images of "normal cells" and "malignant/cancer cells". This dataset can be used to train classification models for Acute Lymphoblastic Leukemia (ALL) diagnosis. Detailed description and usage guidelines of data are available at TCIA link below:

This dataset was used for the ISBI-2019 challenge organized by SBILab. Test set predictions for this dataset can be evaluated at the challenge leaderboard.

The top entries of this challenge are:

Top Entries of C-NMC 2019 Challenge

Name               Rank   Weighted F1 Score
Yongsheng Pan      1      0.910
Ekansh Verma       2      0.894
Jonas Prellberg    3      0.889
Fenrui Xiao        4      0.885
Tian Shi           5      0.879
Ying Liu           6      0.876
Salman Shah        8      0.866
Yifan Ding         9      0.855
Xinpeng Xie        10     0.848
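
The ranking metric in the table above is the weighted F1 score. The following Python sketch shows one way to compute it, assuming the standard support-weighted definition (the per-class F1 scores averaged with weights proportional to the number of true samples in each class). The challenge's own evaluation script is not reproduced here, and the labels below are toy values, not challenge data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    support = Counter(y_true)          # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += (n_true / total) * f1  # weight each class by its support
    return score


# Toy labels (illustrative only): 1 = malignant cell, 0 = normal cell.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
print(round(weighted_f1(y_true, y_pred), 4))  # prints 0.6667
```

Weighting by class support keeps the score from being dominated by the minority class when the two cell types are not equally represented.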

Access Data Here
The dataset is publicly available at The Cancer Imaging Archive (TCIA). Please visit https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=52758223 to download the dataset.
Citations & Data Usage Policy
Please acknowledge both this data set and TCIA in publications by including the following citations in your work:
Data Citation
  1. Anubha Gupta and Ritu Gupta (2019). C-NMC 2019 Dataset: ALL Challenge dataset of ISBI 2019 [Dataset]. The Cancer Imaging Archive. https://doi.org/10.7937/tcia.2019.dc64i46r
Publication Citation
  1. Anubha Gupta, Rahul Duggal, Shiv Gehlot, Ritu Gupta, Anvit Mangal, Lalit Kumar, Nisarg Thakkar, and Devprakash Satpathy, "GCTI-SN: Geometry-Inspired Chemical and Tissue Invariant Stain Normalization of Microscopic Medical Images," Medical Image Analysis, vol. 65, Oct 2020. DOI: https://doi.org/10.1016/j.media.2020.101788.
  2. Shiv Gehlot, Anubha Gupta, and Ritu Gupta, “SDCT-AuxNetθ: DCT Augmented Stain Deconvolutional CNN with Auxiliary Classifier for Cancer Diagnosis,” Medical Image Analysis, Elsevier, vol. 61, pp. 1-15, April 2020, DOI: https://doi.org/10.1016/j.media.2020.101661.
  3. Shubham Goswami, Suril Mehta, Dhruv Sahrawat, Anubha Gupta and Ritu Gupta, “Heterogeneity Loss to Handle Intersubject and Intrasubject Variability in Cancer", ICLR workshop on Affordable AI in healthcare, 2020. arXiv preprint arXiv:2003.03295.
  4. Rahul Duggal, Anubha Gupta, Ritu Gupta, and Pramit Mallick, "SD-Layer: Stain Deconvolutional Layer for CNNs in Medical Microscopic Imaging," In: Descoteaux M., Maier-Hein L., Franz A., Jannin P., Collins D., Duchesne S. (eds) Medical Image Computing and Computer-Assisted Intervention − MICCAI 2017, MICCAI 2017. Lecture Notes in Computer Science, Part III, LNCS 10435, pp. 435–443. Springer, Cham. DOI: https://doi.org/10.1007/978-3-319-66179-7_50.
  5. Rahul Duggal, Anubha Gupta, Ritu Gupta, Manya Wadhwa, and Chirag Ahuja, “Overlapping Cell Nuclei Segmentation in Microscopic Images Using Deep Belief Networks,” Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), India, December 2016.
TCIA Citation
  1. Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository, Journal of Digital Imaging, Volume 26, Number 6, December, 2013, pp 1045-1057. DOI: https://doi.org/10.1007/s10278-013-9622-7.

MiMM_SBILab Dataset: Microscopic Images of Multiple Myeloma Cancer

Data Description
The dataset contains Microscopic images of Multiple Myeloma. The images were captured from bone marrow aspirate slides of patients diagnosed with multiple myeloma as per the standard guidelines. Detailed description and usage guidelines of data are available at TCIA link below:
Access Data Here
The dataset is publicly available at The Cancer Imaging Archive (TCIA). Please visit https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=52756988 to download the dataset.
Citations & Data Usage Policy
Please acknowledge both this data set and TCIA in publications by including the following citations in your work:
Data Citation
  1. Ritu Gupta and Anubha Gupta (2019). MiMM_SBILab Dataset: Microscopic Images of Multiple Myeloma [Dataset]. The Cancer Imaging Archive. https://doi.org/10.7937/tcia.2019.pnn6aypl
Publication Citation
  1. Anubha Gupta, Rahul Duggal, Shiv Gehlot, Ritu Gupta, Anvit Mangal, Lalit Kumar, Nisarg Thakkar, and Devprakash Satpathy, "GCTI-SN: Geometry-Inspired Chemical and Tissue Invariant Stain Normalization of Microscopic Medical Images," Medical Image Analysis, vol. 65, Oct 2020. DOI: https://doi.org/10.1016/j.media.2020.101788.
  2. Anubha Gupta, Pramit Mallick, Ojaswa Sharma, Ritu Gupta, and Rahul Duggal, "PCSeg: Color model driven probabilistic multiphase level set based tool for plasma cell segmentation in multiple myeloma," PLoS ONE 13(12): e0207908, Dec 2018, DOI: https://doi.org/10.1371/journal.pone.0207908.
  3. Ritu Gupta, Pramit Mallick, Rahul Duggal, Anubha Gupta, and Ojaswa Sharma, "Stain Color Normalization and Segmentation of Plasma Cells in Microscopic Images as a Prelude to Development of Computer Assisted Automated Disease Diagnostic Tool in Multiple Myeloma," 16th International Myeloma Workshop (IMW), India, March 2017
TCIA Citation
  1. Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository, Journal of Digital Imaging, Volume 26, Number 6, December, 2013, pp 1045-1057. DOI: https://doi.org/10.1007/s10278-013-9622-7.

SN-AM Dataset: White Blood cancer dataset of B-ALL and MM for stain normalization

Data Description
This is a White Blood cancer dataset of B-lineage Acute Lymphoid Leukemia (B-ALL) and Multiple Myeloma (MM) for stain normalization. Microscopic images were captured from bone marrow aspirate slides of patients diagnosed with B-ALL and MM as per the standard guidelines. Detailed description and usage guidelines of data are available at TCIA link below:
Access Data Here
The dataset is publicly available at The Cancer Imaging Archive (TCIA). Please visit https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=52757009 to download the dataset.
Citations & Data Usage Policy
Please acknowledge both this data set and TCIA in publications by including the following citations in your work:
Data Citation
  1. Anubha Gupta and Ritu Gupta (2019). SN-AM Dataset: White Blood Cancer Dataset of B-ALL and MM for Stain Normalization [Dataset]. The Cancer Imaging Archive. https://doi.org/10.7937/tcia.2019.of2w8lxr
Publication Citation
  1. Anubha Gupta, Rahul Duggal, Shiv Gehlot, Ritu Gupta, Anvit Mangal, Lalit Kumar, Nisarg Thakkar, and Devprakash Satpathy, "GCTI-SN: Geometry-Inspired Chemical and Tissue Invariant Stain Normalization of Microscopic Medical Images," Medical Image Analysis, vol. 65, Oct 2020. DOI: https://doi.org/10.1016/j.media.2020.101788.
  2. Ritu Gupta, Pramit Mallick, Rahul Duggal, Anubha Gupta, and Ojaswa Sharma, "Stain Color Normalization and Segmentation of Plasma Cells in Microscopic Images as a Prelude to Development of Computer Assisted Automated Disease Diagnostic Tool in Multiple Myeloma," 16th International Myeloma Workshop (IMW), India, March 2017.
  3. Rahul Duggal, Anubha Gupta, Ritu Gupta, and Pramit Mallick, "SD-Layer: Stain Deconvolutional Layer for CNNs in Medical Microscopic Imaging," In: Descoteaux M., Maier-Hein L., Franz A., Jannin P., Collins D., Duchesne S. (eds) Medical Image Computing and Computer-Assisted Intervention − MICCAI 2017, MICCAI 2017. Lecture Notes in Computer Science, Part III, LNCS 10435, pp. 435–443. Springer, Cham. DOI: https://doi.org/10.1007/978-3-319-66179-7_50.
  4. Rahul Duggal, Anubha Gupta, Ritu Gupta, Manya Wadhwa, and Chirag Ahuja, “Overlapping Cell Nuclei Segmentation in Microscopic Images Using Deep Belief Networks,” Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), India, December 2016.
  5. Rahul Duggal, Anubha Gupta, and Ritu Gupta, “Segmentation of overlapping/touching white blood cell nuclei using artificial neural networks,” CME Series on Hemato-Oncopathology, All India Institute of Medical Sciences (AIIMS), New Delhi, India, July 2016.
TCIA Citation
  1. Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository, Journal of Digital Imaging, Volume 26, Number 6, December, 2013, pp 1045-1057. DOI: 10.1007/s10278-013-9622-7