***********************************************
***OncoMX readme DATA:1.0.25 ***
***Production release ***
***Release date: April 22, 2020 ***
***********************************************

OVERVIEW

OncoMX is a knowledgebase of unified cancer biomarker evidence data from integrated genomics studies, including mutation, expression, literature, and biomarker datasets, searchable and accessible through a web portal.

FOR USE PLEASE CITE:

Hayley M. Dingerdissen, Frederic Bastian, K. Vijay-Shanker, Marc Robinson-Rechavi, Amanda Bell, Nikhita Gogate, Samir Gupta, Evan Holmes, Robel Kahsay, Jonathon Keeney, Heather Kincaid, Charles Hadley King, David Liu, Daniel J. Crichton, and Raja Mazumder. OncoMX: A Knowledgebase for Exploring Cancer Biomarkers in the Context of Related Cancer and Healthy Data. JCO Clin Cancer Inform. 2020;4:210-220. PMID: 32142370. https://ascopubs.org/doi/full/10.1200/CCI.19.00117

DATA SOURCES AND FLOW

The core underlying knowledgebase of OncoMX is derived from the BioMuta and BioXpress integrated cancer mutation and expression databases. Healthy expression data from Bgee and output from the custom text mining software DiMeX and DEXTER augment the cancer data to improve functional interpretation of the reported variants and expression profiles. Biomarker data are retrieved and compiled from EDRN and FDA. Where relevant, data are mapped to Disease Ontology and Uberon Anatomical Entity ontology terms to facilitate better integration. All data are wrapped into the OncoMX database and web portal, mapped to additional functional information from Reactome, and linked to other resources containing relevant evidence for reported and potential biomarkers. Cancer mutation and expression data are taken from: CIViC, ClinVar, COSMIC, ICGC, IntOGen, and TCGA.

For more information regarding pipelines of contributing resources, please refer to the following links and the pre-processing details described below:

- Bgee https://bgee.org/
- BioMuta https://hive.biochemistry.gwu.edu/biomuta; https://academic.oup.com/nar/article/46/D1/D1128/4372542
- BioXpress https://hive.biochemistry.gwu.edu/bioxpress; https://academic.oup.com/nar/article/46/D1/D1128/4372542
- DEXTER https://academic.oup.com/database/article/doi/10.1093/database/bay045/5025486
- DiMeX http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0152725
- EDRN https://edrn.nci.nih.gov/
- Reactome https://reactome.org/

EXTERNAL PRE-PROCESSING OF SOURCE DATA

All datasets come from third-party collaborators or other data providers and are therefore partly processed prior to integration in OncoMX. Brief pre-processing summaries for these source data can be found below or in the linked source documentation.

- BioMuta
  - BioMuta contains nonsynonymous single-nucleotide variations (nsSNVs) found in cancer, integrated and unified from multiple resources and annotated with functional information and literature evidence. Data were processed as follows:
    - Data were retrieved from CIViC, ClinVar, COSMIC, ICGC, and TCGA and filtered for various criteria ensuring the viability of the reported genomic content and positional information.
    - All cancer terms were mapped to the set of Cancer Disease Ontology (CDO) slim terms.
    - A unified mutation list was created in the form of a .csv containing the remaining entries that successfully passed filters across all resources.
    - Sanity checks were performed for each CDS sequence extracted from the Ensembl release 75 protein-coding transcript and translated peptides.
    - Genomic positions of nsSNVs were mapped to a position within the CDS, and amino acid changes in the peptide sequence were mapped based on corresponding codon positions and changes.
    - If the Ensembl peptide and either the UniProtKB or RefSeq isoform sequences for a given reported nsSNV were not identical, pairwise global alignment was performed to identify the correct amino acid position in the canonical UniProtKB AC and RefSeq protein sequences, respectively.
    - Site-specific annotations were retrieved from UniProtKB, and the PolyPhen-2 (prediction of functional effects of human nsSNPs) and NetNGlyc functional prediction algorithms were run on the set of altered amino acid sequences surrounding each mutation to generate predicted effects of the nsSNV with respect to protein structure and function and post-translational protein glycosylation.

  Resulting data were loaded into a MySQL relational database and dockerized (available at https://cloud.docker.com under the mazumderlab/biomuta repository), and made searchable through the BioMuta interface. The bulk dump of these results was downloaded and forms the basis of the cancer mutation component of the OncoMX knowledgebase.

  Website URL: https://hive.biochemistry.gwu.edu/biomuta

  Primary citation(s):
  1. Dingerdissen HM, Torcivia-Rodriguez J, Hu Y, Chang TC, Mazumder R, Kahsay R. BioMuta and BioXpress: mutation and expression knowledgebases for cancer biomarker discovery. Nucleic Acids Research, gkx907. 2017 Oct 09. PMCID: PMC5753215
  2. Wu TJ, Shamsaddini A, Pan Y, Smith K, Crichton DJ, Simonyan V and Mazumder R. A framework for organizing cancer-related variations from existing databases, publications and NGS data using a High-performance Integrated Virtual Environment (HIVE). Database (Oxford):bau022. 2014 Mar 25. PMID: 24667251

  Relevant web documentation: https://hive.biochemistry.gwu.edu/biomuta/readme

- BioXpress
  - BioXpress is a knowledgebase containing gene and miRNA expression levels and changes reported from paired tumor and adjacent normal tissues from cancer samples.
    - Gene and miRNA raw read counts were quantified with reference to GRCh37, retrieved for all available TCGA studies, and filtered for those studies with at least 10 patient samples.
    - All cancer terms were mapped to the set of CDO slim terms.
    - DESeq2 differential expression analysis (including the normalization step built into the R package) was run on each study to report a pooled log2FC value for each gene/miRNA and a corresponding p-value to show the significance of the reported expression change at the study level.
    - Adjusted p-values were used to determine significance of reported trends.
    - Results were summarized with respect to (1) all cancer types with a reported differential expression finding for a given gene and (2) all genes with a reported differential expression finding for a given cancer (for both significant and insignificant findings, as long as DESeq2 reported a log2FC for a given gene).

  Resulting data were loaded into a MySQL relational database and made available for search and exploration through the BioXpress interface. The bulk dump of these results was downloaded and forms the basis of the cancer differential expression component of the OncoMX knowledgebase.

  Website URL: https://hive.biochemistry.gwu.edu/bioxpress

  Primary citation(s):
  1. Dingerdissen HM, Torcivia-Rodriguez J, Hu Y, Chang TC, Mazumder R, Kahsay R. BioMuta and BioXpress: mutation and expression knowledgebases for cancer biomarker discovery. Nucleic Acids Research, gkx907. 2017 Oct 09. PMCID: PMC5753215.
  2. Wan Q, Dingerdissen H, Fan Y, Gulzar N, Pan Y, Wu T-J, Yang C, Zhang H, and Mazumder R. BioXpress: An integrated RNA-seq derived gene expression database for pan-cancer analysis. Database (Oxford). 2015 Mar 28. pii: bav019. PMID: 25819073.
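
  As an illustration of the significance call used for the BioXpress results, the following minimal Python sketch adjusts a handful of made-up p-values for multiple testing and flags those with an adjusted p-value below 0.05. It assumes the Benjamini-Hochberg procedure (the default correction in DESeq2) and the 0.05 cutoff given for the Significant column in the Differential expression table. The function name, gene names, and values are hypothetical, and this is not the BioXpress pipeline code.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical (gene, log2FC, p-value) results for one study.
results = [
    ("GENE_A", 2.1, 0.001),
    ("GENE_B", -1.4, 0.02),
    ("GENE_C", 0.3, 0.04),
    ("GENE_D", -0.1, 0.60),
]

adj = bh_adjust([p for _, _, p in results])
flags = [q < 0.05 for q in adj]

for (gene, log2fc, p), q, sig in zip(results, adj, flags):
    trend = "up" if log2fc > 0 else "down"
    print(f"{gene}\t{trend}\tp={p:.3g}\tadj_p={q:.3g}\tsignificant={sig}")
```

  In this toy input, GENE_C has a raw p-value below 0.05 but is no longer flagged once the adjustment is applied, which is why the adjusted value rather than the raw one drives the Significant column.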

  Relevant web documentation: https://hive.biochemistry.gwu.edu/bioxpress/readme

- Bgee
  - Bgee integrates and analyzes gene expression patterns from a variety of experimentally generated data types, including microarray, in situ hybridization, EST, and RNA-Seq. Only the RNA-Seq subset is used to generate the healthy expression profiles for human and mouse used by OncoMX. Details on initial data integration and processing of expression profiles can be found at the links for the relevant web documentation listed below. From these data, a custom format of healthy data was generated for human and mouse, containing the following information: Ensembl gene ID and UniProtKB accessions, Uberon anatomical entity IDs and names, Uberon developmental stage IDs and names, qualitative (high, medium, low, absent) reported expression levels for a queried gene with respect to all genes in a given tissue, similarly qualitative reported expression levels for a queried gene with respect to that same gene's expression across all tissues, the quality associated with the call, and a quantitative expression score based on ranks.

  Website URL: https://bgee.org/

  Primary citation(s): Bastian, F. et al. Bgee: Integrating and Comparing Heterogeneous Transcriptome Data Among Species. Data Integration in the Life Sciences. 2008. (Springer Berlin Heidelberg) 124-131.

  Relevant web documentation: https://bgee.org/?page=doc&action=data_sets, https://bgee.org/?page=doc&action=call_files#single_expr

- EDRN
  - EDRN compiles biomarker data based on literature reports and ongoing studies from within EDRN and other sources, curated by subject matter experts. Data were retrieved from the EDRN open web portal (see URL below).

  Website URL: https://edrn.nci.nih.gov/

  Primary citation(s): Crichton DJ, Mattmann CA, Thornquist M, Anton K, Hughes JS. Bioinformatics: biomarkers of early detection. Cancer Biomark. 2010;9(1-6):511-30. doi: 10.3233/CBM-2011-0180.
  PMID: 22112493

  Relevant web documentation: https://edrn.nci.nih.gov/about-edrn/manual-of-operations-version-5.0

- FDA Approved Biomarkers
  - This dataset lists FDA-approved cancer biomarker tests mined from web resources.
    - FDA-approved cancer biomarker tests were unified through the use of ontologies and controlled vocabularies, and transformed into a simple-to-use format.
    - Each row in the data table was generated to represent one gene linked to its respective test.
    - Genes were labeled by relevant identifiers/accessions from UniProtKB, HGNC, and EDRN. Tests were distinguished by manufacturer, FDA submission ID(s), clinical trial ID(s), and PubMed ID(s). FDA-approved tests were downloaded as a list of FDA-approved or cleared nucleic acid based tests from https://www.fda.gov/medical-devices/vitro-diagnostics/nucleic-acid-based-tests.
    - Fields were created to describe test information and facilitate mapping to other relevant annotations. The full set of identified headers included: uniprotkb_ac, test_disease_use, test_trade_name, test_manufacturer, test_submission, test_is_panel, gene_symbol, biomarker_id, biomarker_origin, ncit_biomarker, do_name, doid, histological_type, approved_indication, actual_use, specimen_type, method, test_number_genes, test_adoption_evidence, test_clin_trial_id, test_reg_approval_status, pmid, test_study_design, clinical_significance, drug, and biomarker_description.
    - Data retrieved from the FDA website were used to populate the corresponding fields, and additional accessions, terms, and values were mapped from external resources and ontologies (including UniProtKB [https://www.uniprot.org/], FDA terms [https://www.fda.gov/medical-devices/device-advice-comprehensive-regulatory-assistance/overview-device-regulation], NCIt [https://ncit.nci.nih.gov/ncitbrowser/], SEER [https://training.seer.cancer.gov/disease/categories/tissues.html], HGNC [https://www.genenames.org], and Disease Ontology [http://www.disease-ontology.org/]).
    - EDRN biomarker IDs were mapped as a final xref.
    - Quality assurance and quality control measures were applied, including verification of correct tabular parsing, validation of field name formats based on group-implemented ontology standards, checksum integrity of the resulting file, and manual spot-checking for grammar or encoding issues.

  Website URL: https://data.oncomx.org/ONCOMXDS000003 (Please note: this dataset was generated within the scope of the project; the OncoMX data portal is the primary repository for this dataset.)

  Relevant web documentation: https://www.oncomx.org/static/docs/oncomx_readme.txt

- Cell expression specificity
  - This dataset reports the specificity of gene expression level with respect to cell type, as analyzed from cancer data.
    - RangedSummarizedExperiment R objects that contained RNA-seq gene-level count tables and sample phenotype data were retrieved from recount2 in August 2019 for the following SRA study accessions: SRP059035, SRP030617, and SRP042161. Colon cancer data consist of cell transcriptomes from one cancer cell line, HCT116. Lung cancer data consist of cell transcriptomes from the patient-derived xenograft (PDX) cell types LC-PT-45, LC-PT-45-Re, and LC-MBT-15, in addition to the cell line H358. Brain cancer data consist of cell transcriptomes from the primary tumor cell types MGH-26, MGH-28, MGH-29, MGH-30, and MGH-31.
    - Retrieved datasets were filtered for low-quality cells and low-abundance genes, and cell-specific biases between samples within each dataset were normalized with a deconvolution approach.
    - Dimensionality reduction and visualization were performed with principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP) to verify cell type annotations from the metadata of retrieved datasets.
    - Differential expression was estimated using the preferential expression measure (PEM) and using DESeq2 from the Single Cell Toolkit (SCTK).
    - Analysis of the gene-level count matrices with PEM provided a score for each gene in each cell type, which was subsequently assigned a qualitative annotation for expression specificity, and DESeq2 analyzed between-cell-type differences. The colon cancer dataset featured only one cell type and was consequently excluded from DESeq2 analysis.

  Website URL: https://data.oncomx.org/ONCOMXDS000015

  Relevant web documentation: https://www.oncomx.org/static/docs/oncomx_readme.txt

HOW TO USE

OncoMX can be used to search by data source/type or can be explored through user perspectives, accessed from the dashboard, as described below.

***********************************************
***********************************************
Explore datasets

From the landing page, click on one of the linked dataset buttons: Differential expression, Biomarkers, Pathways, Disease mutation, Normal expression, Expression literature mining, or Mutation literature mining. (Please note: clicking Data sources will redirect the user to the original tables retrieved from each of the contributing resources, whereas clicking Browse datasets will redirect the user to the data portal where processed datasets and provenance details can be explored.)

All OncoMX tables can be searched by a text string using the search box in the upper right corner. Searching for a string in this way will filter the rendered rows to display only those containing the queried string. Visible data (the first 20 rows, by default) can be copied to your clipboard, downloaded as a .csv, or printed using the buttons immediately above the table in the left corner. Hits can be navigated using the pagination tools below the table to the right.

***********************************************
Differential expression

This table is indexed by gene/protein/mRNA or miRNA accession/name.
In addition to text searching, you can use the filters on the left hand side of the table to filter rows by TCGA cancer types or by significance of differential expression. To access the gene detail view for a specific entry, click the hyperlinked gene symbol in the first column. Cross-references to numerous other resources can also be found in the table, including NCBI gene search, BioXpress entry, disease ontology, TCGA study, and Uberon ontology.

Columns for this table are as follows:

COLUMN HEADER             DESCRIPTION
Gene/miRNA                Official gene symbol approved by the HGNC, which is a short abbreviated form of the gene name, hyperlinked to the OncoMX detail view for this entry; if miRNA, this column will report the accession in the miRBase database
UniProtKB/SwissProt AC    Hyperlinked UniProtKB accession (accession assigned to the protein isoform chosen to be the canonical sequence in UniProtKB database) linking to relevant entry in BioXpress
P-value                   P-value associated with reported differential expression
Adj. P-value              P-value associated with reported differential expression adjusted for multiple testing
Cancer Type               Mapped Cancer Disease Ontology slim term and link to relevant disease ontology page
Log2 F.C.                 Log2 fold change of expression of the gene/miRNA in tumor as compared to adjacent normal
Patient Freq.             Proportion of patients whose individual observed trend of expression matches the cancer-wise trend
RefSeq AC                 RefSeq accession associated with the transcript
Significant               Reports whether the observed change in expression was determined to be significant (adjpval < 0.05)
TCGA Cancer               TCGA cancer acronym linked to the corresponding TCGA study page in NIH-NCI GDC Data Portal
Anatomical Entity ID      Uberon anatomical entity ID corresponding to the diseased tissue

***********************************************
Biomarkers

This table is indexed by gene symbol or panel name.
Columns for this table are as follows:

COLUMN HEADER             DESCRIPTION
Gene Symbol/Panel         Official gene symbol or panel name, hyperlinked to the OncoMX detail view for this entry
Type                      Type of biomarker (can be Epigenetic, Gene, Genomic, Protein, or Proteomic)
Associated Dataset        Dataset(s) associated with the biomarker as reported in EDRN database
Is Panel                  Denotes if entry is a biomarker or a panel
Phase                     Please see EDRN documentation
QA State                  Denotes whether biomarker has been curated, accepted, or is under review
Organ                     Organ in which biomarker is applicable (current options include Colon, Lung, Breast, and Ovary)
HGNC Symbol               Official gene symbol approved by the HGNC, which is a short abbreviated form of the gene name
Reference Resource        Hyperlinked references reporting biomarker activity
UniProtKB/SwissProt AC    Accession assigned to the protein isoform chosen to be the canonical sequence in UniProtKB database (not applicable to panels)

***********************************************
Pathways

This table is indexed by UniProtKB/SwissProt AC and reports events associated with a given protein, its evidence, and the pathway to which the event belongs.

Columns for this table are as follows:

COLUMN HEADER             DESCRIPTION
UniProtKB/SwissProt AC    Accession assigned to the protein isoform chosen to be the canonical sequence in UniProtKB database, hyperlinked to UniProtKB
Gene Symbol               Official gene symbol approved by the HGNC, which is a short abbreviated form of the gene name, hyperlinked to the OncoMX detail view for this entry
Event                     Name of pathway event
Evidence Code             Evidence code for gene/protein participation in the event (can be IEA or TAS)
Reactome Pathway ID       Hyperlinked Reactome IDs for the corresponding pathway

***********************************************
Disease mutation

This table reports variants in cancer samples in genomic and proteomic coordinates.
Columns for this table are as follows:

COLUMN HEADER             DESCRIPTION
Gene Symbol               Official gene symbol approved by the HGNC, which is a short abbreviated form of the gene name, hyperlinked to the OncoMX detail view for this entry
UniProtKB/SwissProt AC    Hyperlinked UniProtKB accession (accession assigned to the protein isoform chosen to be the canonical sequence in UniProtKB database) linking to relevant entry in BioMuta
RefSeq AC                 RefSeq accession associated with the canonical transcript
Cancer Type               Type of cancer associated with reported variant
Functional Impact         Denotes whether a reported variant is associated with a functional loss or gain of acetylation, phosphorylation, glycosylation, or other functional annotation from UniProtKB, or prediction from PolyPhen or NetNGlyc2.0
Genome Position           Genomic position of the variant
Nuc. Position             Position of variant in nucleic acid sequence
Ref. Nuc.                 Reference or wild-type nucleotide base
Var. Nuc.                 Nucleotide base resulting from variation
AA Position               Position of variation in protein sequence
Ref. AA                   Reference or wild-type amino acid residue
Var. AA                   Amino acid residue resulting from variation
Polyphen2                 If applicable, lists the predicted effect of the variant reported by PolyPhen-2 (benign, possibly damaging, or probably damaging)
PMID                      If available, PMID(s) of manually curated or semi-automatically mined (using DiMeX) publication(s) associated with the reported variation
Source                    Data source of reported variation (can be CIViC, ClinVar, COSMIC, ICGC, or TCGA)
Status                    Status of study from which variation was obtained (LG for large-scale, SM for small-scale)
Anatomical Entity ID      Uberon anatomical entity ID corresponding to the diseased tissue

***********************************************
Normal expression

This table reports the status of RNA-seq derived expression in normal samples.
Columns for this table are as follows:

COLUMN HEADER             DESCRIPTION
Gene Symbol               Official gene symbol approved by the HGNC, which is a short abbreviated form of the gene name, hyperlinked to the OncoMX detail view for this entry
UniProtKB/SwissProt AC    Accession assigned to the protein isoform chosen to be the canonical sequence in UniProtKB database
Ensembl Gene ID           Hyperlinked Ensembl gene ID linking to relevant entry in Bgee
Anatomical Entity Name    Uberon anatomical entity name corresponding to sample tissue
Developmental Stage Name  Uberon developmental stage name corresponding to sample tissue
Expression Call           Indicates presence or absence of expression
Expression Rank           The lower the rank, the higher the expression level
Call Quality              Quality associated with call (can be high quality or poor quality)

***********************************************
Expression literature mining

This table displays hits for literature mined mentions of expression in cancer using a customized application of DEXTER.
Columns for this table are as follows:

COLUMN HEADER             DESCRIPTION
UniProtKB/SwissProt AC    Accession assigned to the protein isoform chosen to be the canonical sequence in UniProtKB database
Entrez ID                 Unique, stable, and tracked integer identifier
Gene Mention              Specific form/spelling of gene mentioned in the retrieved publication
PMID                      PMID(s) of mined (using DEXTER) publication(s) containing expression information
DOID                      Mapped Cancer Disease Ontology slim ID
DOID Name                 Mapped Cancer Disease Ontology slim term
Disease Mention           Specific form/spelling of disease mentioned in the retrieved publication
Disease Extracted From    Section of the publication from which the disease mention was extracted
Expression Level          Reports whether the expression change was reported to go up or down in disease
Sentence Type             Type of sentence, strength of assertion is strongest in TypeA
Sample 1                  Denotes the first (or only) group compared in the extracted sentence
Sample 2                  If the extracted sentence contains a comparison between two groups, denotes the second group compared in the extracted sentence
Is Same Patient           Reports whether a comparison is performed between two groups of samples from a single patient
Sentence Text             Text of sentence extracted

***********************************************
Mutation literature mining

This table displays hits for literature mined mentions of mutation in cancer using a customized application of DiMeX.
Columns for this table are as follows:

COLUMN HEADER             DESCRIPTION
PMID                      PMID(s) of mined (using DiMeX) publication(s) containing variant information
UniProtKB/SwissProt AC    Accession assigned to the protein isoform chosen to be the canonical sequence in UniProtKB database
Gene Symbol               Official gene symbol approved by the HGNC, which is a short abbreviated form of the gene name
Entrez ID                 Unique, stable, and tracked integer identifier
Gene Mention              Specific form/spelling of gene mentioned in the retrieved publication
DOID                      Mapped Cancer Disease Ontology slim ID
DOID Name                 Mapped Cancer Disease Ontology slim term
Disease Mention           Specific form/spelling of disease mentioned in the retrieved publication
Mutation Mention          Specific form of substitution mutation mentioned in the retrieved publication
Mutation Type             Specifies whether the mutation is mentioned in terms of amino acid or nucleotide substitutions and coordinates
Abstract Extraction       Section of abstract from which the relevant sentence was extracted
Sentence Number           Number of sentence extracted
Sentence Text             Text of sentence extracted
Extraction Method         Denotes whether relationship in sentence was an association or other
Patient/Control Numbers   If available, reports the number of patients and/or controls in the study
Is Meta-Analysis          Denotes if the publication was a meta-analysis
Is Review                 Denotes if the publication was a review

***********************************************

From each of the above tables, a user can click on the corresponding link in the Gene Symbol column to go to the gene-centric detail view. From this page, the user can see results from each of the various perspectives filtered for a specific gene.

***********************************************
***********************************************
Search

From the landing page, the user can also access the search bar. A search term can be entered including gene symbol, UniProtKB accession, or other string search term.
Upon submitting the search, the user will be directed to the gene-centric detail view for biomarker evidence. The top half of this page contains a viewer for displaying visual summaries for each of the integrated evidence types, including Cancer Bulk RNA-seq, Human Normal Bulk RNA-seq, Mouse Normal Bulk RNA-seq, Mutation, Expression Literature Mining, Mutation Literature Mining, and Biomarkers. The bottom part of the page contains another viewer with text and tabular information for each of the contributing datasets.

Biomarker

This tab summarizes biomarker details from EDRN. Fields are as follows:

FIELD LABEL               DESCRIPTION
EDRN Title                Name given to biomarker in EDRN
Organ                     Organ in which biomarker is applicable (current options include Colon, Lung, Breast, and Ovary)
Phase                     If available, reports the designation of the biomarker as one of five phases (Phase 1, Preclinical Exploratory; Phase 2, Clinical Assay and Validation; Phase 3, Retrospective Longitudinal; Phase 4, Prospective Screening; Phase 5, Cancer Control)
QA                        Denotes whether biomarker has been curated, accepted, or is under review
Aliases                   Other names/descriptions of the biomarker
Description               States the purpose and scope of the biomarker

***********************************************
BioMuta

Please see the Disease mutation table description above.

***********************************************
BioXpress

Please see the Differential expression table description above.

***********************************************
Bgee

Please see the Normal expression table description above.

***********************************************
***********************************************
Dashboard

An interactive dashboard can be accessed by scrolling down the landing page. Four sections of interactive content can be accessed here: Perspectives, Statistics at a Glance, Data Sources, and News.

***********************************************
Perspectives

These views are currently under development.
For now, clicking the links will redirect you to the relevant tables described in the Explore datasets section above. Biomarkers will redirect to the Biomarkers table, Evolutionary Context to the Normal expression table, Literature Mining to tables with information about literature mining in cancer, and Biomarkers within Pathways to the Pathways table. Please note that detailed views are being built for each of these perspectives.

***********************************************
Statistics at a Glance

Clicking on any of the topics listed here or interacting with the Circos plot will toggle the charts displayed below. Statistics can be viewed as a series of charts across multiple resources (Total Cancer Terms and Proteins) or for each primary contributing resource (Biomarkers from EDRN, BioMuta, and BioXpress). Additional summary views will be added in subsequent releases.

***********************************************
Data Sources

Clicking any of the five contributing sources will take you to the original table retrieved from that source. All tables can be interacted with and downloaded as described above in the Explore datasets section.

***********************************************
News

This content will be updated as relevant news becomes available.

***********************************************
Cross-references

1. Bgee is a database to retrieve and compare gene expression patterns in multiple animal species, produced from multiple data types (RNA-Seq, Affymetrix, in situ hybridization, and EST data). Bgee is based exclusively on curated "normal", healthy, expression data (e.g., no gene knock-out, no treatment, no disease), to provide a comparable reference of normal gene expression.
   Access - https://bgee.org

2. CDSA is the Cancer Digital Slide Archive project that provides imaging data to the cancer community, and houses all of the TCGA pathology imaging data and PDFs of the Path reports.
   Access - https://cancer.digitalslidearchive.org/

3.
CDGnet is a tool for prioritizing targeted therapies based on an individual's tumor profile. Tumor molecular profiling refers to the use of a panel of genes and proteins that are assessed for potential abnormalities, including genetic changes and over/under expression of genes or proteins, in order to decide on targeted treatment plans. CDGnet incorporates information from biological networks relevant to the cancer type and to the specific alterations, FDA-approved targeted cancer therapies and indications, additional gene-drug information, and data on whether given genes are oncogenes.
   Access - http://epiviz.cbcb.umd.edu/shiny/CDGnet

4. CIViC is an open access, open source, community-driven web resource for Clinical Interpretation of Variants in Cancer. Our goal is to enable precision medicine by providing an educational forum for dissemination of knowledge and active discussion of the clinical significance of cancer genome alterations. For more details and to cite CIViC please refer to the CIViC publication (https://www.nature.com/articles/ng.3774) in Nature Genetics.
   Access - https://civicdb.org

5. EDRN, the Early Detection Research Network, is involved in researching hundreds of biomarkers. The following is a partial list of biomarkers and associated results that are currently available for access and viewing. The bioinformatics team at EDRN is currently working with EDRN collaborative groups to capture, curate, review, and post the results as they become available. EDRN also provides secure access to additional biomarker information not available to the public that is currently under review by EDRN research groups.
   Access - https://edrn.nci.nih.gov/biomarkers

6. HemOnc is the largest freely available medical wiki of interventions, regimens, and general information relevant to the fields of hematology and oncology. It is designed for easy use and intended for healthcare professionals.
Any healthcare professional can sign up to contribute; the accuracy and completeness of content is overseen by the Editorial Board.
   Access - https://www.hemonc.org/wiki/Main_Page

7. iPTMnet - iPTMnet is a bioinformatics resource for integrated understanding of protein post-translational modifications (PTMs) in a systems biology context. It connects multiple disparate bioinformatics tools and systems (text mining, data mining, analysis and visualization tools, and databases and ontologies) into an integrated cross-cutting research resource to address the knowledge gaps in exploring and discovering PTM networks.
   Access - https://research.bioinformatics.udel.edu/iptmnet/

8. SingleCellTK - Interactive Analysis of Single Cell RNA-Seq data.
   Access - https://rdrr.io/bioc/singleCellTK/

9. UniProt - https://www.uniprot.org/

***********************************************
External downloads

All OncoMX data are stored and made available to the user through www.data.oncomx.org. Click on any tab in this section to see the downloadable data. The next section is a table viewer which changes dynamically based on options selected above. Below that are details about the project and team, followed by the Contact form for relaying any concerns/feedback to the OncoMX Development Team. The top menu also contains links to many of the sections described above, as well as links to available help documentation, and a quick access search bar.

All datasets available through OncoMX are licensed under a Creative Commons Attribution 4.0 International License.

README UPDATED: June 10, 2020
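
For users who download table contents as a .csv (see the Explore datasets section), the two operations the portal offers on tables, free-text filtering and filtering by significance, can be reproduced offline. The following is a minimal Python sketch under stated assumptions: the column headers (gene, tcga_cancer, log2fc, adj_pvalue) and the sample rows are made up for illustration and are not the headers of an actual OncoMX export, which should be checked against the downloaded file.

```python
import csv
import io

# Made-up stand-in for a downloaded Differential expression export.
# Header names are assumptions; check them against the real file.
SAMPLE_CSV = """gene,tcga_cancer,log2fc,adj_pvalue
GENE_A,BRCA,1.8,0.0004
GENE_A,LUAD,0.2,0.4100
GENE_B,BRCA,-2.3,0.0100
GENE_C,COAD,-0.5,0.0700
"""


def load_rows(handle):
    """Read CSV rows into a list of dictionaries keyed by header."""
    return list(csv.DictReader(handle))


def text_filter(rows, query):
    """Keep rows where any field contains the query (case-insensitive)."""
    q = query.lower()
    return [r for r in rows if any(q in str(v).lower() for v in r.values())]


def significant(rows, alpha=0.05):
    """Keep rows whose adjusted p-value is below alpha."""
    return [r for r in rows if float(r["adj_pvalue"]) < alpha]


rows = load_rows(io.StringIO(SAMPLE_CSV))
brca_hits = significant(text_filter(rows, "brca"))
for r in brca_hits:
    print(r["gene"], r["tcga_cancer"], r["log2fc"], r["adj_pvalue"])
```

To run the same filters on a real download, replace io.StringIO(SAMPLE_CSV) with a file handle opened with open(path, newline="") and adjust the column names to match the file.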