Public Maps
===========
.. _pancanatlas/samplemap:
PancanAtlas
-----------
As its concluding project, The Cancer Genome Atlas (TCGA) Research Network has
completed the most comprehensive cross-cancer analysis to date, covering
10,000 tumors from 33 types of cancer: the TCGA Pan-Cancer Atlas (PanCanAtlas).
This project aims to answer big, overarching questions about cancer by examining
the full set of tumors characterized in the robust TCGA dataset.
https://www.ncbi.nlm.nih.gov/m/pubmed/29625048/
.. _pancan12/samplemap:
.. _pancan12/genemap:
Pancan12
--------
The diverse tumor set called “Pan-Cancer-12” is composed of 12 different
malignancies. It comprises 3,527 cases assayed by at least four of the six
possible data types routinely generated by The Cancer Genome Atlas: whole-exome
DNA sequence (Illumina HiSeq and GAII), DNA copy number variation (Affymetrix
6.0 microarrays), DNA methylation (Illumina 450,000-feature microarrays),
genome-wide mRNA levels (Illumina mRNA-seq), microRNA levels (Illumina
microRNA-seq), and protein levels for 131 proteins and/or phosphorylated
proteins (Reverse Phase Protein Arrays; RPPA).
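
The inclusion rule above (a case must be assayed by at least four of the six
data types) can be illustrated with a minimal sketch. The sample IDs, the
data-type keys, and the ``eligible_samples`` helper are hypothetical and are
not part of the TCGA or TumorMap tooling.

.. code-block:: python

    # Hypothetical short keys for the six data types named in the text.
    DATA_TYPES = {"exome", "copy_number", "methylation", "mrna", "mirna", "rppa"}
    MIN_PLATFORMS = 4  # "at least four of the six possible data types"


    def eligible_samples(availability, min_platforms=MIN_PLATFORMS):
        """Return sorted sample IDs assayed on at least `min_platforms` data types."""
        return sorted(
            sample
            for sample, assayed in availability.items()
            if len(set(assayed) & DATA_TYPES) >= min_platforms
        )


    # Made-up availability table: sample ID -> data types that were assayed.
    availability = {
        "sample_A": {"exome", "copy_number", "methylation", "mrna", "mirna", "rppa"},
        "sample_B": {"exome", "copy_number", "mrna", "mirna"},
        "sample_C": {"exome", "mrna", "rppa"},
    }

    print(eligible_samples(availability))

Here ``sample_C`` is dropped because it has only three of the six data types.
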
The Sample Map layouts are composed of tumor samples collected in the study.
Attributes that can be explored on the Sample Map are described at
:ref:`pancan12/samplemapattributes`.
The Gene Map layouts are composed of genes mapping to a probe from the data
collection platforms. For example, MYC on the mRNA Gene Map corresponds to the
probe mapping to MYC on the microarray platform. Attributes that can be explored on
the Gene Map are described at
:ref:`pancan12/genemapattributes`.
Hoadley,K.A., Yau,C., Wolf,D.M., Cherniack,A.D., Tamborero,D., Ng,S.,
Leiserson,M.D.M., Niu,B., McLellan,M.D., Uzunangelov,V., et al. (2014)
`Multiplatform analysis of 12 cancer types reveals molecular classification
within and across tissues of origin
`_.
Cell, 158, 929–44.
.. _gliomas:
Gliomas
-------
We assembled a dataset comprising all TCGA newly diagnosed diffuse gliomas,
consisting of 1,122 patients, and comprehensively analyzed it using sequencing
and array-based molecular profiling approaches.
We extended our analysis using TumorMap to perform integrated co-clustering
analysis of the combined gene expression (n = 1,196) and DNA methylation (n =
867) profiles. Clusters in the map indicate groups of samples with high
similarity of integrated gene expression and DNA methylation profiles.
Ceccarelli,M., Barthel,F.P., Malta,T.M., Sabedot,T.S., Salama,S.R., Murray,B.A.,
Morozova,O., Newton,Y., Radenbaugh,A., Pagnotta,S.M., et al. (2016)
`Molecular Profiling Reveals Biologically Discrete Subsets and Pathways of
Progression in Diffuse Glioma
`_.
Cell, 164, 550–63.
.. _quakebrain:
QuakeBrain
----------
We used single-cell RNA sequencing on 466 cells to capture the cellular
complexity of the adult and fetal human brain at a whole-transcriptome level.
Healthy adult temporal lobe tissue was obtained during surgical procedures where
otherwise normal tissue was removed to gain access to deeper hippocampal
pathology in patients with medically refractory seizures. We were able to classify
individual cells into all of the major neuronal, glial, and vascular cell types
in the brain. We were able to divide neurons into individual communities and
show that these communities preserve the categorization of interneuron subtypes
that is typically observed with the use of classic interneuron markers.
Darmanis,S., Sloan,S.A., Zhang,Y., Enge,M., Caneda,C., Shuer,L.M., Hayden
Gephart,M.G., Barres,B.A. and Quake,S.R. (2015)
`A survey of human brain transcriptome diversity at the single cell level
`_.
Proc. Natl. Acad. Sci. U. S. A., 112, 7285–90.
.. _pchips:
pCHIPS
------
The presented pCHIPS data set is a subset of Pancan12 data supplemented by
clinical tissue from lethal metastatic castration-resistant prostate cancer
patients obtained at rapid autopsy.
Drake,J.M., Paull,E.O., Graham,N.A., Lee,J.K., Smith,B.A., Titz,B.,
Stoyanova,T., Faltermeier,C.M., Uzunangelov,V., Carlin,D.E., et al. (2016)
`Phosphoproteome Integration Reveals Patient-Specific Networks in Prostate
Cancer
`_.
Cell, 166, 1041–1054.
.. _pcawgjuncbase:
mgmarin_public/PCAWG_JuncBASE
-----------------------------
TBD
Attribute Descriptions
----------------------
.. _pancan12/samplemapattributes:
Pancan12/SampleMap Attributes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| Tissue
| BRCA Subtype
| COADREAD Subtype
| GBM Subtype
| OV subtype
| UCEC Subtype
| gender
| number_of_lymphnodes_positive
| colon_polyps_present
| microsatellite_instability
| metastasis_pathological_spread
| height
| weight
| age_at_initial_pathologic_diagnosis
| icd_10
| lymphovascular_invasion_present
| karnofsky_performance_score
| neoplasm_histologic_grade
| icd_o_3_site
| primary_tumor_t_stage
| lymphnode_pathologic_spread
| acute_myeloid_leukemia_calgb_cytogenetics_risk_category
| tumor_stage_and_substage
| neoplasm_disease_lymph_node_stage
| primary_tumor_pathologic_spread
| history_of_colon_polyps
| tumor_stage
| pancan subtype integrated
| pancan subtype methylation
| pancan subtype RPPA
| pancan subtype mRNA
| pancan subtype miRNA
| pancan subtype mutations
| Met vs Primary
| \*_MUTATION (313 mutation flags for high-confidence mutations, where * is a gene symbol in HUGO space)
| \*_AMPLIFICATION (999 gene-level or chromosomal amplification events)
| \*_DELETION (1987 gene-level or chromosomal deletion events)
| TF_IPL_* (774 transcription factors with their activities summarized in the PARADIGM IPL space per each sample; * is a gene symbol in HUGO space)
| * program (42 drug programs inferred from the gene expression data, where * is a molecular process or function name)
| Mutation Signature 1
| Mutation Signature 2
| Mutation Signature 3
| Mutation Signature 4
| Mutation Signature 5
| Mutation Signature 6
| Mutation Signature 7
| Mutation Signature 8
| Mutation Signature 10
| Mutation Signature 13
| Mutation Signature 14
| Mutation Signature 17
| Mutation Signature 20
| Mutation Signature 26
| Mutation Signature 27
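
Several of the attribute families listed above are identified by a shared
prefix or suffix (for example, names ending in ``_MUTATION`` or starting with
``TF_IPL_``). As a rough sketch, attribute names could be grouped into these
families with shell-style wildcards. The family labels, the
``attribute_family`` helper, and the example names below are hypothetical.

.. code-block:: python

    from collections import Counter
    from fnmatch import fnmatchcase

    # (family label, wildcard pattern) pairs; the labels are illustrative only.
    PATTERNS = [
        ("mutation", "*_MUTATION"),
        ("amplification", "*_AMPLIFICATION"),
        ("deletion", "*_DELETION"),
        ("tf_activity", "TF_IPL_*"),
        ("program", "* program"),
        ("mutation_signature", "Mutation Signature *"),
    ]


    def attribute_family(name):
        """Return the first family whose pattern matches `name`, else 'other'."""
        for family, pattern in PATTERNS:
            if fnmatchcase(name, pattern):  # case-sensitive wildcard match
                return family
        return "other"


    # Example names that follow the listed patterns (made up for illustration).
    names = [
        "TP53_MUTATION",
        "MYC_AMPLIFICATION",
        "CDKN2A_DELETION",
        "TF_IPL_MYC",
        "Mutation Signature 13",
        "Tissue",
        "gender",
    ]

    counts = Counter(attribute_family(name) for name in names)
    print(counts)

Names that match none of the patterns, such as ``Tissue`` and ``gender``, fall
into the ``other`` family.
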
.. _pancan12/genemapattributes:
Pancan12/GeneMap Attributes
^^^^^^^^^^^^^^^^^^^^^^^^^^^
=========================== ============== =============
Database name               Number of sets Variable type
=========================== ============== =============
MSigDB Positional Gene Sets 326            Binary
MSigDB Hallmark gene sets   50             Binary
MSigDB Canonical gene sets  1330           Binary
GO:Biological Process       825            Binary
GO:Cellular Component       233            Binary
GO:Molecular Function       396            Binary
=========================== ============== =============