Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Predicting the impact of mutations using pathwayguided integrative genomics AACR Annual Meeting 2012 Major Symposium: Designing Rational Combination Therapy for Cancer Josh Stuart, UC Santa Cruz Apr 3, 2012 Disclosure SAB for Five3 Genomics Overview of pathway-guided approach Integrate many data sources to gain accurate view of how genes are functioning in pathways Predict the functional consequences of mutations by quantifying the effect on the surrounding pathway Use pathway signatures to implicate mutations in novel genes to (re-)focus targeting Identify critical “Achilles Heels” in the pathways that distinguish a particular sub-type Flood of Data Analysis Challenges Genomics, Functional Genomics, Metabolomics, Epigenomics = Structural Variation This is What it Copy Number Does to You Alterations Exome Sequences Multiple, Possibly Conflicting Signals Expression DNA Methylation Analysis of disease samples like automotive repair (or detective work or other sleuthing) Patient Sample 1 Patient Sample 2 Sleuths use as much knowledge as possible. Patient Sample 3 Patient Sample N … Much Cell Machinery Known: Gene circuitry now available. Curated and/or Collected Reactome KEGG Biocarta NCI-PID Pathway Commons … Integration key to interpret gene function Expression not always an indicator of activity Downstream effects often provide clues Expression of 3 transcription factors: high TF high TF low TF Inference: TF is ON Inference: TF is OFF Inference: TF is ON (expression reflects activity) (high expression but inactive) (low-expression but active ) Integration key to interpret gene function Need multiple data modalities to get it right. BUT, targets are amplified Expression -> TF ON TF Copy Number -> TF OFF Lowers our belief in active TF because explained away by cis evidence. Probabilistic Graphical Models: A Language for Integrative Genomics Nir Friedman, Science (2004) - Review Generalize HMMs, Kalman Filters, Regression, Boolean Nets, etc. Language of probability ties together multiple aspects of gene function & regulation Enable data-driven discovery of biological mechanisms Foundation: J. Pearl, D. Heckerman, E. Horvitz, G. Cooper, R. Schacter, D. Koller, N. Friedman, M. Jordan, … Bioinformatics: D. Pe’er, A. Hartemink, E. Segal, E Schadt… Integration Approach: Detailed models of gene expression and interaction MDM2 TP53 Integration Approach: Detailed models of expression and interaction Two Parts: MDM2 TP53 1. Gene Level Model (central dogma) 2. Interaction Model (regulation) PARDIGM Gene Model to Integrate Data 1. Central Dogma-Like Gene Model of Activity 2. Interactions that connect to specific points in gene regulation map Vaske et al. 2010. Bioinformatics Charlie Vaske Steve Benz Integrated Pathway Analysis for Cancer Cohort Multimodal Data Pathway Model of Cancer Inferred Activities CNV mRNA meth … Integrated dataset for downstream analysis Inferred activities reflect neighborhood of influence around a gene. Can boost signal for survival analysis and mutation impact TCGA Ovarian Cancer Inferred Pathway Activities Pathway Concepts (867) Patient Samples (247) TCGA Network. 2011. Nature (lead by Paul Spellman) Ovarian: FOXM1 pathway altered in majority of serous ovarian tumors Patient Samples (247) Pathway Concepts (867) FOXM1 Transcription Network TCGA Network. 2011. Nature (lead by Paul Spellman) FOXM1 central to cross-talk between DNA repair and cell proliferation in Ovarian Cancer TCGA Network. 2011. Nature (lead by Paul Spellman) PATHMARK: Identify Pathway-based “markers” that underlie sub-types Identify sub-pathways that distinguish Insight from contrast patients sub-types (e.g. mutant vs. non-mutant, response to drug, etc) Predict mutation impact on pathway “neighborhood” Identify master control points for drug targeting. Predict outcomes with quantitative simulations. Sam Ng Ted Goldstein Pathway signatures of mutations to reveal therapeutic candidates Mutated genes are the focus of many targeted approaches. Some patients with “right” mutation don’t respond. Why? Many cancers have one of several “novel” mutations. Can these be targeted with current approaches? Pathway-motivated approaches: Identify gain-of-function from loss-of-function. Compare novel signatures Sam Ng Poster # 2985 PARADIGM-Shift: Pathway context of GOF and LOF events Use pathways to predict the impact of observed mutations in patient tumors High Inferred Activity Low Inferred Activity Predicted Loss-Of-Function Predicted Gain-Of-Function FG FG Sam Ng High Inferred Activity PARADIGM-Shift Predicting the Impact of Mutations On Genetic Pathways Inference using all neighbors FG mutated gene Inference using downstream neighbors FG Inference using upstream neighbors SHIFT FG Low Inferred Activity Sam Ng PARADIGM-Shift Calculation Overview FG Sam Ng PARADIGM-Shift Calculation Overview FG 1. Identify Local FG Neighborhood Sam Ng PARADIGM-Shift Calculation Overview 2a. Regulators Run FG 1. Identify Local Neighborhood FG FG 2b. Targets FG Run Sam Ng PARADIGM-Shift Calculation Overview 2a. Regulators Run FG 1. Identify Local Neighborhood FG 3. Calculate FG FG 2b. Targets Run FG - P-Shift Score FG Difference FG (LOF) Sam Ng P-Shift Predicts RB1 Loss-of-Function in GBM Shift Score PARADIGM downstream PARADIGM upstream Expression Mutation RB1 Sam Ng RB1 Network (GBM) Focus Gene Key P-Shift T-Run R-Run Expression Mutation Neighbor Gene Key Activity Expression RB1 Mutation Sam Ng RB1 Discrepancy Scores distinguish mutated vs non-mutated samples Signal Score (t-statistic) = -5.78 Sam Ng RB1 discrepancy distinction is significant Given the same network topology, how likely would we call a gain/loss of function Background model: permute gene labels in our dataset Compare observed signal score to signal scores (SS) obtained from background model Observed SS Background SS Sam Ng TP53 Network Sam Ng Gain-of-Function (LUSC) P-Shift Score PARADIGM downstream PARADIGM upstream Expression Mutation NFE2L2 Sam Ng NFE2L2 Network (LUSC) Focus Gene Key P-Shift T-Run R-Run Expression Mutation Neighbor Gene Key Activity Expression RB1 Mutation NFE2L2 Sam Ng Discrepancy scores are sensitive RB1 Signal Score (t-statistic) = -5.78 TP53 Signal Score (t-statistic) = -10.94 NFE2L2 Signal Score (t-statistic) = 4.985 Observed SS Background SS Sam Ng Expect passenger mutations to lack shifts Is the discrepancy specific? Negative control: calculate scores for “passenger” mutations Passengers: insignificant by MutSig (p > 0.10) well-represented in our pathways Discrepancy of these “neutral” mutations should be close to what’s expected by chance (from permuted) Sam Ng Discrepancies of Passenger Mutations are NOT distinctive Sam Ng Pathway Discrepancy LUSC PARADIGM-Shift gives orthogonal view of the importance of mutations in LUSC HIF3A (n=7) TBC1D4 (n=9) (AKT signaling) NFE2L2 (29) MAP2K6 (n=5) MET (n=7) (gefitinib resistance) GLI2 (n=10) (SHH signaling) CDKN2A (n=30) EIF4G1 (n=20) AR (n=8) Enables probing into infrequent events Can detect non-coding mutation impact (pseudo FPs) Can detect presence of pathway compensation for those seemingly functional mutations (pseudo FPs) Extend beyond mutations Limited to genes w/ pathway representation Sam Ng Defining Pathway Signatures for Mutations and Sub-Types Build a signature for every mutation and tumor/clinical event. Correlate every signature to each other. Reveals common molecular similarities between different divisions of patient subgroups Mutations in novel genes may “phenocopy” mutations in known genes Ted Goldstein PathMark: Differential Subnetworks from a “SuperPathway” Pathway Activities Pathway Activities Ted Goldstein Sam Ng PathMark: Differential Subnetworks from a “SuperPathway” SuperPathway Activities SuperPathway Activities Pathway Signature Ted Goldstein Sam Ng PIL: Pathway-informed Learning Traditional methods treat each gene as a separate feature Use features reflecting overall pathway activity Smaller number of features are now fed to predictors Predictor Artem Sokolov PIL: Pathway-informed Learning Traditional methods treat each gene as a separate feature Use features reflecting overall pathway activity Smaller number of features are now fed to predictors Predictor Artem Sokolov Basal vs. Luminal Recursive feature elimination: we train an SVM, drop the least important half of features and recurse The number of times each feature survived the elimination across 100 random splits of data Artem Sokolov Methotrexate Sensitivity Non sub-type specific drug Pathway involving the target of the drug. Artem Sokolov Triple Negative Breast Pathway Markers Identified from 50 Cell Lines 980 pathway concepts 1048 interactions One large highly-connected component (size and connectivity significant according to permutation analysis) Characterized by several “hubs’ IL23/JAK2/TYK2 P53 tetramer HIF1A/ARNT ER FOXA1 Myc/Max Higher activity in ERLower activity in ER- Sam Ng, Ted Goldstein Identify master controllers using SPIA (signaling pathway impact analysis) Google PageRank for Networks Determines affect of a given pathway on each node Calculates perturbation factor for each node in the network Takes into account regulatory logic of interactions. n Impact factor: IF ( gi ) = s ( gi ) + å bij × j=1 IF ( g j ) N up ( g j ) Google’s PageRank Yulia Newton Slight Trick: Run SPIA in reverse Reverse edges in Super Pathway High scoring genes now those at the “top” of the pathway PageRank finds highly referenced Reverse to find Highly referencing Yulia Newton Master Controller Analysis on Breast Cell Lines Basal Luminal Yulia Newton Master regulators predict response to drugs: PLK3 predicted as a target for basal breast • DNA damage network is upregulated in basal breast cancers • Basal breast cancers are sensitive to PLK inhibitors GSK-PLKi Luminal Claudin-low Basal Ng, Goldstein Heiser et al. 2011 PNAS Up Down HDAC inhibitors predicted for luminal breast • HDAC Network is downregulated in basal breast cancer cell lines • Basal/CL breast cancers are resistant to HDAC inhibitors HDAC inhibitor VORINOSTAT Heiser et al. 2011 PNAS Ng, Goldstein PARADIGM in TCGA patient BRCA tumors Christina Yau, Buck Inst Christina Yau, Buck Inst. Connect genomic alterations to downstream expression/activity ? • What circuitry connects mutations to transcriptional changes? Mutations general (epi-) genomic perturbation – Expression activity – • Mutation/perturbation and expression/activity treated as heat diffusing on a network HotNet, Vandin F, Upfal E, B.J. Raphael, 2008. – HotNet used in ovarian to implicate Notch pathway – • Find subnetworks that link genetic to mRNA and protein-level changes. Evan Paull HotLink Gene Activity (Expression, RPPA, PARADIGM) Genomic Perturbations (Mutations, Methylation, Focal Copy Number) ? Evan Paull HotLink Genomic Perturbations (Mutations, Methylation, Focal Copy Number) 1. Add heat 2. Diffuse heat Gene Activity (Expression, RPPA, PARADIGM) 3. Cut out linkers Evan Paull HotLink Genomic Perturbations (Mutations, Methylation, Focal Copy Number) Gene Activity (Expression, RPPA, PARADIGM) Linking Sub-Pathway Evan Paull TCGA Interlinking Network Basal-LumA HotLink Basal LumA MAP, PI3K, AKT Basal-LumA HotLink Map Basal LumA TP53, RB1 MYC, FOXM1 AKT/PI3K MAPK8, MAPK14 (p38alpha) identified as mediators. TP53, RB1 Basal LumA MYC Neighborhood Double feedback loop involving TP53, RB1, CDK4, FOXM1, and MYC Basal LumA AKT signaling Basal LumA Mutation Association to Pathways What pathway activities is a mutation’s presence associated? Can we classify mutations based on these associations? PARADIGM Signatures Mutations Ted Goldstein Mutation Association to Pathways What pathway activities is a mutation’s presence associated? Can we classify mutations based on these associations? PARADIGM Signatures Mutations APC and other Wnt (Note: CRC figure below; soon for BRCA) Ted Goldstein Mutation Association to Pathways What pathway activities is a mutation’s presence associated? Can we classify mutations based on these associations? PARADIGM Signatures Mutations Ted Goldstein Mutation Association to Pathways What pathway activities is a mutation’s presence associated? Can we classify mutations based on these associations? PARADIGM Signatures Mutations TGFB Pathway mutations (Note: CRC figure below; soon for BRCA) Ted Goldstein Mutation Association to Pathways What pathway activities is a mutation’s presence associated? Can we classify mutations based on these associations? PARADIGM Signatures Mutations PIK3CA, RTK pathway, KRAS (Note: CRC figure below; soon for BRCA) Ted Goldstein Mutation Association to Pathways What pathway activities is a mutation’s presence associated? Can we classify mutations based on these associations? PARADIGM Signatures Mutations Evidence for AHNAK2 acting PI3KCA-like? (Note: CRC figure below; soon for BRCA) Ted Goldstein Summary • Modeling information flow on known pathways gives view of gene activity. • Patient stratification into pathway-based subtypes • Sub-networks provide pathway-based signatures of subtypes and mutations. • Loss- and gain-of-function predicted from pathway neighbors for even rare mutations. • Identify interlinking genes associated with mutations to implicate additional targets even in LOF cases • E.g. Target MYC-related pathways in certain TP53deficient cells? • Current work: Use pathways to now simulate the affects of specific knock-downs. UCSC Integrative Genomics Group Marcos Woehrmann Sam Ng Dan Carlin Evan Paull Ted Golstein James Durbin Artem Sokolov Yulia Newton Chris Szeto Chris Wong David Haussler UCSC Cancer Genomics • Kyle Ellrott • Brian Craft • Chris Wilks • Amie Radenbaugh • Mia Grifford • Sofie Salama • Steve Benz Jing Zhu UCSC Genome Browser Staff • Mark Diekins • Melissa Cline • Jorge Garcia • Erich Weiler Acknowledgments Buck Institute for Aging Chris Benz, • Christina Yau • Sean Mooney • Janita Thusberg Collaborators • Joe Gray, LBL • Laura Heiser, LBL • Eric Collisson, UCSF • Nuria Lopez-Bigas, UPF • Abel Gonzalez, UPF Funding Agencies • • • • • • Broad Institute • Gaddy Getz • Mike Noble • Daniel DeCara NCI/NIH SU2C NHGRI AACR UCSF Comprehensive Cancer Center QB3 UCSC Cancer Browser genome-cancer.ucsc.edu Jing Zhu