Uncovering mutational and DNA repair processes
in the search for cis-regulatory mutations in
cancer genomes
Prince of Wales Clinical School
Dr Jason Wong
Senior Lecturer & ARC Future Fellow
Head, Bioinformatics and Integrative Genomics
Lowy Cancer Research Centre
Somatic mutations in cancer
Lawrence et al. Nature 2013
Identifying cancer driver mutations
• “Function” of the mutation
- Is the mutated gene important?
- Does the mutation alter the function of the gene?
BUT, the function of non-protein-coding genes/regions is difficult to
define.
• Recurrence of the mutation
- Is the mutation present in lots of samples?
- Is the gene mutated in lots of samples?
BUT, the more samples we sequence the more recurrently mutated
genes we find.
How do mutations form and accumulate?
DNA lesion formation
Replication and translesional DNA synthesis
DNA repair proteins
Lesion recognition and repair
Formation of stable mutation
Mutagenic processes
• Exogenous factors (e.g. UV light)
• Replication errors
• Endogenous factors (AID)
• Viruses/retrotransposons
• DNA repair failure
These ultimately lead to mutations
Non-coding genomic regions
• How to define functional non-coding regions?
• What happens if we get mutations in these non-coding regions?
Gregory Nat Rev Genetics 2005
How much functional DNA is there in the
human genome?
ENCODE suggests that as much as 80% of the genome is “functional”.
However, probably only ~8% is truly important, i.e. sequence that is:
• Protein coding
• Non-coding RNA
• Regulatory sequences
Rands et al PLoS Genetics 2014
What are gene regulatory regions?
TSS
Standard histone
Nucleosome-free region
Genic region
H2A.Z histone
Post-translational modification
How to find regulatory sequences?
Transcription factor (TF) ChIP-seq: FLI1, ERG, LMO2, SCL, GATA2, LYL1, RUNX1
Histone ChIP-seq: H3K27ac, H3K4me1, H3K4me3
DNase-seq
RNA-seq
Beck D… Wong JWH*, Pimanda JE*. Blood 2013
Do cis-regulatory mutations exist?
Wild-type
Mutant
Review: Poulos RC, Sloane MA, Hesson LB, Wong JWH (2015) Oncotarget 6(32):32509-25
How do we find these mutations?
1. Whole cancer genome sequencing data.
2. Cell type (ideally sample) specific cis-regulatory data.
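The two inputs above can be combined by intersecting mutation positions with regulatory intervals. The following is a minimal sketch, not the method used in the talk: the function name, the coordinate convention (1-based, inclusive) and the toy coordinates are all assumptions made for illustration, and the regions are assumed not to overlap within a chromosome.

```python
from bisect import bisect_right


def annotate_mutations(mutations, regions):
    """Return the mutations that fall inside any regulatory region.

    mutations: list of (chrom, pos), 1-based positions (assumed convention).
    regions:   list of (chrom, start, end), 1-based inclusive and assumed
               non-overlapping within each chromosome.
    """
    by_chrom = {}
    for chrom, start, end in regions:
        by_chrom.setdefault(chrom, []).append((start, end))
    for intervals in by_chrom.values():
        intervals.sort()

    hits = []
    for chrom, pos in mutations:
        intervals = by_chrom.get(chrom, [])
        # Index of the last interval whose start is <= pos.
        idx = bisect_right(intervals, (pos, float("inf"))) - 1
        if idx >= 0 and intervals[idx][0] <= pos <= intervals[idx][1]:
            hits.append((chrom, pos))
    return hits


# Toy data only; these are not real regulatory coordinates.
toy_regions = [("chr1", 1000, 1500), ("chr2", 100, 200)]
toy_mutations = [("chr1", 1200), ("chr1", 3000), ("chr2", 200), ("chrX", 5)]
regulatory_hits = annotate_mutations(toy_mutations, toy_regions)
```

In practice the regions would come from cell type specific data such as DHS or histone ChIP-seq peaks, and a dedicated interval library would be a more robust choice than this hand-rolled lookup.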
OncoCis - Annotation of cis-regulatory mutations in cancer
Unique features of OncoCis
1. Cell type specific epigenetic annotations.
2. Ability to assess TF motif creating mutations.
3. Integration of gene expression information.
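To illustrate what assessing a motif-creating mutation can involve, here is a minimal sketch that scores the reference and mutant sequence against a position weight matrix and reports whether the best score crosses a threshold. It is not OncoCis code: the matrix, the threshold, the uniform background and the function names are all hypothetical, and only the forward strand is scanned.

```python
import math

# Toy position weight matrix (per-position base probabilities) for a
# hypothetical 4-bp motif "GGAA"; not taken from any motif database.
TOY_PWM = [
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
]
BACKGROUND = 0.25  # assumed uniform base composition


def best_score(seq, pwm):
    """Best log-odds score over all windows of seq (forward strand only)."""
    width = len(pwm)
    scores = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        scores.append(
            sum(math.log2(pwm[j][base] / BACKGROUND) for j, base in enumerate(window))
        )
    return max(scores)


def motif_change(ref_seq, alt_seq, pwm, threshold):
    """Classify a mutation as creating, removing or not changing a motif hit."""
    ref = best_score(ref_seq, pwm)
    alt = best_score(alt_seq, pwm)
    if ref < threshold <= alt:
        return "creates"
    if alt < threshold <= ref:
        return "removes"
    return "none"


# Toy sequences: a single G>A change turns GGAG into GGAA.
effect = motif_change("TCCGGAGT", "TCCGGAAT", TOY_PWM, threshold=5.0)
```

A real analysis would scan both strands and use curated matrices with calibrated score thresholds.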
Cell types available in OncoCis
Cell/tissue type        Cell line name  Description
Lung                    A549            Alveolar basal epithelial adenocarcinoma
Prostate                LNCaP           Prostate epithelial adenocarcinoma
Liver                   HepG2           Hepatocellular epithelial carcinoma
Blood                   K562            Chronic myelogenous leukemia
Blood                   CD34            CD34+ mobilised hematopoietic stem/progenitor cells
Breast                  HMEC            Normal human mammary epithelial cells
Melanocytes             Melano          Normal foreskin melanocytes
Cervical                HeLa            Cervical epithelial adenocarcinoma
Colon                   HCT116          Colon epithelial carcinoma
Pancreas                PANC-1          Pancreatic epithelioid carcinoma
Astrocyte               NHA             Normal human astrocytes
Osteoblast              Osteo           Normal human osteoblasts
Mesenchymal stem cell   MSC             Human mesenchymal stem cell, differentiated from ES cells
Neural progenitor cell  NPC             Human neural progenitor cells, differentiated from ES cells
Embryonic stem cell     ESC             Human embryonic stem cells, undifferentiated
DNase I, H3K4me1, H3K4me3 and H3K27ac for all cell types from ENCODE or Epigenome Atlas
Annotation of TERT promoter mutations
Analysis of 17 whole breast cancer genomes
Annotated using Human Mammary Epithelial Cell line (HMEC) data
Integration with expression
18 mutations that:
1. Show a significant change in expression relative to other samples without the mutation
2. Lie within a DHS and at least 1 histone mark
3. Are conserved
4. Create or remove a motif
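The four criteria can be read as a simple conjunction of filters. The sketch below shows one way to express that, under stated assumptions: every field name is hypothetical, and the significance cut-off of 0.05 is an assumption because the slide does not state the threshold used.

```python
from dataclasses import dataclass


@dataclass
class AnnotatedMutation:
    # All field names are hypothetical and chosen for illustration only.
    name: str
    expression_p: float   # p-value for expression change vs samples without the mutation
    in_dhs: bool          # lies within a DNase I hypersensitive site
    histone_marks: int    # number of overlapping histone marks
    conserved: bool
    motif_effect: str     # "creates", "removes" or "none"


def is_candidate(m, alpha=0.05):
    """Apply the four criteria; alpha is an assumed significance cut-off."""
    return (
        m.expression_p < alpha
        and m.in_dhs
        and m.histone_marks >= 1
        and m.conserved
        and m.motif_effect in ("creates", "removes")
    )


# Toy records, not real data.
toy = [
    AnnotatedMutation("mutA", 0.01, True, 2, True, "creates"),
    AnnotatedMutation("mutB", 0.01, True, 0, True, "creates"),   # no histone mark
    AnnotatedMutation("mutC", 0.20, True, 1, True, "removes"),   # not significant
    AnnotatedMutation("mutD", 0.01, True, 1, True, "none"),      # no motif change
]
candidates = [m.name for m in toy if is_candidate(m)]
```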
Validation of CDK6 mutation
[Figure: per-sample plot with sample PD4107a highlighted; luciferase reporter assay comparing SV/luc, CDK6wt and CDK6mut constructs; p = 0.013]
Perera D… Wong JWH (2014) Genome Biol 15:485
powcs.med.unsw.edu.au/oncocis
How many mutations identified by OncoCis are
truly functional?
COLO-829 cell line
• Metastatic melanoma cell line from a 45-year-old male.
• Matched “normal” cell line from B cells available (COLO-829BL).
• One of the first cell lines to be whole genome sequenced by the Sanger Institute (Pleasance et al Nature 2010).
Assessing the function of promoter mutations in
COLO-829 malignant melanoma cell line
Substrate
Promoter
Luciferase reporter
Luciferase protein
Light?
Wild-type or
mutant
4 out of 23 promoters with mutations tested showed a significant change in promoter
activity
NDUFB9 promoter mutation is recurrent in
malignant melanoma
No evidence that NDUFB9 promoter mutations are
functional in vivo
n.s.
6 out of 16 “non-functional” promoter mutations were also
recurrent!
GDAP1 (14%), PES1 (8%), STK19 (8%), WDR3 (3%), GPATCH2L
(3%), BLCAP (3%)
Poulos RC, Thoms JAI, Shah A, Beck D, Pimanda JE, Wong JWH (2015) Mol Cancer Res 13:1218-1226
Promoter mutations are frequent but how important
are they in cancer?
Weinhold et al. (2014) Nat Genetics 46 1160-1165
Fredriksson et al. (2014) Nat Genetics 46 1258-1263
Melton et al. (2015) Nat Genetics 47 710-716
• There are quite a few recurrent mutations in gene promoters (18 in >5% of cancers).
• But TERT promoter mutations are exceptional: they are the only ones with strong links to function.
• Therefore, there is no strong evidence that promoter mutations are a major player in cancer.
Systematic analysis of cis-regulatory mutations
• Somatic point mutations from 1,161 whole cancer genome sequenced samples across 14 cancer types from the ICGC, TCGA and various other public datasets.
• Annotated mutations using ENCODE/Epigenome Atlas data from 14 cell lines/tissues.
[Figure: per-cancer-type plots covering Medulloblastoma, CLL, Astrocytoma, Liver, Lymphoma, Pancreatic, Breast, Renal, Prostate, Melanoma, Lung, Colon, Ovarian and Esophageal]
[Figure: mutation density ratio (DHS / 1 kb DHS flank) by cancer type: Melanoma, Astrocytoma, Lung, Ovarian, Esophageal, Prostate, Pancreatic, Medulloblastoma, Breast, Liver, Lymphoma, Renal, CLL, Colon]
[Schematic: promoter/enhancer with a DHS centre (150 bp) and a flank on either side; DNase I endonuclease]
(DHS = DNase I Hypersensitive Site)
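The mutation density ratio compares mutations per base pair in the DHS centre with mutations per base pair in the flanking sequence. A minimal sketch of that arithmetic follows; the function name and the example counts are made up, and pooling counts across many sites before dividing is an assumption about how such a ratio would typically be computed.

```python
def mutation_density_ratio(n_centre, centre_bp, n_flank, flank_bp):
    """Ratio of mutation density (mutations per bp) in the DHS centre to the flank.

    n_centre, n_flank:   mutation counts pooled over all sites.
    centre_bp, flank_bp: total base pairs covered, pooled over all sites.
    """
    if centre_bp <= 0 or flank_bp <= 0:
        raise ValueError("region sizes must be positive")
    if n_flank == 0:
        raise ValueError("flank has no mutations; ratio is undefined")
    centre_density = n_centre / centre_bp
    flank_density = n_flank / flank_bp
    return centre_density / flank_density


# Made-up counts: 1,000 sites, each with a 150 bp centre and 2 x 1 kb of flank.
n_sites = 1000
ratio = mutation_density_ratio(
    n_centre=30, centre_bp=150 * n_sites,
    n_flank=200, flank_bp=2000 * n_sites,
)
```

A ratio above 1 indicates more mutations per base pair in the DHS centre than in its flanks.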
Identifying underlying causes for increased
promoter mutation density
Melanoma
Variable              OR     95% CI      p-value
DNase I coverage      1.51   1.45,1.56   < 2E-16
Gene expression       1.08   1.04,1.13   0.00032
GC content            1.04   1.00,1.08   0.0891
Replication timing    1.03   0.99,1.08   0.112
Proportion rare SNP   0.97   0.95,1.00   0.078
Cancer gene           0.90   0.69,1.16   0.432
Conservation (GERP)   0.85   0.82,0.89   7.69E-14
Adjusted values as listed on the slide (six per column; row alignment not recoverable from the transcript):
adj. OR: 1.56; 0.87; 0.92; 0.98; 0.81; 0.88
adj. 95% CI: 1.50,1.63; -,0.23,3.26; 0.88,0.97; 0.95,1.01; 0.62,1.05; 0.84,0.82
adj. p-value: <2E-16; 0.8313; 0.0005; 0.2042; 0.1272; 1.22E-08

Ovarian Cancer
Variable              OR     95% CI      p-value
DNase I coverage      1.25   1.17,1.32   5.07E-13
Gene expression       1.38   1.14,1.67   0.00101
GC content            1.10   1.02,1.18   0.0149
Replication timing    0.99   0.92,1.06   0.682
Proportion rare SNP   1.00   0.95,1.03   0.7867
Cancer gene           1.34   0.89,1.95   0.133
Conservation (GERP)   0.95   0.88,1.02   0.13
Adjusted values as listed on the slide (six per column; row alignment not recoverable from the transcript):
adj. OR: 1.23; 0.35; 0.91; 0.99; 1.25; 0.97
adj. 95% CI: 1.15,1.31; -,0.0,3.6; 0.85,0.99; 0.95,1.02; 0.82,1.80; 0.90,1.05
adj. p-value: 3.11E-09; 0.3717; 0.0185; 0.7067; 0.2693; 0.4975

Lung Cancer
Variable              OR     95% CI      p-value
DNase I coverage      1.09   1.03,1.16   0.00321
Gene expression       1.03   0.97,1.10   0.376
GC content            1.18   1.11,1.27   1.18E-06
Replication timing    0.87   0.82,0.93   2.10E-05
Proportion rare SNP   1.00   0.97,1.04   0.888
Cancer gene           1.21   0.82,1.72   0.321
Conservation (GERP)   0.91   0.85,0.98   0.00867
Adjusted values as listed on the slide (six per column; row alignment not recoverable from the transcript):
adj. OR: 1.15; 1.36; 0.81; 0.98; 1.17; 0.92
adj. 95% CI: 1.07,1.22; -,0.16,11.63; 0.75,0.87; 0.93,1.03; 0.79,1.67; 0.85,0.98
adj. p-value: 3.01E-05; 0.7769; 3.76E-10; 0.5618; 0.4089; 0.0133
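For readers unfamiliar with odds ratios and their confidence intervals, the sketch below computes an odds ratio with a Wald 95% interval from a 2x2 table. This is a simplified stand-in: the slide does not say how its ORs were estimated, the adjusted columns suggest a multivariable model, and the counts used here are invented.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts, e.g. mutated vs unmutated promoters split by high vs low
# DNase I coverage.
or_value, ci_low, ci_high = odds_ratio_ci(a=30, b=70, c=20, d=80)
```

An interval that excludes 1 corresponds to a significant association at roughly the 5% level.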
Mutation density is dictated by chromatin accessibility
Regions with increased mutation rates coincide
with transcription factor binding sites.
Digital genomic footprinting
Neph et al. Nature 2012
Differential NER is responsible for increased promoter
mutation density
Use mutations from genomes of people without NER
(i.e. xeroderma pigmentosum – XPC-/-)
Whole SCC genomes of XPC wild-type and XPC-/- patients from
Zheng et al. (2014) Cell Reports 9:1228
[Figure axes: Mutations/mb; Repair (normalised read count)]
Mutational signatures
Use mutation patterns to determine which mutagen has caused the cancer.
- Currently there are 30 signatures identified.
- Signature 1 is associated with aging, signature 7 with UV exposure, etc.
- The mutagens underlying many signatures are still unknown.
Alexandrov et al. (2013) Nature 500:415-421
http://cancer.sanger.ac.uk/cosmic/signatures
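Signatures of this kind are built on counts of single-base substitutions in their trinucleotide context, collapsed to the pyrimidine strand, which gives 6 substitution types x 16 contexts = 96 classes. The sketch below shows that classification step only; the function names and the bracketed label format are illustrative choices.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mutation_class(trinuc, alt):
    """Label a substitution by its pyrimidine-centred trinucleotide context.

    trinuc: reference trinucleotide with the mutated base in the middle.
    alt:    the mutant base.
    Returns a label such as "T[C>T]C".
    """
    trinuc, alt = trinuc.upper(), alt.upper()
    ref = trinuc[1]
    if ref == alt:
        raise ValueError("alt must differ from the reference base")
    if ref in "AG":
        # Purine reference: describe the change on the opposite strand.
        trinuc = revcomp(trinuc)
        alt = alt.translate(COMPLEMENT)
        ref = trinuc[1]
    return f"{trinuc[0]}[{ref}>{alt}]{trinuc[2]}"


# A C>T change in a TCC context, and the same event seen from the other strand.
forward = mutation_class("TCC", "T")
reverse = mutation_class("GGA", "A")
```

Counting these labels over all mutations in a sample gives the 96-element profile that signature decomposition methods take as input.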
[Figure: Melanoma mutation signatures and Lung cancer mutation signatures; y-axis: Mutations (%)]
Transcription initiation is necessary for impaired NER
What about enhancers?
Andersson et al Nature 2014
[Figure: comparison of Promoters (Top 25% DHS), Ubiquitous enhancers (n=200), Enhancers (Matched DHS) and Permissive enhancers (Matched DHS) in Melanoma, Lung and Ovarian cancers; significance annotations: ***, * and p=0.37]
GG-NER
machinery
Lesion recognition
by XPC is occluded
Lesion recognition
and repair
Transcription preinitiation complex
UV
light
DNA damage
Enhancer
TSS
Promoter
• Active transcription initiation inhibits NER.
• Promoter mutation hotspots are present in cancers where NER is
necessary to repair DNA lesions.
• Is the mechanism a potential source of cancer causing mutations? Or
are these mutations well tolerated?
Perera D*, Poulos RC*, Shah A, Beck D, Pimanda JE, Wong JWH (2016) Nature 532: 259-263
CTCF binding sites are highly mutated in cancer
Cohesin complex
Ong & Corces Nature Rev Genetics 2014
CTCF
CTCF binding sites are also highly
mutated in skin cancers
Skin cancers form CTCF motif mutations at
specific and unique positions
Other cancers:
~45% Oesophageal adenocarcinoma
~20% Hepatocellular carcinoma
~15% Gastric adenocarcinoma
~3% Colorectal adenocarcinoma
Katainen et al Nature Genetics 2014
CTCF mutations only accumulate at cohesin
loops
Allele specific CTCF binding in COLO-829
COLO829
mutations
WGS
CTCF ChIP-seq
(rep1)
CTCF ChIP-seq
(rep2)
Reads
WT: 8
Mutant: 10
ChIP-seq
IgG ChIP-seq
DNase-seq
H3K27ac
(Melanocytes)
Reads
WT: 77
Mutant: 6
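Allele-specific read counts like these can be checked for imbalance with an exact binomial test. The sketch below is one way to do that using only the standard library. It assumes a null expectation of equal reads from each allele (p = 0.5), which the slide does not state; in practice the allele fraction observed in the WGS reads may be a better null.

```python
from math import comb


def binomial_two_sided(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than the
    observed count k out of n trials, under success probability p.
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance guards against floating-point ties.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Read counts from the slide: mutant reads out of all reads at the site.
p_skewed = binomial_two_sided(k=6, n=77 + 6)     # WT: 77, Mutant: 6
p_balanced = binomial_two_sided(k=10, n=8 + 10)  # WT: 8, Mutant: 10
```

Under this assumed null, the 77 vs 6 split is strongly imbalanced while the 8 vs 10 split is consistent with equal representation of both alleles.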
Allele specific CTCF binding in COLO-829
[Figure: y-axis from 0.0 to 1.0; groups: Peak (non-motif), n=92 and Motif, n=12; significance: ***]
Is the loss of CTCF binding important in melanoma?
CTCF loop
anchors
Mutations
CTCF ChIP-seq
DNase-seq
Neighbourhood genes
Genes in mutated neighbourhoods (n=47 skin cancers):
ASB8, COL2A1, PFKM, SENP1, TMEM106C, VDR, IRF8, RASD2, CACUL1, CNDP2, FAM69C, GLI2, GPR37, GRIN2B, IFNK, IGSF9, IKBKB, MOB3B, NANOS1, PLAT, PPP1R1C, PRLHR, SLAMF9, SMOC1, SSFA2
[Figure axis: # mutated samples, 0 to 6]
Poulos RC, Thoms JAI, Guan YF, Unnikrishnan A, Pimanda JE & Wong JWH Cell Reports 2016.
Hnisz et al. Science (2016) 351:1454-1458
Ji et al. Cell Stem Cell (2015) 18:262-275
Summary – Part 1
• Impaired nucleotide excision repair (NER) results in mutation hotspots at active promoters and CTCF/cohesin binding sites.
• This is most obvious in melanoma due to the dependence on NER to fix UV-induced lesions.
• Is this a mechanism that generally contributes to cancer development or is it largely a passenger event?
• Are CTCF binding site mutations in other cancers (with signature 17) also caused by a similar mechanism?
Why is there a lack of promoter mutations in
colorectal cancer?
[Figure: Colon adenocarcinoma; y-axis: Mutations/mb (0 to 50); x-axis: Distance from DHS (bps), -10000 to 10000, centred on promoter DHS]
DNA methylation-driven mutagenicity
DNA methylation underlies decreased promoter
mutations in colorectal cancers
Replication timing has a different impact on
mutation rates in different types of CRC
Repair of DNA methylation is dependent on MMR
and TDG
[Schematic: a methylated CpG (me-CG/GC-me) undergoes deamination to give a T:G mismatch (TG/GC). The mismatch is repaired back to CG/GC by base excision repair (thymine DNA glycosylase, TDG) or by mismatch repair (replication dependent). DNMT then restores methylation.]
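Mutations arising from this pathway appear as C>T changes at CpG sites, or G>A when read from the opposite strand. The sketch below flags such events from the reference trinucleotide and the mutant base; the function name is hypothetical, and sequence context alone cannot show that a given cytosine was actually methylated.

```python
def is_cpg_transition(trinuc, alt):
    """True if the middle-base change is C>T within a CpG, on either strand.

    trinuc: reference trinucleotide with the mutated base in the middle.
    alt:    the mutant base.
    """
    five_prime, ref, three_prime = trinuc.upper()
    alt = alt.upper()
    if ref == "C" and alt == "T":
        # Forward-strand cytosine followed by guanine.
        return three_prime == "G"
    if ref == "G" and alt == "A":
        # Reverse-strand cytosine: the guanine is preceded by a cytosine.
        return five_prime == "C"
    return False


# Toy calls: (reference trinucleotide, mutant base).
toy_calls = [("ACG", "T"), ("CGT", "A"), ("ACA", "T"), ("ACG", "A")]
cpg_flags = [is_cpg_transition(context, alt) for context, alt in toy_calls]
```

Combining this flag with methylation data for the matching tissue would be needed to tie individual mutations to methylated cytosines.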
Modelling mutation probability based on
epigenetic factors
How can understanding mutational processes help?
Summary – part 2
• DNA methylation drives the rate of CpG mutation accumulation in colorectal cancers.
• Data show that mismatch repair normally repairs most mCpG-driven mismatches.
• Investigate how mutations in other cancers are driven by this mCpG phenomenon.
• Develop cancer type specific models to improve cancer driver mutation prediction.
Acknowledgements
All the research groups that have made their data publicly available: TCGA, ICGC,
Sanger Institute, ENCODE, FANTOM5, etc.
Bioinformatics and Integrative Genomics
Rebecca Poulos
Dilmi Perera
Anushi Shah
Diego Chacon
Regina Ryan
Felix Ma
Stem Cell Research Group
A/Prof. John Pimanda
Dr Julie Thoms
Dr Ashwin Unnikrishnan
Yi Fang Guan
Centre for Health Technologies, UTS
Dr Dominik Beck
Intersect Pty Ltd.