Chapter 10: Comparative Genomics
Insights gained through comparison of genomes from different species
© 2005 Prentice Hall Inc. / A Pearson Education Company / Upper Saddle River, New Jersey 07458

Contents
- History
- Synteny
- Conservation and function
- Sequence similarity searches
- Gene finding
- Regulatory sequence identification
- Interaction mapping
- Genes and evolution

History
- The Human Genome Project decided to use smaller genomes as a warm-up for the human genome
- This resulted in the sequencing of:
  - Many bacteria
  - Model organism genomes: yeast, C. elegans, Arabidopsis, Drosophila
- Comparison of these genome sequences provided the basis for the field of “Comparative Genomics”

Early comparative genomics
- Comparative genomics prior to obtaining full genome sequences:
  - Genome size
    - Compared DNA content among species
  - Single-copy and repetitive DNA
    - Used hybridization kinetics
    - Found that the amount of repetitive DNA differed greatly among species

Synteny
- Synteny: genes that are in the same relative position on two different chromosomes
- Genetic and physical maps are compared between species
  - Or between chromosomes of the same species
- Closely related species generally have a similar order of genes on their chromosomes
- Synteny can be used to identify genes in one species based on map position in another

Synteny of grass genomes
- Synteny among crop genomes: rice, maize and wheat
- Rice is the smallest genome, in the center
- Wheat is the largest, in the outer circle
- Genes found in similar places on chromosomes are indicated
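The observation that related species keep genes in a similar order can be illustrated with a small search for collinear runs. This is only a minimal sketch under simplifying assumptions: each species is reduced to a single ordered list of genes, corresponding genes carry the same (hypothetical) name, each name occurs once per list, and inversions are ignored. The function name `syntenic_blocks` and the parameter `min_genes` are illustrative, not taken from the slides.

```python
def syntenic_blocks(order_a, order_b, min_genes=3):
    """Return runs of genes that are adjacent and in the same order in both lists."""
    # Position of every gene along the second chromosome.
    pos_b = {gene: i for i, gene in enumerate(order_b)}
    blocks, current = [], []

    def flush():
        # Keep the current run only if it is long enough to count as a block.
        if len(current) >= min_genes:
            blocks.append(list(current))
        current.clear()

    for gene in order_a:
        if gene not in pos_b:
            flush()  # gene has no counterpart, so the run is interrupted
        elif current and pos_b[gene] == pos_b[current[-1]] + 1:
            current.append(gene)  # still collinear with the previous gene
        else:
            flush()
            current.append(gene)  # start a new candidate run
    flush()
    return blocks


# Hypothetical gene orders: species B carries a rearrangement relative to A.
species_a = ["g1", "g2", "g3", "g4", "g5", "g6", "g7"]
species_b = ["g5", "g6", "g7", "x1", "g1", "g2", "g3", "g4"]
print(syntenic_blocks(species_a, species_b))
```

With the toy data above, the two conserved segments are reported separately because the rearrangement breaks collinearity between `g4` and `g5`.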

Synteny of sequenced genomes
- When sequences from the mouse and human genomes are compared:
  - Regions of remarkable synteny are found
  - Genes are in almost identical order for long stretches along the chromosome
[Figure: Human Chr 14 and Mouse Chr 14]

Mouse/human synteny
[Figure: mouse/human synteny]

Comparing sequenced genomes
- Comparison of genomic sequences from different species can help identify:
  - Gene structure
  - Gene function
  - Regulatory sequences
  - Interactions between gene products

Evolution and sequence conservation
- Genome comparisons are based on the observation: conservation = function
- If there are no constraints on a DNA sequence:
  - Random mutations will occur
  - Over tens of millions of years, these random mutations will make two related sequences different

Function and sequence conservation
- However, if there are constraints:
  - e.g. the DNA codes for a protein
  - Or a transcription factor binds the DNA
- Then there will be sequence similarity when related sequences are compared
- Basic rule when comparing two related sequences:
  - Sequence conservation = functional importance

Orthologs and Paralogs
- When comparing sequences from different genomes, one must distinguish between two types of closely related sequences:
  - Orthologs are genes in two species that descend from a single gene in their common ancestor
  - Paralogs are genes in the same species that were created through gene duplication events
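The slides do not say how orthologs are told apart from paralogs in practice. One widely used heuristic, swapped in here purely for illustration, is the reciprocal best hit: two genes are called candidate orthologs when each is the other's highest-scoring match in the opposite genome. The sketch below assumes similarity scores have already been computed; the gene names and scores are hypothetical, and the heuristic can be misled by gene loss or very recent duplications.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring hit in the other genome."""
    return {query: max(hits, key=hits.get) for query, hits in scores.items() if hits}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical similarity scores (higher means more similar).
human_vs_mouse = {
    "humA": {"musA": 950, "musA2": 610},
    "humB": {"musB": 800},
}
mouse_vs_human = {
    "musA": {"humA": 940},
    "musA2": {"humA": 600},  # best hit is humA, but humA prefers musA
    "musB": {"humB": 790},
}
print(reciprocal_best_hits(human_vs_mouse, mouse_vs_human))
```

In this toy data `musA2` is not paired with anything, which is the pattern one would expect if it arose by duplication within the mouse lineage, i.e. if it is a paralog of `musA`.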

Orthologues and Paralogues
[Figure: orthologues and paralogues, showing genes A, A’, A’’ and B, B’, B”]

Sequence similarity and gene function
- Sequence comparisons that implicate function are widely used:
  - To determine whether a newly sequenced cDNA or genomic region encodes a gene of known function
  - Search for a similar sequence in other species (or in the same species)

Homology searches
- Search databases of DNA sequences
- Use computer algorithms to align sequences
- Don’t require perfect matches between sequences
  - Allow for insertions, deletions and base changes
- Most commonly used algorithms:
  - BLAST
  - FASTA

Homology search example
- The sea squirt, Ciona intestinalis, makes a coat primarily of cellulose
- A BLAST search was performed on the Ciona genome using an Arabidopsis endoglucanase gene involved in cellulose synthesis
- Extensive homology was found with a Ciona gene flanked by genes found in Drosophila and human
- It is postulated that the Ciona endoglucanase gene may have arisen by lateral gene transfer

Discovery of endoglucanase gene in sea squirt genome
[Figure: Arabidopsis Korrigan compared with the C. intestinalis cDNA; the Ciona endoglucanase gene is flanked by a transporter gene and a splicing factor gene, with matches in C. elegans, Drosophila and human]

Homology search for the mouse genome
- Homology search of all genes in the mouse genome:
  - 27% in other metazoans
  - 29% in other eukaryotes
  - 6% in other chordates
  - 14% in other mammals
  - Less than 1% rodent specific
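To make concrete what it means for an alignment to tolerate insertions, deletions and base changes, here is a minimal sketch of Smith-Waterman local alignment. Note the substitution: BLAST and FASTA are fast heuristics, whereas this is the exact dynamic-programming method they approximate, so it is not how those programs work internally. The scoring values (`match=2`, `mismatch=-1`, `gap=-2`) are arbitrary assumptions chosen for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # H[i][j]: best score ending at a[i-1], b[j-1]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,                    # start a fresh local alignment
                          H[i - 1][j - 1] + sub,  # match or base change
                          H[i - 1][j] + gap,      # gap in b (deletion)
                          H[i][j - 1] + gap)      # gap in a (insertion)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


score, top, bottom = smith_waterman("TTTACGTTT", "GGACGGG")
print(score, top, bottom)
```

The shared `ACG` core is recovered even though the flanking bases differ, which is the behaviour a homology search relies on when the query and the database sequence are only partly similar.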

Problems of genome annotation
- Identifying genes and regulatory regions in sequenced genomes is challenging
- Open reading frames (ORFs) are usually a good indication of genes
- The problem: it is difficult to determine which ORFs belong to a gene
  - Many mammalian genes have small exons and large introns
- Regulatory sequences are even more difficult

Computational approaches to gene identification
- Computer programs analyze genomic sequence
  - GRAIL, GeneFinder
- Look for ORFs, splice sites, poly A addition sites, etc.
- Predict gene structure
- Frequently wrong
  - Usually miss exons at the beginning or end of a gene
  - Or predict an exon that doesn’t really exist

How genome comparisons help
- When comparing genomes of different species:
  - Genes normally have the same exon/intron structure
  - Look for conserved ORFs in both genomes
- Frequently permits accurate identification of genes
  - Fugu/human comparison found >1000 genes
  - Mouse/human comparison indicates only 30,000 genes in the genome

Sequence comparison example
- Comparison of the human and mouse spermidine synthase genes
- Revealed an additional intron in the human gene that is not found in the mouse homologue
[Figure: human and mouse spermidine synthase gene structures; scale 5,500 bp]

Identifying small RNAs
- Growing evidence that small RNAs can regulate gene expression
- Small RNAs are 20-25 bases
- Conservation between genomes suggests functionality
- Example: small RNAs conserved in Arabidopsis and rice
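The first step of the gene-finding programs described above, looking for ORFs, can be sketched in a few lines. This is a deliberately naive illustration and not the algorithm used by GRAIL or GeneFinder: it scans all six reading frames for an ATG followed in frame by a stop codon, and it knows nothing about introns or splice sites, which is precisely why ORF scanning alone struggles with mammalian genes. The function name `find_orfs` and the `min_codons` threshold are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, frame, start, end, orf) for every ATG...stop ORF.

    Coordinates are 0-based and refer to the strand being scanned;
    min_codons counts the codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs


print(find_orfs("CCATGAAACCCGGGTAACC"))
```

A comparative approach would run such a scan on two genomes and keep only the ORFs that are conserved in both, which is the filtering step the slides credit with improving accuracy.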

Regulatory sequence identification
- A large portion of the genome contains regulatory information
- Regulatory sequence includes:
  - Cis-regulatory elements: tell genes when and where to turn on
    - Basal transcription machinery binding sites
    - Enhancers
- Can be 5’ of the gene, 3’ of the gene or in an intron

Regulatory sequences
[Figure: regulatory sequences around a gene, 5’ to 3’, with the TATA box marked]

Finding regulatory sequences
- Regulatory sequences are difficult to identify using computer programs
- The problem: most enhancer sequences have yet to be identified
  - They are usually short: 6-10 base pairs
- Those that are known are usually degenerate
  - They can differ in one or more base pairs
  - And still bind the cognate transcription factor

Comparisons to identify regulatory elements
- Comparisons of genomes of different species can identify regulatory elements
- Change in intergenic regions and introns is usually more rapid than in coding regions
- Nevertheless, regulatory elements tend to be conserved
- Conserved regions are called a “phylogenetic footprint”

Phylogenetic footprint
- Identifying conserved regulatory regions usually requires comparing genomes of closely related species
- If the species are too distantly related, it is very difficult to find conservation
- Nevertheless, mouse/human sequence comparison has revealed many conserved cis-regulatory elements

Mouse/human comparison
[Figure: mouse/human comparison]
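The phylogenetic footprint idea, conserved islands inside otherwise fast-changing non-coding sequence, can be sketched as a sliding-window identity scan. The sketch assumes the two sequences are already aligned to equal length, with `-` marking gaps; the window size and identity threshold are arbitrary illustrative values, and the sequences are made up.

```python
def conserved_windows(aln_a, aln_b, window=10, min_identity=0.9):
    """Return merged (start, end) alignment intervals whose identity meets the threshold."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    # A column counts as conserved only if both bases agree and neither is a gap.
    matches = [x == y and x != "-" for x, y in zip(aln_a, aln_b)]
    regions = []
    for start in range(len(matches) - window + 1):
        identity = sum(matches[start:start + window]) / window
        if identity >= min_identity:
            end = start + window
            if regions and start <= regions[-1][1]:
                regions[-1] = (regions[-1][0], end)  # extend the overlapping region
            else:
                regions.append((start, end))
    return regions


# Hypothetical aligned non-coding sequences with one shared 10-base element.
species_1 = "AAAAAAAAAA" + "TATAAATAGC" + "AAAAAAAAAA"
species_2 = "CCCCCCCCCC" + "TATAAATAGC" + "GGGGGGGGGG"
print(conserved_windows(species_1, species_2, window=10, min_identity=1.0))
```

A region reported by such a scan is only a candidate: sequence conservation suggests functional importance but does not by itself demonstrate that the element is regulatory.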

Using multiple species for phylogenetic footprinting
- The location of regulatory sequences can also be found by comparing several related sequences
- Multiple alignments are performed
- Better able to home in on important regions
- Conservation alone is not enough; the importance of elements needs to be validated

Interaction mapping
- Protein-protein interactions include:
  - The transfer of information in a genetic pathway
  - Scaffolding to tether other proteins
  - Enzymatic reactions
  - Large molecular machines such as motors

Rosetta Stone
- Observation: in some species, interacting proteins are encoded by a single gene
- In other species, the same proteins are encoded by two genes
- A systematic search through sequenced genomes for these relationships should identify proteins that interact
- Called the “Rosetta Stone” approach

Rosetta Stone example
- Yeast has a single protein, topoisomerase II
- In E. coli the equivalent is two proteins: gyrase A and gyrase B
- Suggests that gyrase B and gyrase A interact
[Figure: yeast topoisomerase II aligned with E. coli gyrase B and gyrase A]

Rosetta Stone
[Figure: Rosetta Stone relationships in Escherichia coli, Haemophilus influenzae and Methanococcus jannaschii]

Higher level comparisons
- Comparisons between genomes are not just to better identify genes and regulatory sequences
- Evolution of adaptive traits occurs through:
  - Evolution of new genes
  - Changing when and where genes are expressed
- Thus comparisons of genes found in genomes can provide information about mechanisms of evolution
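The Rosetta Stone search described above can be sketched as a lookup over protein family assignments. The sketch assumes each protein has already been reduced to a set of family or domain labels, for example by a prior homology search; the labels `gyrA_like` and `gyrB_like` and the function name are hypothetical stand-ins, and a real analysis would have to control for promiscuous domains that fuse with many partners.

```python
from itertools import combinations


def rosetta_stone_pairs(fused_genome, split_genome):
    """Predict interacting pairs in split_genome from fused proteins in fused_genome.

    Both arguments map protein name -> set of family labels.
    Returns sorted (protein_1, protein_2, supporting_fused_protein) tuples.
    """
    predictions = set()
    for fused_name, families in fused_genome.items():
        # Every pair of families found together in one protein is a clue.
        for fam_a, fam_b in combinations(sorted(families), 2):
            only_a = [p for p, f in split_genome.items() if fam_a in f and fam_b not in f]
            only_b = [p for p, f in split_genome.items() if fam_b in f and fam_a not in f]
            for pa in only_a:
                for pb in only_b:
                    predictions.add((pa, pb, fused_name))
    return sorted(predictions)


# Toy version of the topoisomerase II / gyrase example, with made-up labels.
yeast = {"topoisomerase II": {"gyrA_like", "gyrB_like"}}
e_coli = {
    "gyrase A": {"gyrA_like"},
    "gyrase B": {"gyrB_like"},
    "unrelated": {"kinase"},
}
print(rosetta_stone_pairs(yeast, e_coli))
```

The output names the two separate E. coli proteins together with the fused yeast protein that links them, which is the evidence the approach uses to suggest an interaction.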

Genes and genomes
- Comparison of total gene numbers in sequenced genomes:
  - Smaller than originally expected
  - Ex: the human genome was thought to have 100,000 genes
  - Now thought to be closer to 30,000-35,000 genes
- Suggests that many new functions arise in gene expression
  - Use old genes in new ways

Selective expansion of genes
- Although comparisons show not as much difference in numbers of genes as expected
- Still see striking differences in the numbers of some gene families
- Examples:
  - The roundworm C. elegans has a large number of nuclear receptor genes
  - Drosophila has a large number of zinc-finger transcription factors
  - Plants have no G-protein coupled receptors

What is the difference between man and ape?
- Man and chimpanzee have a genome-wide similarity of greater than 95%
- What accounts for the differences between species?
- A recent study suggests they are due to specific gene expression differences
- Striking differences were found only in the brain

Human/ape gene expression comparisons
[Figure: gene expression comparisons among human, chimp and rhesus]

Trait-to-gene
- Methods are being developed to identify genes involved in adaptive traits
- Example: “Trait-to-gene”
- Underlying reasoning:
  - Organisms that have a particular trait either share related genes
  - Or have developed new genes to perform the same function

Relating traits to genes
[Figure: relating traits to genes; Species 1, 2, 3 and 5 show Trait A and carry the gene, Species 4 does not; the gene is assigned to COG 3]
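The reasoning behind trait-to-gene, that species sharing a trait should share the genes responsible for it, can be sketched as a comparison of presence/absence profiles. This is a minimal illustration and not the published program: the species, gene-family names and the strict default thresholds are assumptions, and real data would call for tolerant thresholds and a statistical test.

```python
def trait_associated_genes(gene_content, has_trait, min_with=1.0, max_without=0.0):
    """Return gene families common in species with the trait and rare in those without.

    gene_content maps species -> set of gene families present in that genome.
    has_trait maps species -> True/False.
    """
    with_trait = [s for s, t in has_trait.items() if t]
    without_trait = [s for s, t in has_trait.items() if not t]
    if not with_trait or not without_trait:
        raise ValueError("need at least one species with and one without the trait")

    all_families = set().union(*gene_content.values())
    hits = []
    for family in sorted(all_families):
        frac_with = sum(family in gene_content[s] for s in with_trait) / len(with_trait)
        frac_without = sum(family in gene_content[s] for s in without_trait) / len(without_trait)
        if frac_with >= min_with and frac_without <= max_without:
            hits.append(family)
    return hits


# Hypothetical genomes: only Species 4 lacks the trait.
gene_content = {
    "Species 1": {"cog1", "cog3"},
    "Species 2": {"cog2", "cog3"},
    "Species 3": {"cog3"},
    "Species 4": {"cog1", "cog2"},
    "Species 5": {"cog1", "cog3"},
}
has_trait = {"Species 1": True, "Species 2": True, "Species 3": True,
             "Species 4": False, "Species 5": True}
print(trait_associated_genes(gene_content, has_trait))
```

Because the signal comes from the pattern across species, the approach needs many genomes; with only a handful, many unrelated families would match a trait profile by chance.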

Trait-to-gene
- Comparisons were made of bacterial genomes
  - Need many genomes
- Looked for genes involved in flagellar function
- Identified 43 of 45 known genes
- Found 5 additional genes that the program said should be involved in flagellar function
- Knocked out 3 and found that 2 resulted in bacteria with defective flagella

Trait-to-gene
[Figure: swim-plate assay comparing B. subtilis 168 with the yqeW and yuxH mutants. Overnight growth at 37°C. Swim medium (LB + 0.25% agar). Similar results at 20°C (4 days) and 30°C (2 days).]

The goal of comparative genomics
[Figure: the goal of comparative genomics]

Summary
- Synteny = similar relative positions of genes on chromosomes
- Conservation = function
- Homology searches
- Gene structure prediction
- Regulatory sequence identification
- Interaction mapping
- Genes and evolution