Lecture VI: Neutral Theory

Effective population size

Given that the size of most natural populations (in terms of simple body counts) is large, one may question the role of drift. However, as we will see now, the relevant ‘effective’ population size is often much smaller than the actual number of individuals, or census population size Nc.

We have derived our results for drift for the Wright-Fisher model under a number of restrictive, and in general unrealistic, conditions. Most importantly, we have assumed that population size is constant, that mating is random and that there are no separate sexes. In such an ‘ideal population’ genetic drift will proceed at a rate given directly by the census population size Nc. However, in natural populations the variance in reproductive success is generally much larger than assumed by Wright-Fisher sampling (binomial, or Poisson as N gets large). Examples of forces increasing variance in reproductive output are:

• sex ratio differences (sexual selection)
• fluctuations in population size
• overlapping generations
• population structure

All of these forces increase variance in reproductive success and thereby reduce the number of individuals effectively contributing to the next generation. It is thus intuitive that the ‘effective’ population size will be smaller than the census size. In humans, who at present roam the planet in their billions, the effective population size is estimated to be around 10,000!

But if we violate the assumptions of the Wright-Fisher model, is this abstract mathematical model still valid? As it turns out, the model can nevertheless be applied if we replace the census population size Nc by an effective population size Ne. Ne then reflects the size of an ideal population that experiences genetic drift at the same rate as the population in question.
Hence, if we are able to transform Nc to Ne in some meaningful way, we can still quantify the rate at which genetic diversity gets lost through genetic drift using the same mathematical model. It is important to note that we need some read-out to measure the effects of genetic drift and calibrate the effective population size accordingly. This can be the loss of heterozygosity in a population, the degree of inbreeding, genetic variance, the efficiency of selection, or the rate of coalescence. Accordingly, the effective population size is named after the quantity of interest, e.g. the inbreeding effective size, the coalescence effective size, etc. Below, we derive Ne using the decrease in heterozygosity. Let’s see how that can work using an example.

Fluctuations in population size

One obvious factor bearing on the effective population size is change in real population sizes through time. The Wright-Fisher model assumes constant size, so can we approximate fluctuation in Nc in terms of an idealized population of constant size? First, it is important to understand that the lowest population numbers determine, to a large extent, the overall effective population size: all future offspring will be descendants of these few survivors. The effect of variation in population size can be shown by examining the heterozygosity over time. Remember

$$H_t = \left(1 - \frac{1}{2N}\right)^t H_0$$

Lecture WS Evolutionary Genetics Part I - Jochen B. W. Wolf

If N varies from generation to generation, then

$$\frac{H_t}{H_0} = \left(1 - \frac{1}{2N_0}\right)\left(1 - \frac{1}{2N_1}\right) \cdots \left(1 - \frac{1}{2N_{t-1}}\right) = \prod_{i=0}^{t-1}\left(1 - \frac{1}{2N_i}\right)$$

The overall effective population size is the one that causes the same reduction in heterozygosity as the varying Ni values and thus

$$\prod_{i=0}^{t-1}\left(1 - \frac{1}{2N_i}\right) = \left(1 - \frac{1}{2N_e}\right)^t$$

Solving for Ne, and neglecting terms of order 1/N², we get the harmonic mean of the Ni:

$$N_e \approx \frac{t}{\sum_{i=0}^{t-1} \frac{1}{N_i}}$$

where t is the number of discrete generations of fluctuating size.
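The harmonic-mean result, Ne ≈ t / Σ(1/Ni), can be checked numerically. The following is a minimal Python sketch, not part of the original lecture: the function names and the population sizes are hypothetical, chosen only for illustration. It compares the harmonic-mean approximation with the value of Ne obtained by solving the heterozygosity product exactly.

```python
from math import prod


def harmonic_mean_ne(sizes):
    """Approximate Ne as the harmonic mean of per-generation sizes: t / sum(1/N_i)."""
    t = len(sizes)
    return t / sum(1.0 / n for n in sizes)


def exact_ne(sizes):
    """Ne that solves (1 - 1/(2Ne))^t = prod(1 - 1/(2N_i)) without approximation."""
    t = len(sizes)
    # Fraction of heterozygosity retained after t generations of fluctuating size
    retained = prod(1.0 - 1.0 / (2.0 * n) for n in sizes)
    # Per-generation retention of the equivalent constant-size population
    per_generation = retained ** (1.0 / t)
    return 1.0 / (2.0 * (1.0 - per_generation))


# Hypothetical census sizes over three generations (illustrative values only)
sizes = [1000, 10000, 100000]

mean_census = sum(sizes) / len(sizes)
print("arithmetic mean census size:", mean_census)
print("harmonic-mean Ne:", harmonic_mean_ne(sizes))
print("exact Ne from heterozygosity:", exact_ne(sizes))
```

For population sizes in the thousands the two values of Ne agree closely, and both lie far below the arithmetic mean, because the smallest generation dominates the sum of reciprocals.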
To illustrate the importance of a bottleneck, imagine an insect population of size N that increases 10-fold in each of two summer generations and returns to its original size in winter (due to winter mortality). Population sizes are hence: N, 10N, 100N. The mean census number Nc across all three generations is 37 N. However, the effective population size Ne that matters and appropriately describes the effects of genetic drift (such as reduction in heterozygosity, increase in variance in allele frequencies) is 3/(1+1/10+1/100) = 2.7 N, more than an order of magnitude less. In this case Ne/Nc = 0.073, i.e. Ne is only 7.3% of the census size.

Sex ratio differences

Ne is similarly reduced compared to Nc if we consider highly unequal contributions of males and females to the next generation. Imagine a zoo population of primates with 20 males and 20 females. Due to the dominance hierarchy only one of the males actually breeds. What is the relevant population size that informs us about the strength of drift in this system? 40? Or 21? It can be shown that for this situation

$$N_e \approx \frac{4 N_f N_m}{N_f + N_m}$$

where Nf is the number of breeding females (20 in our case) and Nm is the number of breeding males (only 1 in our example). We thus obtain

$$N_e \approx \frac{4 \cdot 20 \cdot 1}{20 + 1} = \frac{80}{21} \approx 4$$

The effective population size is thus an order of magnitude smaller than the census size due to the fact that all offspring come from the same father, and genetic variation will rapidly disappear. In the case of an equal sex ratio, Nf = Nm = Nc/2, however, we obtain Ne ≈ Nc; the Wright-Fisher model thus applies with the original census size Nc.

The Neutral Theory of Evolution: Genetic Drift + Mutation

In the introductory lecture we have touched upon an important debate of the 1930s to 1960s.
Proponents of the classical school and the balancing school differed strongly in their view on the extent of genetic diversity expected in natural populations and the responsible mechanism. Both focused on morphological or physiological characters with a clear role for selection. First data from allozyme electrophoresis (Lewontin and Hubby 1966) suggested that selection alone could not be responsible for maintaining the high observed levels of polymorphism. At around the same time, in 1968, Motoo Kimura studied long-term protein evolution. His observation of similar evolutionary rates across lineages prompted him to develop the Neutral Theory of Evolution stating that most changes at the molecular level resulted from a combination of mutation and genetic drift, without the action of selection. In his theory selection is appreciated only in the form of strong purifying selection efficiently removing highly deleterious mutations such that they do not contribute to segregating genetic variation. Positively selected mutations play only a minor role. They are assumed to rapidly reach fixation, and hence do not contribute to segregating variation. The Neutral Theory of Evolution and its extension in the Nearly Neutral Theory of Evolution introduced by Tomoko Ohta (relaxing the assumptions on strong purifying and positive selection) are widely accepted as models appropriately describing sequence evolution across large parts of the genome.

So how can we predict the level of genetic variation using Neutral Theory? We have seen that mutations introduce genetic variation, and that genetic drift erodes it in populations of finite size. The Neutral Theory combines both forces into one framework making predictions on the level of genetic diversity we expect at equilibrium.

Heterozygosity H assumes a predictable equilibrium value

We have already derived that genetic drift reduces the heterozygosity within a population each generation by ΔdriftH = -H/(2Ne).
Mutation will work against that reduction by increasing genetic variation. Therefore, at some point an equilibrium will be reached where the decrease in H due to drift is balanced by the increase due to mutation (mutation-drift equilibrium). To find the point of equilibrium, we first derive the change of H under mutation alone. For an infinite population (no drift at this point), the heterozygosity in the new generation before mutation equals the heterozygosity in the parent population, H’ = H (Hardy-Weinberg equilibrium). We now assume every new mutation results in a new allele not present in the population before (see infinite alleles model). This is realistic if we distinguish alleles of a gene on the level of allelic types (haplotypes, protein electrophoresis) without keeping track of how these types relate to each other. Then every pair of genes with unequal alleles before mutation will also have unequal alleles after mutation. Every pair of genes with equal alleles before mutation will become heterozygous if either of the genes mutates. Summing over these two cases and ignoring terms proportional to u² (both genes mutate), we obtain:

H’ = H + 2u(1-H), thus ΔmutH = 2u(1-H)

The total change of heterozygosity is ΔH = ΔdriftH + ΔmutH. The equilibrium is obtained for ΔH = 0:

$$
\begin{aligned}
2u(1-H) - \frac{H}{2N_e} &= 0 \\
2u - 2uH &= \frac{H}{2N_e} \\
4N_e u - 4N_e uH &= H \\
4N_e u &= H + 4N_e uH \\
4N_e u &= H(1 + 4N_e u)
\end{aligned}
$$

Writing θ for 4Neu we obtain

$$H = \frac{\theta}{1+\theta}$$

(The quantity 4Neu is central in population genetics and is generally denoted by θ.) As expected, the equilibrium heterozygosity increases with increasing mutation rate and increasing population size (i.e. reduced drift). Expressed in terms of homozygosity G (= 1-H) we obtain:

$$G = \frac{1}{1+\theta}$$

If we sampled an individual from a randomly mating population, we would expect the proportion of loci for which the individual is heterozygous to be θ/(1+θ).
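The mutation-drift balance can be illustrated by iterating the per-generation change ΔH = 2u(1-H) - H/(2Ne) until it settles, and comparing the result with θ/(1+θ). The following Python sketch is an illustration under assumed parameter values (Ne = 1000, u = 0.0001); the function names are hypothetical and not part of the original lecture.

```python
def equilibrium_heterozygosity(ne, u):
    """Closed-form equilibrium H = theta / (1 + theta) with theta = 4 * Ne * u."""
    theta = 4.0 * ne * u
    return theta / (1.0 + theta)


def iterate_heterozygosity(ne, u, generations, h0=0.0):
    """Apply the deterministic change in H generation by generation.

    Each generation adds the gain from mutation, 2u(1 - H), and subtracts
    the loss from drift, H / (2 Ne), as in the derivation in the text.
    """
    h = h0
    for _ in range(generations):
        h += 2.0 * u * (1.0 - h) - h / (2.0 * ne)
    return h


# Assumed illustrative values, not taken from the lecture
ne = 1000   # effective population size
u = 1e-4    # mutation rate per gene copy per generation

print("theta:", 4 * ne * u)
print("closed-form equilibrium H:", equilibrium_heterozygosity(ne, u))
print("H after 50000 generations:", iterate_heterozygosity(ne, u, 50000))
```

Starting from a population without any variation (H = 0), the iterated value approaches the closed-form equilibrium. The approach is slow when both u and 1/(2Ne) are small, since the distance to equilibrium shrinks by a factor of 1 - 2u - 1/(2Ne) per generation.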
In terms of DNA sequence under the infinite sites model, we would interpret H as the probability that two haplotypes are non-identical. The expected number of mutations (changes in the DNA sequence) occurring in the history of a sample of two sequences is given by θ = 4Neu if we assume that the infinite sites model holds (each mutation creates a new variable site). We can think of θ as the population mutation pressure determining 1) the number of differences we expect between two randomly sampled DNA sequences and 2) the probability that the sequences are not identical, i.e. that an individual carrying them is heterozygous (Fig. 1). θ depends on both the mutation rate and the population size. This has interesting implications, e.g. for sex chromosomes such as the X, which have only ¾ of the Ne of autosomes but also differ in mutation rate (see male-biased mutation).

[Plot of the equilibrium heterozygosity Ĥ (y-axis) against θ (x-axis, 0 to 10).]
Figure 1: The equilibrium level of heterozygosity increases as a function of the product of the neutral mutation rate and effective population size.

The Neutral Theory also makes a clear prediction about the degree of divergence expected between two populations or species that are not connected by migration, establishing the expectation for the rate of neutral substitution (i.e. the alleles no longer segregate, but are fixed).

The probability of fixation of a new mutation is 1/(2N).

We have previously seen that under Wright-Fisher sampling the probability of fixation for any allele is equal to its frequency in the population. A novel mutation always enters a population at frequency 1/(2N); its fixation probability is thus 1/(2N). We have also briefly discussed that at some point all gene copies of a population will have descended from a single common ancestor (coalescent). Assume now that a mutation happened in this early ancestral generation. Obviously, it will spread to fixation (and can thus be observed) only if it has occurred in the common ancestor.
Otherwise it will be lost. If there are 2N gene copies in the ancestral generation, the fixation probability is thus 1/(2N).

The rate of neutral substitution is equal to the mutation rate.

Next, we want to calculate the neutral rate of substitution k, defined as the number of all mutations that arise in a population per generation times the probability that any one of those mutations is fixed. If the mutation rate per site and generation is u, 2Nu mutations will arise every generation at the site. We thus have

$$k = 2Nu \cdot \frac{1}{2N} = u$$

The rate of substitution is just equal to the rate of new mutations, independent of the population size. This is one of the most famous and most useful results from population genetics. It implies that molecular evolution at a neutral site occurs at an approximately constant rate per unit time. It is therefore said to show a molecular clock. As a consequence, the number of substitutions between two species can be used to infer the time since these species split from their common ancestor. Importantly, the molecular clock is independent of fluctuations in population size. It assumes, however, that mutation rates are constant over time.

Literature: (Barton et al. 2007; Futuyma 2013; Nielsen and Slatkin 2013)

Barton NH, Briggs DEG, Eisen JA, Goldstein DB, Patel NH. 2007. Evolution. 1st ed. Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press.
Futuyma DJ. 2013. Evolution. 3rd ed. Sinauer Associates.
Lewontin RC, Hubby JL. 1966. A Molecular Approach to the Study of Genic Heterozygosity in Natural Populations. II. Amount of Variation and Degree of Heterozygosity in Natural Populations of Drosophila pseudoobscura. Genetics 54:595–609.
Nielsen R, Slatkin M. 2013. An Introduction to Population Genetics: Theory and Applications. Sunderland, Mass.: Macmillan Education.