Razib Khan One-stop-shopping for all of my content

October 5, 2017

The Loneliest Neanderthal

Filed under: Evolutionary Genetics,Neanderthals,Paleoanthropology — Razib Khan @ 7:30 pm

Neanderthals are in the news again! This is good for me personally, as my company is selling Neanderthal trait analysis. Ooga-booga!

In any case, the two papers which have triggered the current wave of Neandermania are The Contribution of Neanderthals to Phenotypic Variation in Modern Humans, and A high-coverage Neandertal genome from Vindija Cave in Croatia. They are somewhat different. The first publication looks at introgressed archaic variants within modern populations. The second gets some results out of a much higher quality European Neanderthal which lived ~50,000 years ago.

The cool thing about the first paper is that it combined UK Biobank data, 100,000+ individuals, with hundreds of thousands of markers, and Neanderthal genomic data. Note that: a paper comparing ancient genomes with over 100,000 individuals and hundreds of thousands of markers. Now that’s 2017!

To find archaic alleles they:

  1. Looked for variants fixed in Yoruba (no Neanderthal), and homozygote or heterozygote in the alternative state in the Altai Neanderthal, which also segregated (varied) in the UK Biobank population. Basically, an allele not found in Africans but found in Neanderthals, and also found in appreciable fractions in the UK Biobank data set.
  2. They then took the SNPs above, and only retained ones confidently embedded in tracts of Neanderthal ancestry: the haplotype length was consistent with admixture ~50,000 years ago, and the haplotype exhibited lower distance to Neanderthal than to African genomes.
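The two filters above can be sketched in a few lines of Python. This is only a toy illustration, not the authors' pipeline: the `Site` record, its field names, and the 1% frequency cutoff standing in for "appreciable fractions" are all assumptions of mine.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Toy per-SNP summary; every field name here is hypothetical."""
    snp_id: str
    yoruba_alt_freq: float       # frequency of the alternative allele in Yoruba
    altai_alt_copies: int        # 0, 1 or 2 copies in the Altai Neanderthal
    biobank_alt_freq: float      # frequency in the UK Biobank cohort
    in_neanderthal_tract: bool   # step 2: sits on an inferred archaic haplotype


def candidate_archaic_snps(sites, min_biobank_freq=0.01):
    """Return the ids of SNPs that pass both filters (cutoff is an assumption)."""
    kept = []
    for s in sites:
        absent_in_yoruba = s.yoruba_alt_freq == 0.0      # Yoruba fixed for the other allele
        present_in_altai = s.altai_alt_copies >= 1       # heterozygote or homozygote
        segregating = min_biobank_freq <= s.biobank_alt_freq < 1.0
        if absent_in_yoruba and present_in_altai and segregating and s.in_neanderthal_tract:
            kept.append(s.snp_id)
    return kept


sites = [
    Site("snp_a", 0.0, 2, 0.05, True),    # passes everything
    Site("snp_b", 0.2, 1, 0.10, True),    # also found in Yoruba
    Site("snp_c", 0.0, 0, 0.08, True),    # not seen in the Altai genome
    Site("snp_d", 0.0, 1, 0.03, False),   # not on an archaic haplotype
]
print(candidate_archaic_snps(sites))
```

Only `snp_a` survives in this toy set, since each of the other three fails exactly one condition.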

They did some stuff with tag-SNPs though. Overall they found a lot of the usual suspects. Pigmentation. Chronotype. But this passage jumped out at me:

In fact, for most associations, Neanderthal variants do not seem to contribute more than non-archaic variants. However, there are four phenotypes, all behavioral, to which Neanderthal alleles contribute more phenotypic variation than non-archaic alleles: chronotype, loneliness or isolation, frequency of unenthusiasm or disinterest in the last 2 weeks, and smoking status.

What they are saying is that for a lot of traits Neanderthals don’t really change the direction of the trait in humans, they just add more variants. This seems to be the case in pigmentation. Entirely unsurprising, Neanderthals were around for hundreds of thousands of years. Of course they had a lot of variation amongst themselves.

But for the behavioral traits above, modern humans who carried the archaic allele were, in the aggregate, shifted somewhat. That is, being Neanderthal derived made a difference.

There have long been speculations about the sociality (or lack thereof) of Neanderthals. It would not be surprising if small population sizes meant that Neanderthals were less gregarious than modern humans, and that their lack of gregariousness did not redound to their benefit when they encountered the last wave of moderns.

Which brings us to the second paper. The big deal here is that it gives us a very high quality ancient genome of a European Neanderthal that lived ~50,000 years ago (the Vindija sample). Before this we had a high quality ancient genome of an Asian Neanderthal that lived ~125,000 years ago (Altai sample).  ~75,000 years is a long time. It’s so long that almost all the ancestry of modern non-Africans would have converged to a common population that long ago. Additionally, all the available data indicate that most of the admixture into modern humans from Neanderthals occurred around 50,000 years ago. So this new sample is definitely welcome.

It is not surprising that the Vindija sample seems to be closer to the Neanderthal admixture population than the Altai sample. First, it is likely geographically closer: since all non-African populations have some Neanderthal ancestry, West Asia is probably the top candidate for where the admixture occurred, and southeastern Europe is not that far from West Asia in comparison to Mongolia. Second, it is basically contemporaneous with the Neanderthals who contributed ancestry to modern humans who left Africa. This means that the estimated Neanderthal admixture percentage in non-Africans goes up moderately.

To me this is the most important paragraph:

It has been suggested that Denisovans received gene flow from a human lineage that diverged prior to the common ancestor of modern humans, Neandertals and Denisovans (2). In addition, it has been suggested that the ancestors of the Altai Neandertal received gene flow from early modern humans that may not have affected the ancestors of European Neandertals (13). In agreement with these studies, we find that the Denisovan genome carries fewer derived alleles that are fixed in Africans, and thus tend to be older, than the Altai Neandertal genome while the Altai genome carries more derived alleles that are of lower frequency in Africa, and thus younger, than the Denisovan genome (20). However, the Vindija and Altai genomes do not differ significantly in the sharing of derived alleles with Africans indicating that they may not differ with respect to their putative interactions with early modern humans (Fig. 3A & B). Thus, in contrast to earlier analyses of chromosome 21 data for the European Neandertals (13), analyses of the full genomes suggest that the putative early modern human gene flow into Neandertals occurred prior to the divergence of the populations ancestral to the Vindija and Altai Neandertals ~130-145 thousand years ago (Fig. 2). Coalescent simulations show that a model with only gene flow from a deeply diverged hominin into Denisovan ancestors explains the data better than one with only gene flow from early modern humans into Neandertal ancestors, but that a model involving both gene flows explains the data even better. It is likely that gene flow occurred between many or even most hominin groups in the late Pleistocene and that more such events will be detected as more ancient genomes of high quality become available.

These results seem to support earlier work indicating that Denisovans were admixed with an ancient hominin group which diverged very early on (probably the descendants of East Asian erectus?). And that Neanderthals received gene flow from a lineage of modern (African?) humans 150,000 or more years ago. Since the latest work suggests that modern humans in some form have existed for somewhere between 200,000 and 350,000 years, this is entirely plausible.

But it brings us to the take-home message that the emergence of Pleistocene humanity was to some extent characterized by reticulate gene flow, rather than a bifurcating tree.

May 6, 2017

Synergistic epistasis as a solution for human existence

Filed under: epistasis,Evolution,Evolutionary Genetics,Genetics,Genomics — Razib Khan @ 12:16 am

Epistasis is one of those terms in biology which has multiple meanings, to the point that even biologists can get turned around (see this 2008 review, Epistasis — the essential role of gene interactions in the structure and evolution of genetic systems, for a little background). Most generically epistasis is the interaction of genes in terms of producing an outcome. But historically its meaning is derived from the fact that early geneticists noticed that crosses between individuals segregating for a Mendelian characteristic (e.g., smooth vs. wrinkled peas) produced results conditional on the genotype of a secondary locus.

Molecular biologists tend to focus on a classical, and often mechanistic view, whereby epistasis can be conceptualized as biophysical interactions across loci. But population geneticists utilize a statistical or evolutionary definition, where epistasis describes the extent of deviation from additivity and linearity, with the “phenotype” often being fitness. This goes back to early debates between R. A. Fisher and Sewall Wright. Fisher believed that in the long run epistasis was not particularly important. Wright eventually put epistasis at the heart of his enigmatic shifting balance theory, though according to Will Provine in Sewall Wright and Evolutionary Biology even he had a difficult time understanding the model he was proposing (e.g., Wright couldn’t remember what the different axes on his charts actually meant all the time).

These different definitions can cause problems for students. A few years ago I was a teaching assistant for a genetics course, and the professor, a molecular biologist, asked a question about epistasis. The only answer on the key was predicated on a classical/mechanistic understanding. But some of the students were obviously giving the definition from an evolutionary perspective (e.g., they were bringing up non-additivity and fitness)! Luckily I noticed this early on and the professor approved the alternative answer, so that graders would not mark down those using a non-molecular answer.

My interest in epistasis was fed to a great extent in the middle 2000s by my reading of Epistasis and the Evolutionary Process. Unfortunately not too many people read this book. I believe this is so because when I just went to look at the Amazon page it told me that “Customers who viewed this item also viewed” Robert Drews’ The End of the Bronze Age. As it happened I read this book at about the same time as Epistasis and the Evolutionary Process…and to my knowledge I’m the only person who has a very deep interest in statistical epistasis and Mycenaean Greece (if there is someone else out there, do tell).

In any case, when I was first focused on this topic genomics was in its infancy. Papers with 50,000 SNPs in humans were all the rage, and the HapMap paper had literally just been published. A lot has changed.

So I was interested to see this come out in Science, Negative selection in humans and fruit flies involves synergistic epistasis (preprint version). Since the authors are looking at humans and Drosophila and because it’s 2017 I assumed that genomic methods would loom large, and they do.

And as always on the first read through some of the terminology got confusing (various types of statistical epistasis keep getting renamed every few years it seems to me, and it’s hard to keep track of everything). So I went to Google. And because it’s 2017 a citation of the paper and further elucidation popped up in Google Books in Crumbling Genome: The Impact of Deleterious Mutations on Humans. Weirdly, or not, the book has not been published yet. Since the author is the second to last author on the above paper it makes sense that it would be cited in any case.

So what’s happening in this paper? Basically they are looking for reduced variance in the number of really bad mutations per individual, because a particular type of epistasis amplifies their deleterious impact (fitness is almost always really hard to measure, so you want to look at proxy variables).

Because de novo mutations are rare, they estimate about 7 are in functional regions of the genome (I think this may be high actually), and that the distribution should be Poisson. This distribution just tells you that the mean number of mutations and the variance of the number of mutations should be the same (e.g., mean should be 5 and variance should be 5).

Epistasis refers (usually) to interactions across loci. That is, different genes at different locations in the genome. Synergistic epistasis means that the total cumulative fitness after each successive mutation drops faster than the sum of the negative impact of each mutation. In other words, the negative impact is greater than the sum of its parts. In contrast, antagonistic epistasis produces a situation where new mutations on the tail of the distributions cause a lower decrement in fitness than you’d expect through the sum of its parts (diminishing returns on mutational load when it comes to fitness decrements).

These two dynamics have an effect on the linkage disequilibrium (LD) statistic. This measures the association of two different alleles at two different loci. When populations are recently admixed (e.g., Brazilians) you have a lot of LD because racial ancestry results in lots of distinctive alleles being associated with each other across genomic segments in haplotypes. It takes many generations for recombination to break apart these associations so that allelic state at one locus can’t be used to predict the odds of the state at what was an associated locus. What synergistic epistasis does is disassociate deleterious mutations. In contrast, antagonistic epistasis results in increased association of deleterious mutations.
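To make the LD statistic concrete, here is a minimal sketch of the textbook two-locus calculation: D is the frequency of haplotypes carrying both alleles minus the product of the two allele frequencies, and r² rescales D² by the allele-frequency variances. The function name and the 0/1 haplotype encoding are my own choices for illustration, not anything taken from the paper.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r^2) for two biallelic loci.

    haplotypes: list of (a, b) pairs, where a and b are 1 if the
    haplotype carries the allele of interest at locus A / locus B, else 0.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                       # excess of "both alleles together"
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0   # undefined if either locus is fixed
    return d, r2


# Alleles always travel together: maximal association.
linked = [(1, 1)] * 50 + [(0, 0)] * 50
# All four combinations equally common: no association.
unlinked = [(1, 1)] * 25 + [(1, 0)] * 25 + [(0, 1)] * 25 + [(0, 0)] * 25

print(linkage_disequilibrium(linked))
print(linkage_disequilibrium(unlinked))
```

In this notation, "disassociating" deleterious mutations means pushing D between them below zero, and increased association means pushing it above zero.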

Why? Because of selection. If a greater number of mutations means huge fitness hits, then there will be strong selection against individuals who randomly segregate out with higher mutational loads. This means that the variance of the mutational load is going to be lower than the value of the mean.

How do they figure out mutational load? They focus on the distribution of LoF mutations. These are extremely deleterious mutations which are the most likely to be a major problem for function and therefore a huge fitness hit. What they found was that the distribution of LoF mutations exhibited a variance which was 90-95% of a null Poisson distribution. In other words, there was stronger selection against high mutation counts, as one would predict due to synergistic epistasis.
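The logic of the last few paragraphs can be checked numerically. The sketch below starts from a Poisson distribution of mutation counts, reweights it by a toy fitness function, and reports the variance-to-mean ratio among survivors. The fitness form w(n) = exp(-(s·n + b·n²/2)) and the values of s and b are assumptions I picked for illustration (b > 0 is synergistic, b = 0 is independent effects, b < 0 is antagonistic); they are not estimates from the paper.

```python
import math


def poisson_pmf(lam, max_n=60):
    """Poisson probabilities for 0..max_n mutations (the tail beyond is negligible here)."""
    probs = [math.exp(-lam)]
    for k in range(1, max_n + 1):
        probs.append(probs[-1] * lam / k)
    return probs


def after_selection(probs, s, b):
    """Reweight counts by the toy fitness exp(-(s*n + b*n*n/2)) and renormalise."""
    weighted = [p * math.exp(-(s * n + 0.5 * b * n * n)) for n, p in enumerate(probs)]
    total = sum(weighted)
    return [w / total for w in weighted]


def variance_to_mean(probs):
    """Dispersion index: 1 for a Poisson, below 1 if underdispersed."""
    mean = sum(n * p for n, p in enumerate(probs))
    var = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
    return var / mean


before = poisson_pmf(7.0)
# Note: with b < 0 this toy fitness turns upward again past n = 20,
# but the Poisson weight out there is vanishingly small.
for label, b in [("synergistic", 0.02), ("no epistasis", 0.0), ("antagonistic", -0.01)]:
    ratio = variance_to_mean(after_selection(before, s=0.2, b=b))
    print(f"{label:>13}: variance/mean = {ratio:.3f}")
```

With independent effects the survivors are still Poisson, so the ratio stays at 1. Only the synergistic case pulls the variance below the mean, which is the signature the authors went looking for.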

They conclude:

Thus, the average human should carry at least seven de novo deleterious mutations. If natural selection acts on each mutation independently, the resulting mutation load and loss in average fitness are inconsistent with the existence of the human population (1 − e^{−7} > 0.99). To resolve this paradox, it is sufficient to assume that the fitness landscape is flat only outside the zone where all the genotypes actually present are contained, so that selection within the population proceeds as if epistasis were absent (20, 25). However, our findings suggest that synergistic epistasis affects even the part of the fitness landscape that corresponds to genotypes that are actually present in the population.
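The arithmetic in that parenthetical is worth spelling out. Under the classical result for independently acting deleterious mutations, mean fitness relative to a mutation-free genotype is e^{−U}, where U is the number of new deleterious mutations per genome per generation. A quick check, taking U = 7 from the quote (the variable names are mine):

```python
import math

U = 7                         # new deleterious mutations per genome, from the quote
mean_fitness = math.exp(-U)   # relative to a mutation-free genotype, independent effects
load = 1 - mean_fitness       # fraction of fitness lost to mutation

print(f"mean fitness = {mean_fitness:.6f}, load = {load:.6f}")
```

That load is well above 0.99, which is why independent selection on each mutation looks incompatible with a viable population.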

Overall this is fascinating, because evolutionary genetic questions which were still theoretical a little over ten years ago are now being explored with genomic methods. This is part of why I say genomics did not fundamentally revolutionize how we understand evolution. There were plenty of models and theories. Now we are testing them extremely robustly and thoroughly.

Addendum: Reading this paper reinforces to me how difficult it is to keep up with the literature, and how important it is to know the literature in a very narrow area to get the most out of a paper. Really the citations are essential reading for someone like me who just “drops” into a topic after a long time away….

Citation: Science, Negative selection in humans and fruit flies involves synergistic epistasis.

April 23, 2017

Why the rate of evolution may only depend on mutation

Filed under: Evolutionary Genetics,Genetics,Population genetics — Razib Khan @ 10:07 pm

Sometimes people think evolution is about dinosaurs.

It is true that natural history plays an important role in inspiring and directing our understanding of evolutionary process. Charles Darwin was a natural historian, and evolutionary biologists often have strong affinities with the natural world and its history. Though many people exhibit a fascination with the flora and fauna around us during childhood, often the greatest biologists retain this wonderment well into adulthood (if you read W. D. Hamilton’s collections of papers, Narrow Roads of Gene Land, which have autobiographical sketches, this is very evidently true of him).

But another aspect of evolutionary biology, which began in the early 20th century, is the emergence of formal mathematical systems of analysis. So you have fields such as phylogenetics, which have gone from intuitive and aesthetic trees of life, to inferences made using the most new-fangled Bayesian techniques. And, as told in The Origins of Theoretical Population Genetics, in the 1920s and 1930s a few mathematically oriented biologists constructed much of the formal scaffold upon which the Neo-Darwinian Synthesis was constructed.

The product of evolution

At the highest level of analysis evolutionary process can be described beautifully. Evolution is beautiful, in that its end product generates the diversity of life around us. But a formal mathematical framework is often needed to clearly and precisely model evolution, and so allow us to make predictions. R. A. Fisher’s aim when he wrote The Genetical Theory of Natural Selection was to create for evolutionary biology something equivalent to the laws of thermodynamics. I don’t really think he succeeded in that, though there are plenty of debates around something like Fisher’s fundamental theorem of natural selection.

But the revolution of thought that Fisher, Sewall Wright, and J. B. S. Haldane unleashed has had real yields. As geneticists they helped us reconceptualize evolutionary process as more than simply heritable morphological change, but an analysis of the units of heritability themselves, genetic variation. That is, evolution can be imagined as the study of the forces which shape changes in allele frequencies over time. This reduces a big domain down to a much simpler one.

Genetic variation is a concrete currency with which one can track evolutionary process. Initially this was done via inferred correlations between marker traits and particular genes in breeding experiments. Ergo, the origins of “the fly room”.

But with the discovery of DNA as the physical substrate of genetic inheritance in the 1950s the scene was set for the revolution in molecular biology, which also touched evolutionary studies with the explosion of more powerful assays. Lewontin & Hubby’s 1966 paper triggered an order of magnitude increase in our understanding of molecular evolution through both theory and results.

The theoretical side occurred in the form of the development of the neutral theory of molecular evolution, which also gave birth to the nearly neutral theory. Both of these theories hold that most of the variation within and between species at polymorphic sites is due to random processes. In particular, genetic drift. As a null hypothesis neutrality was very dominant for the past generation, though in recent years some researchers are suggesting that selection has been undervalued as a parameter for various reasons.

Setting aside the live scientific debates, which continue to this day, one of the predictions of neutral theory is that the rate of evolution will depend only on the rate of mutation. More precisely, the rate of substitution of new mutations (where the allele goes from a single copy to fixation of ~100%) is proportional to the rate of mutation of new alleles. Population size doesn’t matter.

The algebra behind this is straightforward.

First, remember that the frequency of a new mutation within a population is \frac{1}{2N}, where N is the population size (the 2 is because we’re assuming diploid organisms with two gene copies). This is also the probability of fixation of a new mutation in a neutral scenario; its probability is just proportional to its initial frequency (it’s a random walk process between 0 and 1.0 proportions). The rate of mutations is defined by \mu, the number of expected mutations at a given site per generation (this is a pretty small value, for humans it’s on the order of 10^{-8}). Again, there are 2N gene copies, so you have 2N\mu to count the number of new mutations.

The probability of fixation of a new mutation multiplied by the number of new mutations is:

    \[ \left( \frac{1}{2N} \right) \times 2N\mu = \mu \]

So there you have it. The rate of fixation of these new mutations is just a function of the rate of mutation.
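If the algebra feels too slick, a small simulation can back it up. The sketch below follows single new neutral mutations through Wright-Fisher resampling and checks that the fraction reaching fixation is close to 1/(2N). The function names, the tiny population size, the trial count and the seed are all arbitrary choices of mine to keep it fast.

```python
import random


def fixes(two_n, rng):
    """Follow one new neutral mutation (1 copy among two_n) until loss or fixation."""
    count = 1
    while 0 < count < two_n:
        p = count / two_n
        # Wright-Fisher resampling: each gene copy in the next generation
        # descends from a mutant copy with probability p.
        count = sum(1 for _ in range(two_n) if rng.random() < p)
    return count == two_n


def fixation_probability(n_diploid, trials=20000, seed=1):
    """Fraction of new neutral mutations that eventually fix."""
    rng = random.Random(seed)
    two_n = 2 * n_diploid
    return sum(fixes(two_n, rng) for _ in range(trials)) / trials


def substitution_rate(n_diploid, mu):
    """New mutations per generation (2N * mu) times their fixation probability (1 / 2N)."""
    return (2 * n_diploid * mu) * (1 / (2 * n_diploid))


print("simulated fixation probability, N = 10:", fixation_probability(10))
print("expected 1/(2N):", 1 / 20)
print("substitution rate, N = 10:       ", substitution_rate(10, 1e-8))
print("substitution rate, N = 1,000,000:", substitution_rate(1_000_000, 1e-8))
```

The population size cancels in `substitution_rate`, so a population of ten and a population of a million substitute neutral mutations at the same rate, \mu.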

Simple formalisms like this have a lot more gnarly math that extends them and from which they derive. But they’re often pretty useful to gain a general intuition of evolutionary processes. If you are genuinely curious, I would recommend Elements of Evolutionary Genetics. It’s not quite a core dump, but it is a way you can borrow the brains of two of the best evolutionary geneticists of their generation.

Also, you will be able to answer the questions on my survey better the next time!

April 12, 2017

Fisherianism in the genomic era

Filed under: Evolutionary Genetics,Genetics — Razib Khan @ 1:07 am

There are many things about R. A. Fisher that one could say. Professionally he was one of the founders of evolutionary genetics and statistics, and arguably the second greatest evolutionary biologist after Charles Darwin. With his work in the first few decades of the 20th century he reconciled the quantitative evolutionary framework of the school of biometry with mechanistic genetics, and formalized evolutionary theory in The Genetical Theory of Natural Selection.

He was also an asshole. This is clear in the major biography of him, R.A. Fisher: The Life of a Scientist. It was written by his daughter. But The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century also seems to indicate he was a dick. And W. D. Hamilton’s Narrow Roads of Gene Land portrays Fisher as rather cold and distant, despite the fact that Hamilton idolized him.

Notwithstanding his unpleasant personality, R. A. Fisher seems to have been a veritable mentat in his early years. Much of his thinking crystallized in the first few decades of the 20th century, when genetics was a new science and mathematical methods were being brought to bear on a host of topics. It would be decades until DNA was understood to be the substrate of heredity. Instead of deriving from molecular first principles which were simply not known in that day, Fisher and his colleagues constructed a theoretical formal edifice which drew upon patterns of inheritance that were evident in lineages of organisms that they could observe around them (Fisher had a mouse colony which he utilized now and then to vent his anger by crushing mice with his bare hands). Upon that observational scaffold they placed a sturdy superstructure of mathematical formality. That edifice has been surprisingly robust down to the present day.

One of Fisher’s frameworks which still gives insight is the geometric model of the distribution of fitness of mutations. If an organism is near its optimum of fitness, then large jumps in any direction will reduce its fitness. In contrast, small jumps have some probability of getting closer to the optimum of fitness. In plainer language, mutations of large effect are bad, and mutations of small effect are not as bad.
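The geometric intuition is easy to simulate. In the sketch below an organism sits at distance d from the optimum in a space of n traits, a mutation is a step of size r in a random direction, and it counts as beneficial if it lands closer to the optimum. The function name, parameter values and seed are my own illustrative assumptions.

```python
import math
import random


def prob_beneficial(r, n_traits, d=1.0, trials=20000, seed=1):
    """Fraction of random mutations of size r that move closer to the optimum."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # A random direction: normalise a vector of independent Gaussians.
        step = [rng.gauss(0.0, 1.0) for _ in range(n_traits)]
        norm = math.sqrt(sum(x * x for x in step))
        # The organism starts at (d, 0, ..., 0); the optimum is the origin.
        new_sq = 0.0
        for i, x in enumerate(step):
            coord = (d if i == 0 else 0.0) + r * x / norm
            new_sq += coord * coord
        if new_sq < d * d:
            hits += 1
    return hits / trials


print("small step, 10 traits: ", prob_beneficial(0.05, 10))
print("small step, 100 traits:", prob_beneficial(0.05, 100))
print("huge step, 10 traits:  ", prob_beneficial(2.5, 10))
```

Very small steps are beneficial a bit less than half the time. A step larger than twice the distance to the optimum can never help. Adding trait dimensions lowers the odds for a step of the same size.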

A new paper in PNAS loops back to this framework, Determining the factors driving selective effects of new nonsynonymous mutations:

Our study addresses two fundamental questions regarding the effect of random mutations on fitness: First, do fitness effects differ between species when controlling for demographic effects? Second, what are the responsible biological factors? We show that amino acid-changing mutations in humans are, on average, more deleterious than mutations in Drosophila. We demonstrate that the only theoretical model that is fully consistent with our results is Fisher’s geometrical model. This result indicates that species complexity, as well as distance of the population to the fitness optimum, modulated by long-term population size, are the key drivers of the fitness effects of new amino acid mutations. Other factors, like protein stability and mutational robustness, do not play a dominant role.

In the title of the paper itself is something that would have been alien to Fisher’s understanding when he formulated his geometric model: the term “nonsynonymous” to refer to mutations which change the amino acid corresponding to the triplet codon. The paper is understandably larded with terminology from the post-DNA and post-genomic era, and yet comes to the conclusion that a nearly blind statistical geneticist from about a century ago correctly adduced the nature of mutation’s effects on fitness in organisms.

The authors focused on two primary species with different histories, both well characterized in the evolutionary genomic literature: humans and Drosophila. The models they tested are as follows:

 

Basically they checked the empirical distribution of the site frequency spectra (SFS) of the nonsynonymous variants against expected outcomes based on particular details of demographics, which were inferred from synonymous variation. Drosophila have effective population sizes orders of magnitude larger than humans, so if that is not taken into account, then the results will be off. There are also a bunch of simulations in the paper to check for robustness of their results, and they also caveat the conclusion with admissions that other models besides the Fisherian one may play some role in their focal species, and more in other taxa. A lot of this strikes me as accruing through the review process, and I don’t have the time to replicate all the details to confirm their results, though I hope some of the reviewers did so (again, I suspect that the reviewers were demanding some of these checks, so they definitely should have in my opinion).
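For readers who have not met the site frequency spectrum before, here is a minimal sketch of how one is tabulated, alongside the simplest textbook expectation (proportional to 1/i for a neutral, constant-size population). That constant-size baseline is my simplifying assumption; the paper instead fits a demographic model to the synonymous sites. The function names and toy data are hypothetical.

```python
from collections import Counter


def site_frequency_spectrum(sites):
    """Tabulate the unfolded SFS.

    sites: list of sites, each a list of 0/1 derived-allele indicators,
    one entry per sampled chromosome. Returns a list whose entry i-1 is
    the number of sites with exactly i derived copies, for i = 1..n-1.
    """
    n = len(sites[0])
    counts = Counter(sum(site) for site in sites)
    return [counts.get(i, 0) for i in range(1, n)]


def neutral_expectation(n, n_segregating):
    """Expected SFS under neutrality and constant size: proportional to 1/i."""
    weights = [1 / i for i in range(1, n)]
    total = sum(weights)
    return [n_segregating * w / total for w in weights]


# Six toy sites scored in four chromosomes; the last is monomorphic and ignored.
toy_sites = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
observed = site_frequency_spectrum(toy_sites)
expected = neutral_expectation(4, sum(observed))
print("observed:", observed)
print("expected:", [round(x, 2) for x in expected])
```

An excess of rare variants relative to the baseline is the kind of skew that purifying selection on nonsynonymous sites would produce, but demography can produce it too, which is why the population-size correction matters so much here.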

In the Fisherian model more complex organisms are more fine-tuned due to pleiotropy and other such dynamics. So new mutations are more likely to deviate away from the optimum. This is the major finding that they confirmed. What does “complex” mean? The Drosophila genome is less than 10% of the human genome’s size, but the migratory locust has twice as large a genome as humans, while wheat has a sequence more than five times as large. But organism to organism, it does seem that Drosophila has less complexity than humans. And they checked with other organisms besides their two focal ones…though the genomes there are not as complete presumably.

As I indicated above, the authors believe they’ve checked for factors such as background selection, which may confound selection coefficients on specific mutations. The paper is interesting as much for the fact that it illustrates how powerful analytic techniques developed in a pre-DNA era were. Some of the models above are mechanistic, and require a certain understanding of the nature of molecular processes. And yet they don’t seem as predictive as a more abstract framework!

Citation: Christian D. Huber, Bernard Y. Kim, Clare D. Marsden, and Kirk E. Lohmueller, Determining the factors driving selective effects of new nonsynonymous mutations PNAS 2017 ; published ahead of print April 11, 2017, doi:10.1073/pnas.1619508114

November 13, 2013

The color of life as a coincidence

Filed under: Anthroplogy,Evolution,Evolutionary Genetics,Genetics of taste,Taste — Razib Khan @ 12:35 am

Credit: Eric Hunt


I do love me some sprouts! Greens, bitters, strong flavors of all sorts. I’ve always been like this. Some of this is surely environment. My family comes from a part of South Asia known for its love of bracing and bold sensation. But perhaps I was born this way? There’s a fair amount of evidence that taste has a substantial genetic component. This does not mean genes determine what one tastes, but it certainly opens the door for passive gene-environment correlations. If you do not find a flavor offensive, you are much more likely to explore its depths, and cultivate your palate.


Dost thou dare?
Credit: W.A. Djatmiko

And of course I’m not the only one with a deep interest in such questions. With the marginal income available to us many Americans have become “foodies,” searching for flavor bursts and novelties which their ancestors might never have been able to comprehend. More deeply in a philosophical sense the question of qualia reemerges if there is a predictable degree of inter-subjectivity in taste perception (OK, qualia is always there, though scientific sorts tend to view it as intractable in a fundamental sense).


But there’s heritability, and then there’s genes. We know that perception in some ways is heritable, but what is perhaps more interesting is if you can peg a specific genomic location to it. Then the evolutionary story becomes all the richer. And so it is with the locus TAS2R16, where a nonsynonymous mutation at location 516 seems to result in heightened sensitivity to bitter tastes. More specifically, it’s rs846664, and the derived T allele is fixed outside of Africa, while the ancestral G allele still segregates at appreciable fractions within African populations. A new paper in Molecular Biology and Evolution puts this locus under a microscope, though it does not come up with any clear conclusions. Origin and Differential Selection of Allelic Variation at TAS2R16 Associated with Salicin Bitter Taste Sensitivity in Africa presents some interesting findings. First, let’s look at the distribution of the variation in their sample populations at the SNP of most particular interest:

Region                 Population           T516G
Outside of Africa      Non-Africans         0.000
Ethiopia               Semitic              0.059
Tanzania               Sandawe              0.083
Ethiopia               Omotic               0.093
Ethiopia               Cushitic             0.095
Tanzania               Iraqw                0.111
West Central Africa    Fulani               0.114
Kenya                  Niger-Kordofanian    0.133
Ethiopia               Nilo-Saharan         0.156
Kenya                  Afroasiatic          0.162
West Central Africa    Niger-Kordofanian    0.214
Kenya                  Nilo-Saharan         0.225
Kenya                  Luo                  0.250
Central Africa         Niger-Kordofanian    0.329
Tanzania               Hadza                0.333
Central Africa         Bulala               0.361
Central Africa         Nilo-Saharan         0.367
West Central Africa    Afroasiatic          0.462
West Central Africa    Nilo-Saharan         0.500

As you can see T is fixed outside of Africa, and varies across many African populations. Previous work implied this, though coverage within Africa was not good. One thing to observe though is that the frequency of the T allele within Africa can not be explained by recent Eurasian admixture. The frequency is way too high for that to be the sole explanation, and in any case there is no evidence that ~33% of the Hadza’s ancestry is of Eurasian provenance (the Hadza being one of the three major groups of African hunter-gatherers, along with the Bushmen and Pygmies).

Within the paper the authors resequenced ~1,000 base pairs across diverse African populations in an exonic region of this gene (the stuff that codes for amino acids). What they discovered is that of the SNPs segregating, 516 in particular was critical in effecting phenotypic change. Not only did individuals with the T variant notably exhibit stronger bitter sensitivity, but in vitro expression with a reporter was elevated. Because they had such a densely sequenced genomic region they could perform various nucleotide-based tests to detect natural selection, and attempt coalescent models to infer genealogical history.

I’m going to spare you some of the gory details at this point. Here’s what they found. First, it does look like the region is under natural selection in many African populations, in particular, the derived haplotype with T at 516 at the center. But this result is not reproduced across all tests. The coalescent simulations make clear why: the mutation is an old variant with deep roots in the hominin lineage. In other words this variation pre-dates H. sapiens. It looks like the T allele has rapidly increased in frequency relatively recently, though more on the order of ~50,000 years, rather than ~10,000.* Basically around the time of the “Out of Africa” event. Additionally, there’s a tell-tale sign that this is being subject to selection within Africa: the genetic differences across populations at TAS2R16 far exceed the genome-wide values (the Fst at this locus is in the top 1% of loci within the African genome). Finally, one should note that the G allele haplotypes seem to be much more strongly constrained, as if they’re under purifying selection. This means that the switch to T is not all gain.
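Since Fst carries a lot of weight in that argument, here is a minimal sketch of the simplest single-locus version, Fst = (H_T − H_S) / H_T, which compares expected heterozygosity in the pooled population to the average within populations. This is the textbook definition with equal population weights, not necessarily the estimator the authors used. The three frequencies are picked from the table above purely as an example.

```python
def fst(freqs):
    """Single-locus Fst for a biallelic site, weighting populations equally.

    freqs: list of allele frequencies, one per population.
    """
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)                              # pooled heterozygosity
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)     # mean within-population
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


# Ethiopia Semitic, Tanzania Hadza, West Central Africa Nilo-Saharan.
example = [0.059, 0.333, 0.500]
print(f"Fst across three populations: {fst(example):.3f}")
print(f"identical frequencies:        {fst([0.3, 0.3, 0.3]):.3f}")
print(f"fixed for opposite alleles:   {fst([0.0, 1.0]):.3f}")
```

The statistic runs from 0 when every population has the same frequency to 1 when populations are fixed for different alleles. A locus in the top 1% genome-wide is one where frequencies have diverged far more than drift alone usually manages.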

At this point you may be ready for a story about how some African populations, like Eurasians, underwent a lifestyle change, and diet changes resulted in a shift in sensory perception. That does not seem to be the story. Rather, the authors did not seem to be able to agree upon a neat explanation for what is driving these recent sweeps up from ancient standing genetic variation. They do observe that the variation does tend to cluster geographically, more so than the genome-wide results would imply. There’s likely some adaptation going on, they simply don’t know what. In the introduction and elsewhere you can see that variation at TAS2R16 does correlate with other traits. Not too surprising due to the relative ubiquity of pleiotropy; one gene with many effects.

Stepping outside of the implications of this specific result, let’s think about what might be a takeaway: something as essential as taste perception might be a side effect of other aspects of evolutionary processes. In other words, we don’t know what the phenotypic target of selection is in this case, but we do have a good handle one of the major side effects, which is sensory perception. How one taste seems like a big deal.** Andthere have been many theories propounded that variation in bitter sensitivity is due to adaptation to poisonous plants and such, but really no one knew, and that was just the most plausible of low hanging fruit. With these results from Africa, where there is more variation in the trait and genes, and good geographic coverage, that seems to be an implausible model to adhere to (one would think the hunter-gatherer Hadza would exhibit the most sensitivity, no?). Many of the traits and tendencies which we humans see as fundamental, essential, and of great import, many actually be side effects of powerful evolutionary forces hammering at the genetic-correlation matrices which define the hidden network of co-dependencies within the genome. So there, I said it. Life is an accident. Enjoy it.

Citation: Campbell, Michael C., et al. “Origin and Differential Selection of Allelic Variation at TAS2R16 Associated with Salicin Bitter Taste Sensitivity in Africa.” Molecular biology and evolution (2013): mst211.

* If it was closer to ~10,000 I think haplotype based tests would come back with something, but they do not.

** Some Epicureans might be accused of reducing the good to taste!

The post The color of life as a coincidence appeared first on Gene Expression.

The color of life as a coincidence

Filed under: Anthroplogy,Evolution,Evolutionary Genetics,Genetics of taste,Taste — Razib Khan @ 12:35 am

Credit: Eric Hunt

I do love me some sprouts! Greens, bitters, strong flavors of all sorts. I’ve always been like this. Some of this is surely environment. My family comes from a part of South Asia known for its love of bracing and bold sensation. But perhaps I was born this way? There’s a fair amount of evidence that taste has a substantial genetic component. This does not mean genes determine what one tastes, but it certainly opens the door for passive gene-environment correlations. If you do not find a flavor offensive, you are much more likely to explore its depths, and cultivate your palate.

Dost thou dare?
Credit: W.A. Djatmiko

And of course I’m not the only one with a deep interest in such questions. With the marginal income available to us many Americans have become “foodies,” searching for flavor bursts and novelties which their ancestors might never have been able to comprehend. More deeply in a philosophical sense the question of qualia reemerges if there is a predictable degree of inter-subjectivity in taste perception (OK, qualia is always there, though scientific sorts tend to view it as intractable in a fundamental sense).


But there’s heritability, and then there’s genes. We know that perception in some ways is heritable, but what is perhaps more interesting is if you can peg a specific genomic location to it. Then the evolutionary story becomes all the richer. And so it is with the locus TAS2R16, where a nonsynonymous mutation at location 516 seems to result in heightened sensitivity to bitter tastes. More specifically, it’s rs846664, and the derived T allele is fixed outside of Africa, while the ancestral G allele still segregates at appreciable fractions within African populations. A new paper in Molecular Biology and Evolution puts this locus under a microscope, though it does not come up with any clear conclusions. Origin and Differential Selection of Allelic Variation at TAS2R16 Associated with Salicin Bitter Taste Sensitivity in Africa presents some interesting findings. First, let’s look at the distribution of the variation in their sample populations at the SNP of most particular interest:

Region                 Population          T516G (frequency of the G allele)
Outside of Africa      Non-Africans        0.000
Ethiopia               Semitic             0.059
Tanzania               Sandawe             0.083
Ethiopia               Omotic              0.093
Ethiopia               Cushitic            0.095
Tanzania               Iraqw               0.111
West Central Africa    Fulani              0.114
Kenya                  Niger-Kordofanian   0.133
Ethiopia               Nilo-Saharan        0.156
Kenya                  Afroasiatic         0.162
West Central Africa    Niger-Kordofanian   0.214
Kenya                  Nilo-Saharan        0.225
Kenya                  Luo                 0.250
Central Africa         Niger-Kordofanian   0.329
Tanzania               Hadza               0.333
Central Africa         Bulala              0.361
Central Africa         Nilo-Saharan        0.367
West Central Africa    Afroasiatic         0.462
West Central Africa    Nilo-Saharan        0.500

As you can see T is fixed outside of Africa, and varies across many African populations. Previous work implied this, though coverage within Africa was not good. One thing to observe though is that the frequency of T within Africa can not be explained by recent Eurasian admixture. The frequency is way too high for that to be the sole explanation, and in any case there is no evidence that ~33% of the Hadza’s ancestry is of Eurasian provenance (the Hadza being one of the three major groups of African hunter-gatherers, along with the Bushmen and Pygmies).

Within the paper the authors resequenced ~1,000 base pairs across diverse African populations in an exonic region of this gene (the stuff that codes for amino acids). What they discovered is that of the SNPs segregating, 516 in particular was critical to effecting phenotypic change. Not only did individuals with the T variant notably exhibit stronger bitter sensitivity, but in vitro expression with a reporter was elevated. Because they had such a dense genomic region they could perform various nucleotide-based tests to detect natural selection, and attempt coalescent models to infer genealogical history.

I’m going to spare you some of the gory details at this point. Here’s what they found. First, it does look like the region is under natural selection in many African populations, in particular, the derived haplotype with T at 516 at the center. But this result is not reproduced across all tests. The coalescent simulations make clear why: the mutation is an old variant with deep roots in the hominin lineage. In other words this variation pre-dates H. sapiens. It looks like the T allele has rapidly increased in frequency relatively recently, though more on the order of ~50,000 years, rather than ~10,000.* Basically around the time of the “Out of Africa” event. Additionally, there’s a tell-tale sign that this is being subject to selection within Africa: the genetic differences across populations at TAS2R16 far exceed the genome-wide values (the Fst at this locus is in the top 1% of loci within the African genome). Finally, one should note that the G allele haplotypes seem to be much more strongly constrained, as if they’re under purifying selection. This means that the switch to T is not all gain.
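To make the Fst comparison a bit more concrete, here is a minimal Python sketch of Wright's Fst for a single biallelic site, computed from the G allele frequencies of the African populations in the table above. This is not the estimator used in the paper: it assumes every population gets equal weight and ignores sample sizes, and the function name `fst_biallelic` is made up for illustration.

```python
def fst_biallelic(freqs):
    """Wright's F_ST at one biallelic site, weighting populations equally.

    freqs: allele frequency of the same allele in each population.
    Assumption: no correction for sample size (the source gives none).
    """
    if not freqs:
        raise ValueError("need at least one population")
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity if all populations were pooled into one.
    h_total = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within each population.
    h_within = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    if h_total == 0:
        return 0.0  # the site is monomorphic everywhere
    return (h_total - h_within) / h_total


# G allele frequencies at T516G for the 18 African groups in the table.
african_g_freqs = [
    0.059, 0.083, 0.093, 0.095, 0.111, 0.114, 0.133, 0.156, 0.162,
    0.214, 0.225, 0.250, 0.329, 0.333, 0.361, 0.367, 0.462, 0.500,
]

print(f"single-site F_ST across African groups: {fst_biallelic(african_g_freqs):.3f}")
```

A single-site value like this only means something relative to the genome-wide distribution, which is the comparison the authors actually make.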

At this point you may be ready for a story about how some African populations, like Eurasians, underwent a lifestyle change, and diet changes resulted in a shift in sensory perception. That does not seem to be the story. Rather, the authors did not seem to be able to agree upon a neat explanation for what is driving these recent sweeps up from ancient standing genetic variation. They do observe that the variation does tend to cluster geographically, more so than the genome-wide results would imply. There’s likely some adaptation going on, they simply don’t know what. In the introduction and elsewhere you can see that variation at TAS2R16 does correlate with other traits. Not too surprising due to the relative ubiquity of pleiotropy: one gene with many effects.

Stepping outside of the implications of this specific result, let’s think about what might be a takeaway: something as essential as taste perception might be a side effect of other aspects of evolutionary processes. In other words, we don’t know what the phenotypic target of selection is in this case, but we do have a good handle on one of the major side effects, which is sensory perception. How one tastes seems like a big deal.** And there have been many theories propounded that variation in bitter sensitivity is due to adaptation to poisonous plants and such, but really no one knew, and that was just the most plausible of low hanging fruit. With these results from Africa, where there is more variation in the trait and genes, and good geographic coverage, that seems to be an implausible model to adhere to (one would think the hunter-gatherer Hadza would exhibit the most sensitivity, no?). Many of the traits and tendencies which we humans see as fundamental, essential, and of great import, may actually be side effects of powerful evolutionary forces hammering at the genetic-correlation matrices which define the hidden network of co-dependencies within the genome. So there, I said it. Life is an accident. Enjoy it.

Citation: Campbell, Michael C., et al. “Origin and Differential Selection of Allelic Variation at TAS2R16 Associated with Salicin Bitter Taste Sensitivity in Africa.” Molecular Biology and Evolution (2013): mst211.

* If it was closer to ~10,000 I think haplotype based tests would come back with something, but they do not.

** Some Epicureans might be accused of reducing the good to taste!

The post The color of life as a coincidence appeared first on Gene Expression.

December 21, 2012

The causes of evolutionary genetics

A few days ago I was browsing Haldane’s Sieve, when I stumbled upon an amusing discussion which arose on its “About” page. This “inside baseball” banter got me to thinking about my own intellectual evolution. Over the past few years I’ve been delving more deeply into phylogenetics and phylogeography, enabled by the rise of genomics, the proliferation of ‘big data,’ and accessible software packages. This entailed an opportunity cost. I did not spend as much time focusing on classical population and evolutionary genetic questions. Strewn about my room are various textbooks and monographs I’ve collected over the years, and which have fed my intellectual growth. But I must admit that it is a rare day now that I browse Hartl and Clark or The Genetical Theory of Natural Selection without specific aim or mercenary intent.

R. A. Fisher

Like a river inexorably coursing over a floodplain, with the turning of the new year it is now time to take a great bend, and double-back to my roots, such as they are. This is one reason that I am now reading The Founders of Evolutionary Genetics. Fisher, Wright, and Haldane, are like old friends, faded, but not forgotten, while Muller was always but a passing acquaintance. But ideas 100 years old still have power to drive us to explore deep questions which remain unresolved, but where new methods and techniques may shed greater light. A study of the past does not allow us to make wise choices which can determine the future with any certitude, but it may at least increase the luminosity of the tools which we have to illuminate the depths of the darkness. The shape of nature may become just a bit less opaque through our various endeavors.

Figure from “Directional Positive Selection on an Allele of Arbitrary Dominance”, Teshima KM, Przeworski M

So what of this sieve of Haldane? As noted at Haldane’s Sieve the concept is simple. Imagine two mutations, one which expresses a trait in a recessive fashion, and another in a dominant one. The sieve operates by favoring the emergence of dominantly expressing variants out of the low frequency zone where stochastic forces predominate (i.e., even if an allele confers a large fitness benefit, at low frequencies the power of random chance may still imply that it is highly likely to go extinct). An example of this would be lactase persistence, which in the modal Eurasian variant seems to exhibit dominance. In the converse case, beneficial mutations which are recessive in expression suffer from a structural problem: their benefit is more theoretical than realized.

The mathematics of this is exceedingly simple, a consequence of the Hardy-Weinberg dynamics of diploid random mating organisms. Let’s use the gene which is implicated in variation in lactase persistence as an example, LCT. Consider two alleles, LP and LNP, where the former confers persistence (one can digest lactose sugar as an adult), and the latter manifests the conventional mammalian ‘wild type’ (the production of lactase ceases as one leaves the life stage when nursing is feasible). LP is clearly the novel mutant. In a small population it is not unimaginable that by random chance the frequency of LP rises to ~10%. What now? At HWE you have:

$p^2 + 2pq + q^2 = 1$, where $q$ is the frequency of the LP allele. At ~10% the numbers substituted would be:

$(0.90)^2 + 2(0.90)(0.10) + (0.10)^2$

This is where dominance or recessive expression is highly relevant. The reality is that LP is a dominant trait. So in this population the frequency of LP as a trait would be:

$(0.10)^2 + 2(0.90)(0.10) = 0.19$, that is, 19%

Now imagine a model where LP is favored, but it expresses in a recessive fashion. Then the frequency of the trait would equal $q^2$, the homozygote LP-allele proportion. That is, 1%. Though population genetics is often constructed on an algebraic foundation, the results lend themselves to intuition. A structural parameter endogenous to the genetic system, dominant or recessive expression, can have longstanding consequences in terms of the likely trajectory of the alleles. Selection only “sees” the trait, so a recessive trait with sterling qualities may as well be a trait with no qualities. In contrast, a dominantly expressed allele can cut like a scythe through a population, because every copy “counts.”
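The arithmetic above can be sketched in a few lines of Python. This is only an illustration under the same assumptions as the text (a diploid, randomly mating population at Hardy-Weinberg equilibrium, two alleles); the function names are my own and do not come from any library.

```python
def genotype_frequencies(q):
    """Hardy-Weinberg genotype frequencies, with q the LP allele frequency."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    p = 1.0 - q  # frequency of the LNP allele
    return {"LNP/LNP": p * p, "LP/LNP": 2 * p * q, "LP/LP": q * q}


def trait_frequency(q, dominant):
    """Fraction of individuals expressing the LP trait.

    If LP is dominant, heterozygotes express it too; if it is recessive,
    only LP/LP homozygotes do.
    """
    g = genotype_frequencies(q)
    if dominant:
        return g["LP/LP"] + g["LP/LNP"]
    return g["LP/LP"]


for q in (0.01, 0.10, 0.50):
    dom = trait_frequency(q, dominant=True)
    rec = trait_frequency(q, dominant=False)
    print(f"q = {q:.2f}: dominant {dom:.4f}, recessive {rec:.4f}")
```

At low allele frequencies the gap between the two columns is what gives the sieve its bite: nearly every copy of a rare allele sits in a heterozygote, so a recessive trait is almost never exposed to selection.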

In preparation for this post I revisited the section on Haldane’s Sieve in the encyclopedic Elements of Evolutionary Genetics. The authors note that this phenomenon, though of vintage character as these things can be reckoned in a field as young as evolutionary genetics, is still a live one. The dominance of favored mutations in wild populations, or the recessive character of deleterious ones in laboratory stock, may reflect the different regimes which these two gene pools are subject to. The nature of things is such that it is easier to generate recessive mutations than dominant ones (i.e., loss is easier than gain), so the preponderance of dominant variants in wild stocks subject to positive selective pressure lends credence to the idea that evolutionary rather than developmental forces and constraints shape the genetic character of many species.

And yet things are not quite so tidy. Haldane’s Sieve, and the framework of dominant versus recessive alleles, operates differently in the area of sex chromosomes. In many lineages there is a ‘heterogametic sex’ which carries only one functional chromosome for most of the genome. In mammals this is the male (XY), while in birds this is the female (ZW). As males have only one functional copy of most genes on the sex chromosome, the masking effect of recessive expression does not apply to them in mammals. This may imply that because of the exposure of many deleterious recessive variants to natural selection within the heterogametic sex one would see different allelic distributions and genetic landscapes on these chromosomes (e.g., more rapid adaptation because of the exposure of nominally recessive alleles in the heterogametic sex, as well as more purifying selection on deleterious variants). But the reality is more complex, and the literature in this area is somewhat muddled. More precisely, it seems phylogenetically sensitive. Validation of the theory in mammals founders once one moves to Drosophila.

And that is why research in evolutionary genetics continues. The theory stimulates empirical exploration, and is tested against it. Much of the formal theory of classical evolutionary genetics, which crystallized in the years before World War II, is now gaining renewed relevance because of empirical testability in the era of big data and big computation. This is a domain where the past is not simply of interest to historians. Scientists themselves, chasing the next grant, and producing the expected stream of publications, may benefit from a little historical perspective by standing upon the shoulders of giants.

October 21, 2012

Buddy can you spare a selective sweep

The Pith: Natural selection comes in different flavors in its genetic constituents. Some of those constituents are more elusive than others. That makes “reading the label” a non-trivial activity.

As you may know when you look at patterns of variation in the genome of a given organism you can make various inferences from the nature of these patterns. But the power of those inferences is conditional on the details of the real demographic and evolutionary histories, as well as the assumptions made about the models which one is testing. When delving into the domain of population genomics some of the concepts and models may seem abstruse, but the reality is that such details are the stuff of which evolution is built. A new paper in PLoS Genetics may seem excessively esoteric and theoretical, but it speaks to very important processes which shape the evolutionary trajectory of a given population. The paper is titled Distinguishing between Selective Sweeps from Standing Variation and from a De Novo Mutation. Here’s the author summary:

Considerable effort has been devoted to detecting genes that are under natural selection, and hundreds of such genes have been identified in previous studies. Here, we present a method for extending ...

October 15, 2012

Don’t trust an archaeologist about genetics, don’t trust a geneticist about archaeology

Filed under: Anthroplogy,Evolutionary Genetics,Human Evolution — Razib Khan @ 1:38 pm

Who to trust? That is the question when you don’t know very much (all of us). Trust is precious, and to some extent sacred. That’s why I can flip out when I realize after the fact that someone more informed than me in field X sample-biased their argument in a way they knew was shady to support a proposition they were forwarding. What’s the point of that? Who cares if you win at a particular bull-session? You’re burning through cultural capital. And not that most of my interlocutors care, but I’m likely to never trust them again on anything.

In any case, this came to mind when I ran across a James Fallows post at The Atlantic. Here’s a screenshot of the appropriate section, with my underlines:


The PNAS link is wrong. The correspondent is actually linking to an article in Quaternary International. And they do point out that there are possible problems with draft quality sequences due to contamination. But I didn’t find the paper too persuasive. There are two issues. First, the Denisova genome is very good quality. So you can be more ...

October 2, 2012

What is going on with plant domestication?

Filed under: Domestication,Evolutionary Genetics — Razib Khan @ 11:44 pm

PNAS has a paper on barley domestication out right now. It is nicely open access, so read it yourself, and come right back! I have to admit that I did not like the paper too much. It seemed to derive far too many conclusions from a few rudimentary (for today at least) phylogenetic methods. In particular I’m very skeptical of the idea that there are two barley lineages here which diverged ~3 million years B.P. But this isn’t particularly strange when it comes to the phylogenetic origins of cultivars. There have been long debates about whether there was one origin for rice, or several. Setting aside my major issues with this paper I wonder if perhaps our expectations and prejudices derived from the fact that animals are to a great extent the “null” organisms are muddying our interpretation of results from plants. The number of loci here seem sufficient to dismiss the possibility of introgression, but I’m not sure that the rate of evolution across these markers is quite so clock-like.

In any case, to understand domestication, and I suspect human evolution, these results from plants are going to have to be cleared up and systematized. Illumination would be helpful, but until then ...

September 16, 2012

What the substrate tells

Filed under: Evolution,Evolutionary Genetics,Genetics,Genomics — Razib Khan @ 7:26 pm

One of the weird things about genetics is that it encompasses both the abstract and the concrete. The formal and physical. You can talk to a geneticist who is mostly interested in details of molecular mechanisms, and is steeped in structural biology. For these people genes are specific and material things. In contrast there are other geneticists who focus more on genes as units of analysis. In this case genes are semantic labels for the mediators within an intersection of phenomena. Recall that genetics predates the knowledge of its concrete substrate by 50 years! By the 1920s Mendelian genetics had been fused with evolutionary biology to create a systematic framework in which we could understand the patterns of inheritance across the generations. In the 1950s the DNA revolution was upon us, but as W. D. Hamilton recalls this had only a minimal impact on the evolutionary genetic thinkers of the era. With the Lewontin and Hubby allozyme paper in the mid-1960s this sort of benign disciplinary evasion was no longer possible; the field of molecular evolution came into its own.*

Today with genomics these human-imposed artificialities are fading away. Consider the concept of genetic recombination. Originally an ...

Nature’s Oracle finally out in 2013

Filed under: Evolutionary Genetics,Nature's Oracle,W. D. Hamilton — Razib Khan @ 1:03 pm

Jerry Coyne alerts me to the fact that Ullica Segerstrale’s Nature’s Oracle: A Life of W. D. Hamilton is finally near publication. Specifically, early 2013. Coyne has looked at the pre-publication text, so it is probably in revision, though the meat has already been laid upon the bones. Hamilton was one of the preeminent evolutionary biologists of the second half of the 20th century. Though to my knowledge he never wrote an autobiography as such the details of his life were liberally strewn out across dozens of books. You can find them in Segerstrale’s Defenders of the Truth: The Sociobiology Debate, or The Darwin Wars. He makes a cameo appearance in Robert Trivers’ Natural Selection and Social Theory, as well as The Price of Altruism, a scientific biography of Hamilton’s collaborator George Price.

But the best place to go for understanding Hamilton as he understood himself are his collected papers, which have biographical sections laying out the scientific, cultural, and historical context for a given publication. They are, in chronological order Narrow Roads of Gene Land: The Collected Papers of W. D. Hamilton Volume 1: Evolution of Social Behaviour, Narrow Roads of Gene ...

September 12, 2012

An ontology of genetic diversity

Filed under: Evolution,Evolutionary Genetics — Razib Khan @ 11:23 pm

Implicit in the title The Origin Of Species is the question: why the plural? In other words, why isn’t there a singular apex species which dominates this planet? One can imagine an abstract system where natural selection slowly but gradually sifts through variation and designs a best-of-all-replicators. And yet on the contrary it seems that our planet has exhibited an overall tendency of going from lower to higher diversity. The age of stromatolites may be the last epoch when we had the best-of-all-replicators.


These sorts of deep questions about variation drive many of the research projects in evolutionary biology. Often one focuses on a narrow zone of interest. An organism for example which might serve as an illustrative model for more general processes. Or, a particular dynamic which interlocks with other processes to form a whole phenomenon. But on occasion you have to sit and ponder the whole shebang. Why genetic diversity? More specifically, why not more diversity of genetic diversity? The issue here is what is sometimes termed Lewontin’s paradox.

Consider two populations. One population goes through an extreme bottleneck, while the other maintains a large population over the generations. What would you presume in regards to ...

July 8, 2012

The wages of a life science Ph.D. (not high!)

Filed under: Culture,Evolutionary Genetics,Graduate School — Razib Khan @ 5:31 pm

A few people have emailed me about this article in The Washington Post, U.S. pushes for more scientists, but the jobs aren’t there. Other people cover this area well (for example), so I’m not going to say much. But first, ignore the article in the paper, and read the original survey which the article is based on: Science & Engineering Labor Force.


What the newspaper article added in terms of value was interviewing a small number of people. This is fine I suppose, but it adds no real substantive value, because you can’t really obtain a representative sample. Additionally, if you look at the employment data in the PDF I link to above you see that though things aren’t peachy for Ph.D.s, they are often far better than for people with less education. In other words you can’t just compare a science Ph.D. to some idealized full-employment world with 100% job satisfaction. In the real world everyone has to hustle now, and often it is better to hustle with a doctorate than not. What the PDF attached does illustrate is that the cost of forgone wages probably hits life science Ph.D.s in particular. The perpetual postdoc ...

June 25, 2012

Sleeping like a Neandertal

Forgot to highlight one of the coolest abstracts from SMBE 2012, A genomewide map of Neandertal ancestry in modern humans:

2. The map allows us to identify Neandertal alleles that have been the target of selection since introgression. We identified over 100 regions in which the frequency of Neandertal ancestry is extremely unlikely under a model of neutral evolution. The highest frequency region on chromosome 4 has a frequency of Neandertal ancestry of about 85% in Europe and overlaps CLOCK, a key gene in Circadian function in mammals. The high frequency, Neandertal-derived variant is specific to Europeans; it is not very common in East Asians. This gene has been found in other selection scans in Eurasian populations, but has never before been linked to Neandertal gene flow


One of the predictions of assimilation of a large intrusive population with a small but long endemic population is that there will be biased representation of adaptive alleles from the latter into the former. In other words, if genome-wide admixture is on the order of 5% from the latter into the former, alleles which confer local fitness benefits will be present in the descendants of the asymmetric admixture in proportions of out ...

April 8, 2012

What do you think about group selection?

Filed under: Evolutionary Genetics,Group Selection — Razib Khan @ 8:50 pm

I just received a review copy of E. O. Wilson’s The Social Conquest of Earth. One of the reasons why this book is “hot” is that Wilson has recently been revisiting the “levels of selection” debates, and significantly downgraded kin selection in the pantheon of evolutionary dynamics (at least in his mind). There has been a lot of talk on the blogs about Wilson’s ideas, in large part because of his partisan position on the Nowak vs. most other biologists debate, in favor of Nowak.

I don’t know if I’ll have time to review the book (a reality I honestly explained already to the people working at the publisher), but, it did get me thinking: what are the opinions of biologists in relation to group selection? My personal experience is that opinions actually vary by discipline and by department. It’s hard to get a real sense, because people tend to be in their own “bubble.” With that in mind, I’ve put together a small survey to assess opinions. My core audience here are people who consider themselves biologists, though I can’t prevent someone with strong opinions from participating obviously!

So, a survey on group selection. You can see the ...

Another look at mtDNA

The new article in The American Journal of Human Genetics, A “Copernican” Reassessment of the Human Mitochondrial DNA Tree from its Root, is open access, so you should check it out. The discussion gets to the heart of the matter:

Supported by a consensus of many colleagues and after a few years of hesitation, we have reached the conclusion that on the verge of the deep-sequencing revolution…when perhaps tens of thousands of additional complete mtDNA sequences are expected to be generated over the next few years, the principal change we suggest cannot be postponed any longer: an ancestral rather than a “phylogenetically peripheral” and modern mitogenome from Europe should serve as the epicenter of the human mtDNA reference system. Inevitably, the proposed change could raise some temporary inconveniences. For this reason, we provide tables and software to aid data transition.

What we propose is much more than a mere clerical change. We use the Ptolemaian geocentric versus Copernican heliocentric systems as a metaphor. And the metaphor extends further: as the acceptance of the heliocentric system circumvented epicycles in the orbits of planets, switching the mtDNA reference to an ancestral RSRS will end an academically inadmissible conjuncture where virtually all mitochondrial genome ...

January 16, 2012

The milkmen

Dienekes and Maju have both commented on a new paper which looked at the likelihood of lactase persistence in Neolithic remains from Spain, but I thought I would comment on it as well. The paper is: Low prevalence of lactase persistence in Neolithic South-West Europe. The location is on the fringes of the modern Basque country, while the time frame is ~3000 BC. Table 3 shows the major result:

Lactase persistence is a dominant trait. That means any individual with at least one copy of the T allele is persistent. As Maju noted a peculiarity here is that the genotypes are not in Hardy-Weinberg Equilibrium. Specifically, there is an excess of homozygotes. Using the SJAPL location as a potentially random mating scenario you should expect ~7 T/C genotypes, not 2. Interestingly the persistent individual in the Longar location is also a homozygote.


HWE makes a few assumptions. For example, no selection, migration, mutation, or assortative mating. Deviation from HWE is suggestive of one of these dynamics. The sample size here is small, but the deviation is not to be dismissed. Recall that lactase persistence has dominant inheritance patterns. If the trait was being positively selected for you would only need one copy. The enrichment of homozygotes is unexpected if selection in situ is occurring here. It can not be ruled out that one is observing the admixture of two distinct populations. One generation of random mating would generate HWE, but when populations hybridize in realistic scenarios this is not always a plausible assumption. Rather, assortative mating often persists over the generations, slowing down the diminishing of population substructure.
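For readers who want to see how an expected heterozygote count falls out of HWE, here is a rough Python sketch. The genotype counts in it are invented purely for illustration and are NOT the values from Table 3 of the paper; the function names are also my own.

```python
def hwe_expected_counts(n_tt, n_tc, n_cc):
    """Expected genotype counts under HWE, given observed genotype counts."""
    n = n_tt + n_tc + n_cc
    if n == 0:
        raise ValueError("need at least one genotyped individual")
    # Each individual carries two alleles, so there are 2n alleles in total.
    freq_t = (2 * n_tt + n_tc) / (2 * n)
    freq_c = 1.0 - freq_t
    return {
        "T/T": n * freq_t ** 2,
        "T/C": 2 * n * freq_t * freq_c,
        "C/C": n * freq_c ** 2,
    }


def heterozygote_deficit(n_tt, n_tc, n_cc):
    """F = 1 - observed/expected heterozygotes; F > 0 means too few T/C."""
    expected_het = hwe_expected_counts(n_tt, n_tc, n_cc)["T/C"]
    if expected_het == 0:
        return 0.0  # only one allele present, no heterozygotes possible
    return 1.0 - n_tc / expected_het


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
observed = {"n_tt": 10, "n_tc": 20, "n_cc": 70}
print(hwe_expected_counts(**observed))
print(f"F = {heterozygote_deficit(**observed):.3f}")
```

A positive F is what you would see with an excess of homozygotes, whether the cause is assortative mating, unmixed population substructure, or simply sampling noise in a small sample, so a formal test would be needed before reading much into it.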

Stepping back from speculation, what can we say in this case? First, the LCT locus has a large mutational target. The trait of lactase persistence has arisen multiple times via different mutational events across the Old World. But there does seem to be one particular variant which is found from Spain to Northern India. There is some circumstantial evidence that the allele had its origin somewhere in Central Eurasia, but currently its modal frequency is in Northern Europe, Scandinavia and Germany. The region in the genome around this mutation is characterized by a very long haplotype. It is one of the most definitive candidate loci for natural selection in the human genome. There is now a fair amount of ancient DNA evidence that lactase persistence in Europe is a feature of the last ~5,000 years or so. Among the modern Basques the frequency of the allele is 66 percent.

For me the key issue is teasing apart the role of migration and selection in each specific case. It does not seem plausible that the frequency of the -13910T LCT allele in Basques and Punjabis simply reflects their degree of recent common ancestry. That implies that natural selection is at work at this locus. On the other hand, the haplotype which is present in both Basques and Punjabis is likely to be descended from a common set of individuals, implying that there is a genealogical chain connecting these two very distinct and distant Eurasian populations. Therefore, we can potentially make some inferences about the power of migration in spreading distinctive alleles. Often we partition selection from genealogical information, because selection so often serves to distort the signal. But the genealogical patterns may lie at the heart of the distribution of different natural selective events at the LCT locus.

Overall, I would say that the results from ancient DNA are disordering and clouding simple elegant models. One hopes and presumes that as sample sizes increase in this domain we’ll start to see more clarity as new paradigms crystallize.

Citation: European Journal of Human Genetics, 10.1038/ejhg.2011.254

January 8, 2012

James F. Crow profile

Filed under: Evolutionary Genetics,James F. Crow — Razib Khan @ 10:28 pm

The University of Wisconsin-Madison has a long piece up on the late James F. Crow. Much recommended.
