Razib Khan One-stop-shopping for all of my content

December 21, 2012

The causes of evolutionary genetics

A few days ago I was browsing Haldane’s Sieve, when I stumbled upon an amusing discussion which arose on its “About” page. This “inside baseball” banter got me thinking about my own intellectual evolution. Over the past few years I’ve been delving more deeply into phylogenetics and phylogeography, enabled by the rise of genomics, the proliferation of ‘big data,’ and accessible software packages. This entailed an opportunity cost: I did not spend much time on classical population and evolutionary genetic questions. Strewn about my room are various textbooks and monographs I’ve collected over the years, which have fed my intellectual growth. But I must admit that it is a rare day now that I browse Hartl and Clark or The Genetical Theory of Natural Selection without specific aim or mercenary intent.

R. A. Fisher

Like a river inexorably coursing over a floodplain, with the turning of the new year it is now time to take a great bend, and double back to my roots, such as they are. This is one reason that I am now reading The Founders of Evolutionary Genetics. Fisher, Wright, and Haldane are like old friends, faded, but not forgotten, while Muller was always but a passing acquaintance. But ideas 100 years old still have power to drive us to explore deep questions which remain unresolved, but where new methods and techniques may shed greater light. A study of the past does not allow us to make wise choices which can determine the future with any certitude, but it may at least increase the luminosity of the tools which we have to illuminate the depths of the darkness. The shape of nature may become just a bit less opaque through our various endeavors.

Figure from “Directional Positive Selection on an Allele of Arbitrary Dominance”, Teshima KM, Przeworski M

So what of this sieve of Haldane? As noted at Haldane’s Sieve, the concept is simple. Imagine two mutations, one which expresses a trait in a recessive fashion, and another in a dominant one. The sieve operates by favoring dominantly expressed variants as they emerge out of the low-frequency zone where stochastic forces predominate (i.e., even if an allele confers a large fitness benefit, at low frequencies the power of random chance may still imply that it is highly likely to go extinct). An example of this would be lactase persistence, which in the modal Eurasian variant seems to exhibit dominance. In the converse case, beneficial mutations which are recessive in expression suffer from a structural problem: their benefit is more theoretical than realized.
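The low-frequency lottery described above can be made concrete with a toy simulation. What follows is only a minimal sketch, not anything from the literature cited here: it assumes a Wright-Fisher population of fixed size, a single new mutant copy, and fitnesses of 1, 1 + hs, and 1 + s for the three genotypes, where the dominance coefficient `h` and all of the function names are illustrative choices of mine.

```python
import random


def next_frequency(q, s, h):
    """Expected frequency of the favored allele after one round of selection.

    Genotype fitnesses (an assumption of this sketch):
    wild-type homozygote 1, heterozygote 1 + h*s, mutant homozygote 1 + s.
    """
    p = 1.0 - q
    mean_fitness = q * q * (1 + s) + 2 * p * q * (1 + h * s) + p * p
    return (q * q * (1 + s) + p * q * (1 + h * s)) / mean_fitness


def fixes(n_diploid, s, h, rng):
    """Follow one new mutant copy until it is lost or fixed; True if fixed."""
    copies = 2 * n_diploid
    count = 1  # a single new mutation
    while 0 < count < copies:
        q = next_frequency(count / copies, s, h)
        # Binomial sampling of the next generation's gene copies (drift).
        count = sum(rng.random() < q for _ in range(copies))
    return count == copies


def fixation_fraction(n_diploid, s, h, replicates, seed):
    """Fraction of independent replicates in which the mutant fixes."""
    rng = random.Random(seed)
    hits = sum(fixes(n_diploid, s, h, rng) for _ in range(replicates))
    return hits / replicates


if __name__ == "__main__":
    # Same selective benefit, different dominance (h = 1 dominant, h = 0 recessive).
    for label, h in (("dominant", 1.0), ("recessive", 0.0)):
        frac = fixation_fraction(n_diploid=50, s=0.1, h=h, replicates=200, seed=1)
        print(label, frac)
```

With the same selective benefit, the dominant mutant should escape loss far more often than the recessive one, which is the sieve in miniature. The exact fractions depend on the seed and on the (small, arbitrary) population size used here.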

The mathematics of this is exceedingly simple, a consequence of the Hardy-Weinberg dynamics of diploid random mating organisms. Let’s use the gene which is implicated in variation in lactase persistence as an example, LCT. Consider two alleles, LP and LNP, where the former confers persistence (one can digest lactose sugar as an adult), and the latter manifests the conventional mammalian ‘wild type’ (the production of lactase ceases as one leaves the life stage when nursing is feasible). LP is clearly the novel mutant. In a small population it is not unimaginable that by random chance the frequency of LP rises to ~10%. What now? At HWE you have:

p² + 2pq + q² = 1, where q is the frequency of the LP allele and p that of LNP. At ~10% the numbers substituted would be:

(0.90)² + 2(0.90)(0.10) + (0.10)² = 0.81 + 0.18 + 0.01 = 1

This is where dominance or recessive expression is highly relevant. The reality is that LP is a dominant trait. So in this population the frequency of LP as a trait would be:

(0.10)² + 2(0.90)(0.10) = 0.19, or 19%

Now imagine a model where LP is favored, but it expresses in a recessive fashion. Then the frequency of the trait would equal q², the homozygote LP-allele proportion. That is, 1%. Though population genetics is often constructed on an algebraic foundation, the results lend themselves to intuition. A structural parameter endogenous to the genetic system, dominant or recessive expression, can have longstanding consequences in terms of the likely trajectory of the alleles. Selection only “sees” the trait, so a recessive trait with sterling qualities may as well be a trait with no qualities. In contrast, a dominantly expressed allele can cut like a scythe through a population, because every copy “counts.”
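For those who prefer to see the arithmetic run, here is a small sketch of the calculation above. It assumes Hardy-Weinberg proportions and nothing else; the function names are mine, chosen only for illustration.

```python
def genotype_frequencies(q):
    """Hardy-Weinberg proportions (LNP/LNP, LNP/LP, LP/LP) for LP allele frequency q."""
    p = 1.0 - q
    return p * p, 2.0 * p * q, q * q


def trait_frequency(q, dominant):
    """Share of individuals showing the LP trait.

    If LP is dominant, heterozygotes and LP homozygotes both show it;
    if LP is recessive, only LP homozygotes do.
    """
    _, heterozygotes, lp_homozygotes = genotype_frequencies(q)
    return heterozygotes + lp_homozygotes if dominant else lp_homozygotes


q = 0.10
print(f"{trait_frequency(q, dominant=True):.2%}")   # 19.00%
print(f"{trait_frequency(q, dominant=False):.2%}")  # 1.00%
```

The same allele frequency yields a nineteen-fold difference in how many individuals selection can actually “see.”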

In preparation for this post I revisited the section on Haldane’s Sieve in the encyclopedic Elements of Evolutionary Genetics. The authors note that this phenomenon, though of vintage character as these things can be reckoned in a field as young as evolutionary genetics, is still a live one. The dominance of favored mutations in wild populations, or the recessive character of deleterious ones in laboratory stock, may reflect the different regimes which these two gene pools are subject to. The nature of things is such that it is easier to generate recessive mutations than dominant ones (i.e., loss is easier than gain), so the preponderance of dominant variants in wild stocks subject to positive selective pressure lends credence to the idea that evolutionary rather than developmental forces and constraints shape the genetic character of many species.

And yet things are not quite so tidy. Haldane’s Sieve, and the framework of dominant versus recessive alleles, operates differently in the area of sex chromosomes. In many lineages there is a ‘heterogametic sex’ which carries only one functional copy of most genes on the sex chromosome. In mammals this is the male (XY), while in birds this is the female (ZW). As mammalian males have only one functional copy of most genes on the sex chromosome, the masking effect of recessive expression does not apply to them. This may imply that because of the exposure of many deleterious recessive variants to natural selection within the heterogametic sex one would see different allelic distributions and genetic landscapes on these chromosomes (e.g., more rapid adaptation because of the exposure of nominally recessive alleles in the heterogametic sex, as well as more purifying selection on deleterious variants). But the reality is more complex, and the literature in this area is somewhat muddled. More precisely, it seems phylogenetically sensitive. Validation of the theory in mammals founders once one moves to Drosophila.
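The exposure argument can be put in back-of-the-envelope form. This is a rough sketch under assumptions I am adding for illustration: a fully recessive allele, random mating, a mammal-style XY system with an equal sex ratio (so one third of X chromosomes sit in males), and the same allele frequency in both sexes. The function names are hypothetical.

```python
def exposed_fraction_autosomal(q):
    """Chance that a given copy of a recessive autosomal allele sits in a homozygote."""
    return q


def exposed_fraction_x_linked(q, male_share=1.0 / 3.0):
    """Chance that a given copy of a recessive X-linked allele is expressed.

    Copies in hemizygous males are always expressed; copies in females are
    expressed only when paired with a second copy (probability q).
    male_share is the fraction of X chromosomes carried by males, assumed
    to be 1/3 under an equal sex ratio.
    """
    return male_share + (1.0 - male_share) * q


q = 0.01
print(round(exposed_fraction_autosomal(q), 4))  # 0.01
print(round(exposed_fraction_x_linked(q), 4))   # 0.34
```

At a frequency of 1%, a recessive allele on an autosome is visible to selection in about one copy out of a hundred, while on the X roughly a third of copies are exposed. Whether real genomes behave this neatly is, as noted above, another matter.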

And that is why research in evolutionary genetics continues. The theory stimulates empirical exploration, and is tested against it. Much of the formal theory of classical evolutionary genetics, which crystallized in the years before World War II, is now gaining renewed relevance because of empirical testability in the era of big data and big computation. This is a domain where the past is not simply of interest to historians. Scientists themselves, chasing the next grant, and producing the expected stream of publications, may benefit from a little historical perspective by standing upon the shoulders of giants.

December 18, 2012

Buddy, can you spare some ascertainment?

The above map shows the population coverage for the Geno 2.0 SNP-chip, put out by the Genographic Project. Their paper outlining the utility of and rationale for the chip is now out on arXiv. I saw this map last summer, when Spencer Wells hosted a webinar on the launch of Geno 2.0, and it was the aspect which really jumped out at me. The number of markers that they have on this chip is modest, only >100,000 on the autosome, with a few tens of thousands more on the X, Y, and mtDNA. In contrast, the Axiom® Genome-Wide Human Origins 1 Array Plate being used by Patterson et al. has ~600,000 SNPs. But as is clear from the map above Geno 2.0 is ascertained in many more populations than the other comparable chips (Human Origins 1 Array uses 12 populations). It’s obvious that if you are only catching variation in a few populations, all the extra million markers may not give you much bang for the buck (not to mention the biases that may introduce in your population genetic and phylogenetic inferences).

To the left is the list of populations against which the Human Origins 1 Array was ascertained, and they look rather comprehensive to me. In contrast, for Geno 2.0 ‘ancestrally informative markers’ were ascertained on 450 populations. The ultimate question for me is this: is all the extra ascertainment on diverse and obscure groups worth it? On first inspection Geno 2.0’s number of SNPs looks modest as I stated, but in my experience when you quality control and merge different panels together you are often left with only a few hundred thousand SNPs in any case. 100-200,000 SNPs is also sufficient to elucidate relationships even in genetically homogeneous regions such as Europe in my experience (it’s more than enough for model-based clustering, and seems to be overkill for MDS or PCA). One issue that jumps out at me about the Affymetrix chip is that it is ascertained toward the antipodes. In contrast, Geno 2.0 takes into account the Eurasian heartland. I suspect, for example, that Geno 2.0 would be better for population or ancestry assignment for South Asians because it would have more informative markers for those populations.

Ultimately I can’t really say much more until I use both marker sets in different and similar contexts. Since Geno 2.0 consciously excludes many functional and medically relevant SNPs its utility is primarily in the domain of demographics and history. If the populations in question are well covered by the Human Origins 1 Array, I see no reason why one shouldn’t go with it. Not only does it have more information about biological function, but the number of markers is many fold greater. On the other hand, Geno 2.0 may be more useful on the “blank zones” of the Affy chip. Hopefully the Genographic Project results paper for Geno 2.0 will come out soon and I can pull down their data set and play with it.

Cite: arXiv:1212.4116

Unveiling the genealogical lattice

To understand nature in all its complexity we have to cut the riotous variety down to size. For ease of comprehension we formalize with math, verbalize with analogies, and visualize with representations. These approximations of reality are not reality, but when we look through the glass darkly they give us filaments of essential insight. Dalton’s model of the atom is false in important details (e.g., fundamental particles turn out to be divisible into quarks), but it still has conceptual utility.

Likewise, the phylogenetic trees popularized by L. L. Cavalli-Sforza in The History and Geography of Human Genes are still useful in understanding the shape of the human demographic past. But it seems that the bifurcating model of the tree must now be strongly tinted by the shades of reticulation. In a stylized sense inter-specific phylogenies, which assume the approximate truth of the biological species concept (i.e., little gene flow across lineages), mislead us when we think of the phylogeny of species on the microevolutionary scale of population genetics. On an intra-specific scale gene flow is not just a nuisance parameter in the model, it is an essential phenomenon which must be accommodated into the framework.

This is on my mind because of the emergence of packages such as TreeMix and AdmixTools. Using software such as these on the numerous public data sets allows one to perceive the reality of admixture, and overlay lateral gene flow upon the tree as a natural expectation. But perhaps a deeper result is that the character of the tree itself is torn asunder. The figure above is from a new paper, Efficient moment-based inference of admixture parameters and sources of gene flow, which debuts MixMapper. The authors bring a lot of mathematical heft to their exposition, and I can’t say I follow all of it (though some of the details are very similar to Pickrell et al.’s). But in short it seems that in comparison to TreeMix, MixMapper allows for more powerful inference on a narrower set of populations, selected for exploring very specific questions. In contrast, TreeMix explores the whole landscape with minimal supervision. Having used the latter I can testify that that is true.

The big result from MixMapper is that it extends the result of Patterson et al., and confirms that modern Europeans seem to be an admixture between a “north Eurasian” population, and a vague “west Eurasian” population. Importantly, they find evidence of admixture in Sardinians, which implies that Patterson et al.’s original analyses were not sensitive to admixture in putative reference populations (note that Patterson is a coauthor on this paper as well). The rub, as noted in the paper, is that it is difficult to estimate admixture when you don’t have “pure” ancestral reference populations. And yet here the takeaway for me is that we may need to rethink our whole conception of pure ancestral populations, and imagine a human phylogenetic tree as a series of lattices in eternal flux, with admixed nodes periodically expanding so as to generate the artifice of a diversifying tree. The closer we look, the more likely it seems that most of the populations which have undergone demographic expansion in the past 10,000 years are also the products of admixture. Any story of the past 10,000 years, and likely the past 100,000 years, must give space at the center of the narrative arc to lateral gene flow across populations.

Cite: arXiv:1212.2555 [q-bio.PE]

November 11, 2012

The Genographic Project’s Scientific Grants Program

While I was at Spencer Wells’ poster at ASHG I was primarily curious about bar plots. He’s got really good spatial coverage, so I’m moderately excited about the paper (though I didn’t see much explicit testing of phylogenetic hypotheses, which I think this sort of paper has to do now; we’re beyond PCA and bar plots only papers). That being said, Spencer was more interested in me promoting the Scientific Grants Program. Here’s some more information:

The Genographic Project’s Scientific Grants Program awards grants on a rolling basis for projects that focus on studying the history of the human species utilizing innovative anthropological genetic tools. The variety of projects supported by the scientific grants will aim to construct our ancient migratory and demographic history while developing a better understanding of the phylogeographic structure of world populations. Sample research topics could include subjects like the origin and spread of the Indo-European languages, genetic insights into Papua New Guinea’s high linguistic diversity, the number and routes of migrations out of Africa, the origin of the Inca, or the genetic impact of the spread of maize agriculture in the Americas.

Recipients will typically be population geneticists, students, linguists, and other researchers or scientists interested ...

Reflections on the evolution at ASHG 2012

As most readers know I was at ASHG 2012. I’m going to divide this post in half. First, the generalities of the meeting. And second, specific posters, etc.


- Life Technologies/Ion Torrent apparently hires d-bag bros to represent them at conferences. The poster people were fine, but the guys manning the Ion Torrent Bus were total jackasses if they thought it would be funny/amusing/etc. Human resources acumen is not always a reflection of technological chops, but I sure don’t expect organizational competence if they (HR) thought it was smart to hire guys who thought (the d-bags) it would be amusing to alienate a selection of conference goers at ASHG. Go Affy & Illumina!

- Speaking of sequencing, there were some young companies trying to pitch technologies which will solve the problem of lack of long reads. I’m hopeful, but after the Pacific Biosciences fiasco of the late 2000s, I don’t think there’s a point in putting hopes on any given firm.

- I walked the poster hall, read the titles, and at least skimmed all 3,000+ posters’ abstracts. No surprise that genomics was all over the place. But perhaps a moderate ...

August 28, 2012

Evolutionary & population genetics preprints – Haldane’s Sieve

OK, perhaps I can help with that. Dr. Coop speaks of the collaboration between himself & Dr. Joseph Pickrell, Haldane’s Sieve, which I added to my RSS days ago (and you can see me pushing it to my Pinboard). From the “About”:

As described above, most posts to Haldane’s Sieve will be basic descriptions of relevant preprints, with little to no commentary. All posts will have comment sections where discussion of the papers will be welcome. A second type of post will be detailed comments on a preprint of particular interest to a contributor. These posts could take the style of a journal review, or may simply be some brief comments. We hope they will provide useful feedback to the authors of the preprint. Finally, there will be posts by authors of preprints in which they describe their work and place it in broader context.

We ask the commenters to remember that by submitting articles to preprint servers the authors (often biologists) are taking a somewhat unusual step. Therefore, comments should be phrased in a constructive manner to aid the authors.

It might be helpful if other evolution/genetics bloggers ...

August 14, 2012

Neanderthal admixture & the ecology of academe

Yesterday I pointed out that David Reich had a moderately dismissive attitude toward the new paper in PNAS, Effect of ancient population structure on the degree of polymorphism shared between modern human populations and ancient hominins. Here’s what Reich said:

…But Reich believes that the discussion would have been different if it had happened in the open. The PNAS paper questioning the Neanderthal admixture addresses issues swirling around two years ago, but not Reich and Slatkin’s latest work. “It’s been an issue for several years. They were right to work on this,” says Reich. But now, “it’s kind of an obsolete paper,” he says.

Here’s what Nick Patterson, Reich’s colleague told me via email:

Ancient structure in Africa was considered when we wrote the Green et al. paper, and we were aware that this could explain D-statistics. But the hypothesis is no longer viable as the major explanation of Neandertal genetics in Eurasia. This was discussed in the recent paper of Yang et al. (MBE, 2012). (Not referenced by the PNAS paper).

A very simple argument, that convinces me, is that the allelic frequency spectrum of Neandertal alleles in Eurasia falls off very quickly. A bottleneck flattens out the ...

June 24, 2012

SMBE 2012

Dienekes has summaries up of human-related abstracts of Society for Molecular Biology & Evolution 2012.

1) Remember these are not papers, and some of the abstracts may never become papers, at least in recognizable form

2) Speaking of which, Estimating a date of mixture of ancestral South Asian populations:

Linguistic and genetic studies have demonstrated that almost all groups in South Asia today descend from a mixture of two highly divergent populations: Ancestral North Indians (ANI) related to Central Asians, Middle Easterners and Europeans, and Ancestral South Indians (ASI) not related to any populations outside the Indian subcontinent. ANI and ASI have been estimated to have diverged from a common ancestor as much as 60,000 years ago, but the date of the ANI-ASI mixture is unknown. Here we analyze data from about 60 South Asian groups to estimate that major ANI-ASI mixture occurred 1,200-4,000 years ago. Some mixture may also be older—beyond the time we can query using admixture linkage disequilibrium—since it is universal throughout the subcontinent: present in every group speaking Indo-European or Dravidian languages, in all caste levels, and in primitive tribes. After the ANI-ASI mixture that occurred ...

April 29, 2012

Pygmies: “old” populations, and a new “look” (?)

Over the years one issue that crops up repeatedly in human evolutionary genetics and paleoanthropology (or more precisely, the popular exposition of the topics in the media) is the idea that “population X are the most ancient Y.” X will always refer to a population within a larger set, Y, which is defined by relative marginalization or retention of older cultural folkways. So, for example, I have seen it said that the Andaman Islanders are the “most ancient Asian population.” Why? The standard model for a while now has been that non-Africans derive from a line of Africans which left the ancestral continent 50 to 100 thousand years ago, and began to diversify. Presumably Andaman Islanders have ancestry which goes back to this original dispersion, just as Europeans and Chinese do (revisions which suggest that Aboriginals may have been part of an earlier wave still put the Andamanese in the second wave). The reason that the Andaman populations are termed ancient is pretty straightforward: they’re Asia’s last hunter-gatherers, literally chucking spears at outsiders. An ancient lifestyle gets conflated with ancient genetics.

This is a much bigger problem with the ...

April 8, 2012

Another look at mtDNA

The new article in The American Journal of Human Genetics, A “Copernican” Reassessment of the Human Mitochondrial DNA Tree from its Root, is open access, so you should check it out. The discussion gets to the heart of the matter:

Supported by a consensus of many colleagues and after a few years of hesitation, we have reached the conclusion that on the verge of the deep-sequencing revolution…when perhaps tens of thousands of additional complete mtDNA sequences are expected to be generated over the next few years, the principal change we suggest cannot be postponed any longer: an ancestral rather than a “phylogenetically peripheral” and modern mitogenome from Europe should serve as the epicenter of the human mtDNA reference system. Inevitably, the proposed change could raise some temporary inconveniences. For this reason, we provide tables and software to aid data transition.

What we propose is much more than a mere clerical change. We use the Ptolemaian geocentric versus Copernican heliocentric systems as a metaphor. And the metaphor extends further: as the acceptance of the heliocentric system circumvented epicycles in the orbits of planets, switching the mtDNA reference to an ancestral RSRS will end an academically inadmissible conjuncture where virtually all mitochondrial genome ...

March 26, 2012

The evolution of the human face

The face is an important aspect of our phenotype. So important that facial recognition is one of many innate reflexive cognitive competencies. By this, I mean that you can recognize a face in a gestalt manner, just like you can recognize a set of three marbles. You don’t have to think about it in a step-by-step fashion. Particular types of brain injuries can actually result in disablement of this faculty, and a minority of humans seem to lack it altogether at birth (prosopagnosia). That’s why I’ve long been interested in the genetic architecture and evolution of craniofacial traits. I long ago knew the potential range of pigmentation phenotypes for my daughter because both her parents have been genotyped, but when it comes to facial features we’re stuck with the old ‘blending inheritance’ heuristic. The most obvious importance of teasing apart the genetic architecture of craniofacial traits is forensics. It might not put the sketch artist out of a job, but it would be an excellent supplement to problematic eye witness reports.

But it isn’t just forensics. The issue has evolutionary relevance. It looks like, in terms of morphology, our own lineage had a lot of diversity up until recently. I’m thinking in particular of the ‘archaic’ looking humans recently discovered in China and Nigeria, who seem to have persisted down into the Holocene. More generally, humans as a whole have become more gracile over the last 10,000 years. Why? There are two extreme answers we can look to. First, gracile humans have replaced robust humans. Second, natural selection for gracility has resulted in the in situ evolution of many populations over the last ~10,000 years. An interesting aspect of this is that it looks as if many salient traits have been targets of selection, and therefore evolution and population differentiation.

Here are the top 10 SNPs which deviate from the overall phylogenetic tree of population relationships in the HGDP data set:


SNP          Chr   Nearest gene      Phenotype
rs1834640    15    SLC24A5           skin pigmentation
rs260690     2     EDAR              hair morphology
rs10882168   10    CYP26A1/FER1L3    ?
rs4918664    10    CYP26A1/FER1L3    ?
rs2250072    15    SLC24A5           skin pigmentation
rs6583859    10    CYP26A1/FER1L3    ?
rs2384319    2     KIF3C             ?
rs6500380    16    LONP2             ?
rs4497887    2     CNTNAP5           ?
rs9809818    3     FOXP1             ?

There are two things I want to say off the bat. First, a given SNP likely has many phenotypic effects. So the trait that we “see” in terms of its effect may not be the same trait that natural selection “sees.” Second, it is not a surprise that out of the traits that a given variant may affect the physically salient ones stand out; sometimes you do go looking where the light is shining on a dark street. We know that the lighter complexion of East and West Eurasians seems to be due to independent evolutionary events. In other words, they aren’t derived from common ancestry. When it comes to hair form the EDAR locus seems to be responsible for the distinctive characteristics of East Asians, and has been under recent selection.

What does all this have to do with craniofacial traits? Simple: the coarse and “skin deep” traits that physical anthropologists used decades ago to classify human beings have been rather informative to a first approximation of both details of phylogeny and natural selection. I see no reason why craniofacial traits should be any different. Humans have become more gracile, and some human populations seem to have been changing rather rapidly. I am highly skeptical that this is a neutral process. We care a great deal about facial features, and deviation from the norm can be arresting. If there has been change it is either due to population replacement, or selection (it could be a correlated response, or direct selection).

It is with that preamble that I offer up Mark Shriver’s abstract at the Modern Human Genetic Variation symposium:

The genes determining normal-range variation in human faces are arguably some of the most intrinsically interesting and fastest evolving. However, so far, little work has been focused on discovering these genes. Working under the hypothesis that genes causing Mendelian craniofacial dysmorphologies also may be important in determining normal-range facial-feature variation, and that those genes associated with population differences in facial features should have experienced greater levels of evolution (change in allele frequency), we have taken an admixture mapping/selection scan approach to identifying and studying the genes directly affecting facial features. We have applied the methods of automated quasi-landmark analyses, partial least squares regression, and individual genomic ancestry estimates to explore the distribution of facial features across two groups of human populations — West Africans and Europeans. Using three samples of admixed subjects (American; N=159, Brazilian; N=197, and Cape Verdean; N=248) we have modeled facial variation in the parental populations and compared the extent to which estimates of ancestry from the face compare to genomic-ancestry estimates. We also have tested six selection-nominated craniofacial candidate genes for functional effects on facial features using admixture mapping. In objective tests, two of these six genes (FGFR1 and TRPS1) show significant effects on facial features. In addition, human-observer ratings of the similarity between subjects and allele-specific facial morphs show the same effects for these two genes. Additionally, exaggerated allele-specific morphs based on normal-range variation in these genes recapitulates the syndromic facies of the craniofacial dysmorphologies with which they are associated.

I asked Mark about the nature of these genes and the traits. The paper is coming soon, but he told me that he does not think that the genetic architecture of craniofacial traits is going to be as simple or easy to characterize as pigmentation genes. On the other hand, he’s reportedly capturing 35% of the African vs. European difference with his marker set, so that’s not trivial, and some of the individual loci have a strong enough effect that it’s visible by eye! Also, given the preserved extant diversity within populations (pigmentation genes are often disjoint across Africans and Europeans) he believes that the selection events are recent.

January 24, 2012

When Eve met Creb

The excellent site io9 has a piece up today which is a fascinating indicator of the nature of popular science publications as a lagging indicator. It is a re-post of a piece published last April, How Mitochondrial Eve connected all humanity and rewrote human evolution. In it you have an encapsulation of a particular period in our understanding of human natural history through evolutionary genetics. Notice for example the focus on uniparentally transmitted lineages, mtDNA and Y chromosomes. And the citations on genealogy date to the middle aughts. The science is mostly correct as far as it goes in the details (or at least it is defensible; last I checked there was still debate as to the validity of the molecular clocks used for Y chromosomal lineages), but it misses the big picture of how we’ve reframed our understanding of the human past over the last few years. The distance between 2011 and 2009 is far greater in this sense than between 2009 and 1999 (or even 2009 and 1989!). The io9 piece is a reflection of the era before the paradigmatic rupture.

We are no longer talking just about African mtDNA Eve and her husband Y chromosomal Adam. I’m going to consciously avoid the term “revolutionize,” because the broad outlines of the old story certainly hold. Rather, as we are wont to do it seems that we became a bit too bold with some of our brush strokes, and elided fascinating and subtle elements of the landscape on the margins. There were Crebs, and other assorted Oogas and Boogas. And the painting is not completed yet. As such we can’t really draw any conclusions as to “what it all means,” aside from the fact that it’s fascinating.

Addendum: Someone in the comments observes in relation to a depiction of Eve in the story that “She’s awfully pale for an East African.” This is true on the merits, but the logic is kind of dumb. Why exactly do we think that people ~150,000 years ago looked anything like modern East Africans? It is very likely that Europeans ~35,000 years ago did not look like Daryl Hannah.

January 22, 2012

How the Amhara breathe differently

I have blogged about the genetics of altitude adaptation before. There seem to be three populations in the world which have been subject to very strong natural selection, resulting in physiological differences, in response to the human tendency toward hypoxia. Two of them are relatively well known, the Tibetans and the indigenous people of the Andes. But the highlanders of Ethiopia have been less well studied, nor have they received as much attention. But the capital of Ethiopia, Addis Ababa, is nearly 8,000 feet above sea level!

Another interesting aspect to this phenomenon is that it looks like the three populations respond to adaptive pressures differently. Their physiological response varies. And the more recent work in genomics implies that though there are similarities between the Asian and American populations, there are also differences. This illustrates the evolutionary principle of convergence, where different populations approach the same phenotypic optimum, though by somewhat different means. To my knowledge there has not been as much investigation of the African example. Until now. A new provisional paper in Genome Biology is out, Genetic adaptation to high altitude in the Ethiopian highlands:

We highlight several candidate genes for involvement in high-altitude adaptation in Ethiopia, including CBARA1, VAV3, ARNT2 and THRB. Although most of these genes have not been identified in previous studies of high-altitude Tibetan or Andean population samples, two of these genes (THRB and ARNT2) play a role in the HIF-1 pathway, a pathway implicated in previous work reported in Tibetan and Andean studies. These combined results suggest that adaptation to high altitude arose independently due to convergent evolution in high-altitude Amhara populations in Ethiopia.

The main shortcoming of this paper for me is that it does not highlight the evolutionary history of this adaptation. In the paper the authors compared the Amhara (a highland population) to nearby lowland populations, but did not explore the nature of the population structure and how it might have influenced the arc of adaptation. Are these very ancient adaptations? Or new ones? It seems that hominins have been resident in Ethiopia for millions of years. If this is so presumably there have been adaptations to higher elevations from time immemorial. But what if these adaptations are new?

More pointedly the Ethiopians can be modeled as a compound of an Arabian population with an indigenous East African one. If this is a genuine recent admixture event, then one might be able to ascertain via haplotype structure whether the adaptive variants derive from ancient African genetic variation, or whether they’re novel mutations. It seems that this paper is a good first step, but there’s a lot more to see here….

Citation: Genome Biology, doi:10.1186/gb-2012-13-1-r1

Image credit: Wikipedia

January 16, 2012

The milkmen

Dienekes and Maju have both commented on a new paper which looked at the likelihood of lactase persistence in Neolithic remains from Spain, but I thought I would comment on it as well. The paper is: Low prevalence of lactase persistence in Neolithic South-West Europe. The location is on the fringes of the modern Basque country, while the time frame is ~3000 BC. Table 3 shows the major result:

Lactase persistence is a dominant trait. That means any individual with at least one copy of the T allele is persistent. As Maju noted a peculiarity here is that the genotypes are not in Hardy-Weinberg Equilibrium. Specifically, there is an excess of homozygotes. Using the SJAPL location as a potentially random mating scenario you should expect ~7 T/C genotypes, not 2. Interestingly the persistent individual in the Longar location is also a homozygote.

HWE makes a few assumptions. For example, no selection, migration, mutation, or assortative mating. Deviation from HWE is suggestive of one of these dynamics. The sample size here is small, but the deviation is not to be dismissed. Recall that lactase persistence has dominant inheritance patterns. If the trait was being positively selected for you would only need one copy. The enrichment of homozygotes is unexpected if selection in situ is occurring here. It can not be ruled out that one is observing the admixture of two distinct populations. One generation of random mating would generate HWE, but when populations hybridize in realistic scenarios this is not always a plausible assumption. Rather, assortative mating often persists over the generations, slowing down the diminishing of population substructure.
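To make the Hardy-Weinberg expectation concrete, here is a minimal sketch of the calculation. The genotype counts are illustrative placeholders chosen to show an excess of homozygotes; they are not the figures from Table 3 of the paper, and the function name is my own.

```python
def hwe_expected(n_tt, n_tc, n_cc):
    """Expected genotype counts under Hardy-Weinberg, given observed counts."""
    n = n_tt + n_tc + n_cc
    p = (2 * n_tt + n_tc) / (2 * n)  # frequency of the T allele
    q = 1 - p                        # frequency of the C allele
    return {"T/T": p * p * n, "T/C": 2 * p * q * n, "C/C": q * q * n}

# Hypothetical counts (NOT taken from the paper): few heterozygotes
observed = {"T/T": 4, "T/C": 2, "C/C": 14}
expected = hwe_expected(observed["T/T"], observed["T/C"], observed["C/C"])

for genotype in observed:
    print(genotype, "observed:", observed[genotype],
          "expected:", round(expected[genotype], 2))
```

With so few individuals a formal test has little power, so the comparison of observed and expected heterozygotes is best read as suggestive rather than conclusive.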

Stepping back from speculation in this case what can we say? First, the LCT locus has a large mutational target. The trait of lactase persistence has arisen multiple times via different mutational events across the Old World. But, there does seem to be one particular variant which is found from Spain to Northern India. There is some circumstantial evidence that the allele had its origin somewhere in Central Eurasia, but currently its modal frequency is in Northern Europe, Scandinavia and Germany. The region in the genome around this mutation is characterized by a very long haplotype. It is one of the most definitive loci as a candidate for natural selection in the human genome. There is now a fair amount of ancient DNA evidence that lactase persistence in Europe is a feature of the last ~5,000 years or so. Among the modern Basques the frequency of the allele is 66 percent.

For me the key issue is teasing apart the role of migration and selection in each specific case. It does not seem to be correct that the frequency of the -13910T LCT allele in Basques and Punjabis is reflective of the frequency of recent common ancestry. That implies that natural selection is at work at this locus. On the other hand, the haplotype which is present in both the Basques and Punjabis is likely to be descended from a common set of individuals, implying that there is a genealogical chain connecting these two very distinct and distant Eurasian populations. Therefore, we can potentially make some inferences about the power of migration in spreading distinctive alleles. Often we partition selection from genealogical information, because selection so often serves to distort the signal. But the genealogical patterns may lie at the heart of the distribution of different natural selective events at the LCT locus.

Overall, I would say that the results from ancient DNA are disordering and clouding simple elegant models. One hopes and presumes that as sample sizes increase in this domain we’ll start to see more clarity as new paradigms crystallize.

Citation: European Journal of Human Genetics, 10.1038/ejhg.2011.254

January 14, 2012

Reconstructing a generation unsampled

In the near future I will be analyzing the genotype of an individual where all four grandparents have been typed. But this got me thinking about my own situation: is there a way I could “reconstruct” my own grandparents? None of them are living. The easiest way to type them would be to obtain tissue samples from hospitals. This is not totally implausible, though in this case these would be Bangladeshi hospitals, so they might not have saved samples or even have a good record of them. Another way would be to extract DNA from the burial site. This is not necessarily palatable. But assuming you did this, if you have access to a forensic lab it might be pretty easy (though I think most forensic labs use VNTRs, rather than SNP chips, so I don’t know if they’d touch every chromosome). I’m not sure that the quality would be optimal for more vanilla typing operations, especially for older samples which are likely to be contaminated with a lot of bacteria.

For me the simplest option is to look at relatives. Each of my grandparents happens to have had siblings, so there are many sets of relatives related to just each of those individuals of interest. I also have many cousins, so pooling all the genotypes together and using the information of a pedigree one could ascertain which chromosomal segments are likely to derive from a particular grandparent. To give a concrete example, my mother has a maternal cousin to whom she is quite close. By typing my mother and her cousin one could infer that the segments shared across the two individuals derive from the common maternal grandparents. Of course there’s a problem that cousins have a coefficient of relatedness of only 1/8th, so there is going to be a lot of information missing. But, if you had lots of cousins you could presumably reconstruct the genotypes far better.
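The 1/8 figure for first cousins follows from counting the paths of inheritance through each shared ancestor. A minimal sketch, assuming non-inbred ancestors (the function name is my own):

```python
from fractions import Fraction

def relatedness(path_lengths):
    """Sum (1/2)**L over every path of L meioses that links two relatives
    through a shared ancestor (assumes the ancestors are not inbred)."""
    return sum(Fraction(1, 2) ** length for length in path_lengths)

# First cousins share two grandparents; each connecting path has 4 meioses.
first_cousins = relatedness([4, 4])
# Full siblings share two parents; each connecting path has 2 meioses.
full_siblings = relatedness([2, 2])

print("first cousins:", first_cousins)
print("full siblings:", full_siblings)
```

This is an expectation across the genome. The realized sharing between any two particular cousins varies around it, which is part of why typing many cousins helps.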


But what if you didn’t have any of this? I came up with a crazy idea, and I want to throw it out there to see how crazy it is. The issue from the perspective of you, the individual without grandparental information, is that for either your mother or your father you don’t know which homologous chromosomes come from which parent (your grandparents, their parents). As it happens, everyone has a male parent and a female parent. So if you can assign a chromosomal region as having come from the male, and another as having come from the female, then you can reconstruct some of your grandparents’ genotypes because you know their sexes. How can you make this determination?

Genomic imprinting. This is a phenomenon where genes from a given parent, often of a particular sex, are expressed, while those of the other sex are repressed (often it manifests in terms of methylation or lack of methylation). Therefore, if you have a gene, A, which is usually expressed if inherited from a male parent and repressed if it is inherited from a female parent then the state of that gene within a chromosomal region can be a “tag” for the sex of the parent of origin. With enough of these imprinted genes you can create a mosaic of the genome of the individual in terms of sex of origin. Obviously genomic regions from different sexes are from different parents. If you have enough children of these two parents you should be able to infer the whole genomes of these individuals.

The big reason this probably won’t work is that there just aren’t enough imprinted genes in the human genome. But what do readers think?

January 13, 2012

Between the desert and the sea

Zinedine Zidane, a Kabyle

There is a new paper in PLoS Genetics out which purports to characterize the ancestry of the populations of northern Africa in greater detail. This is important. The HGDP data set does have a North African population, the Mozabites, but it’s not ideal to represent hundreds of millions of people with just one group. The first author on this new paper is Brenna Henn, who was also first author on another paper with a diverse African data set. Importantly the data was posted online. Unfortunately though most of the populations didn’t have too many markers. This isn’t an issue in and of itself, but it becomes a big deal when trying to combine it with other data sets. If you limit the markers to those which intersect across two data sets you start to thin them down a lot, to the point where they’re not useful. Though the results of the paper are worth talking about, the authors claim that they’ll be putting the data online. This is important because they used a large number of markers, so the intersections will be nice (I can, for example, envisage exploring the relationship between the North Africans and the IBS Iberian sample in the near future).

As for the paper itself, Genomic Ancestry of North Africans Supports Back-to-Africa Migrations:

Proposed migrations between North Africa and neighboring regions have included Paleolithic gene flow from the Near East, an Arabic migration across the whole of North Africa 1,400 years ago (ya), and trans-Saharan transport of slaves from sub-Saharan Africa. Historical records, archaeology, and mitochondrial and Y-chromosome DNA have been marshaled in support of one theory or another, but there is little consensus regarding the overall genetic background of North African populations or their origin and expansion. We characterize the patterns of genetic variation in North Africa using ~730,000 single nucleotide polymorphisms from across the genome for seven populations. We observe two distinct, opposite gradients of ancestry: an east-to-west increase in likely autochthonous North African ancestry and an east-to-west decrease in likely Near Eastern Arabic ancestry. The indigenous North African ancestry may have been more common in Berber populations and appears most closely related to populations outside of Africa, but divergence between Maghrebi peoples and Near Eastern/Europeans likely precedes the Holocene (>12,000 ya). We also find significant signatures of sub-Saharan African ancestry that vary substantially among populations. These sub-Saharan ancestries appear to be a recent introduction into North African populations, dating to about 1,200 years ago in southern Morocco and about 750 years ago into Egypt, possibly reflecting the patterns of the trans-Saharan slave trade that occurred during this period.

The model outline here is straightforward:

- A population of West Eurasian provenance migrated across the fringe of the southern Mediterranean >10,000 years B.P. (Maghrebi)

- This was later overlain by a later West Asian migration (Near Eastern)

- A third major element here seems to be Sub-Saharan African admixture, which these authors claim is rather new (post-Roman)

Two of the methods used will be familiar to readers of this weblog. They used ADMIXTURE to generate barplots which fractionate putative ancestral components given K number of components. Second, they also use PCA to visualize the largest components of genetic variation within the samples on a plane.

As you “move up” the K’s you note that Maghrebi populations “split” from the Near Eastern reference, the Qataris. This is supported by the PCA, which shows that there is a dimension of variation which separates Near Easterners & Europeans from Maghrebis. The authors note that this dimension is orthogonal to the Sub-Saharan African vs. Eurasian component. That suggests that the putative Maghrebi component is likely to be part of the set of “Out of Africa” populations, rather than an African population which simply experienced continuous gene flow with West Eurasians.

They also estimate an Fst, a statistic which partitions genetic variation within and between groups. The value between Sub-Saharan Africans and Europeans is ~0.15 using HGDP SNP data, and between Europeans and East Asians ~0.10. Using the Tuscans and Qataris as European and West Asian references against the North African populations along their east-west cline they estimate Fsts from ~0.03 to ~0.06. The higher end values are from populations which are less admixed with Near Eastern elements, and the colored polygons illustrate the domain generated by ADMIXTURE Fsts across inferred ancestral components. You also see in the chart estimated time of divergence. I won’t get into the assumptions in the model, but the authors do note that ~12,000 years B.P. seems to be the low bound estimate for when the Maghrebis diverged from other West Eurasians. This is important, because it predates agriculture.
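For readers who want to see what “partitioning variation within and between groups” means in practice, here is a minimal sketch of Fst at a single biallelic SNP for two equally weighted populations. The allele frequencies are made up for illustration, and this is the textbook heterozygosity-based definition, not necessarily the estimator used in the paper (genome-wide values are typically averaged over many SNPs with sample-size corrections).

```python
def fst_biallelic(p1, p2):
    """Fst = (H_T - H_S) / H_T for one biallelic locus in two populations."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)                       # pooled heterozygosity
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_total - h_within) / h_total if h_total > 0 else 0.0

# Hypothetical allele frequencies, for illustration only
similar = fst_biallelic(0.4, 0.6)    # modest frequency difference
divergent = fst_biallelic(0.2, 0.8)  # large frequency difference

print("similar populations:", round(similar, 3))
print("divergent populations:", round(divergent, 3))
```

The larger the allele frequency difference between the two groups, the larger the share of total variation that lies between rather than within them.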

The final set of methods outlined in this paper looked at ancestry on a more fine-grained genomic scale. To the left you see a plot where each horizontal bar represents an individual’s chromosome 1 (among a set of North Africans). Each color in that bar indicates a component of ancestry (except the black, which are centromeres). This sort of information is important, because saying someone is 50% X and 50% Y summarizes information to the point of eliding it. An individual who is a first generation product of a Chinese-European marriage is going to have the same ancestral proportions as someone who is a Uyghur for those respective populations. But a fine-scale mapping of the genomic ancestry would look very different, because the history of the admixture is very different.
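A toy sketch of that last point: two individuals with identical 50/50 ancestry proportions but very different tract structure. The strings below are invented for illustration (each character is a chromosomal segment labeled by ancestry X or Y, one string per homolog); they are not data from the paper.

```python
from itertools import groupby

def summarize(homologs):
    """Return (fraction of X ancestry, number of contiguous ancestry tracts)."""
    total = sum(len(h) for h in homologs)
    frac_x = sum(h.count("X") for h in homologs) / total
    n_tracts = sum(len(list(groupby(h))) for h in homologs)
    return frac_x, n_tracts

# First-generation individual: one intact chromosome from each parent.
first_gen = ["X" * 20, "Y" * 20]
# Old admixture: recombination has chopped the ancestry into short tracts.
old_admixture = ["XXYYXYYXXYXXYXYYXXYY", "YXXYYXYXXYYXXYXYYXXY"]

print("first generation:", summarize(first_gen))
print("old admixture:", summarize(old_admixture))
```

Both individuals come out at half X and half Y, but the number of tracts differs sharply, which is the information a single ancestry percentage throws away.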

There are many inferences in the paper which I won’t address. Rather, let me focus on this one assertion:

After accounting for putative recent admixture (Figure 1), the indigenous Maghrebi component (k-based) is estimated to have diverged from Near Eastern/Europeans between 18–38 Kya (Figure 3), under a range of Ne and k values. We hence suggest that the ancestral Maghrebi population separated from Near Eastern/Europeans prior to the Holocene, and that the Maghrebi populations do not represent a large-scale demic diffusion of agropastoralists from the Near East.

This is not implausible on the face of it. The component of ancestry modal in the Mozabite HGDP sample tends to have a relatively high Fst in relation to other West Eurasian groups. I had wondered if this was due to ancient Sub-Saharan African admixture which had produced a particular stabilized hybrid, but these results indicate that the component is no closer than other West Eurasians. What I’m confused and skeptical about is the range of divergence times which different papers are producing, which seem somewhat implausible taken together.

There are papers which posit that East Asians separated from Europeans ~25,000 years B.P. This is in the same range as the divergence between Maghrebis and West Eurasians, but the Maghrebi genetic distance (Fst) is about 1/2 as great. Also, these sets of results which generate a “bunching” together of the separation of many extant non-African lineages in the 20-40,000 year range imply very rapid differentiation after the “Out of Africa” event, if that event did occur ~50,000 years ago (at least for most Eurasians, even assuming a revised model whereby Australian Aboriginals derive from an earlier wave). One at a time any given divergence estimate may be broadly plausible, but the literature is just not particularly coherent on this matter, and it often seems archaeologically implausible.
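One way to see why these estimates are hard to reconcile is the simplest drift-only approximation, Fst ≈ 1 − exp(−t / 2Ne), which ties divergence time to effective population size. The sketch below is my own illustration under that assumption; it ignores migration, admixture, and changes in population size, it is not the model used in any of the papers, and the Ne values and 25-year generation time are arbitrary.

```python
import math

def divergence_years(fst, ne, years_per_generation=25):
    """Time since split under pure drift: t = -2 * Ne * ln(1 - Fst) generations."""
    generations = -2 * ne * math.log(1 - fst)
    return generations * years_per_generation

# Hypothetical effective sizes, for illustration only
t_high_fst = divergence_years(0.10, ne=5000)
t_low_fst_same_ne = divergence_years(0.05, ne=5000)
t_low_fst_double_ne = divergence_years(0.05, ne=10000)

print(round(t_high_fst), round(t_low_fst_same_ne), round(t_low_fst_double_ne))
```

Under this toy model, half the Fst at the same Ne implies roughly half the divergence time. To get similar dates from an Fst half as large, one has to assume an effective population size about twice as big, so the dates are only as good as the Ne that goes into them.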

Citation: Henn BM , Botigué LR , Gravel S , Wang W , Brisbin A , et al. 2012 Genomic Ancestry of North Africans Supports Back-to-Africa Migrations. PLoS Genet 8(1): e1002397. doi:10.1371/journal.pgen.1002397

Image Credit: Raphaël Labbé

December 26, 2011

The sons of Adam: spirit, not blood

Hominin increase in cranial capacity, courtesy of Luke Jostins

A few years ago a statistical geneticist at Cambridge’s Sanger Institute, Luke Jostins, posted the chart above using data from fossils on cranial capacity of hominins (the human lineage). As you can see there was a gradual increase in cranial capacity until ~250,000 years before the present, and then a more rapid increase. I should also note that from what I know about the empirical data, mean human cranial capacity peaked around the Last Glacial Maximum. Our brains have been shrinking, even relative to our body sizes (we’re not as large as we were during the Ice Age). But that’s neither here nor there. In the comments Jostins observes:

The data above includes all known Homo skulls, but none of the results change if you exclude the 24 Neandertals. In fact, you see the same results if you exclude Sapiens but keep Neandertals; the trends are pan-Homo, and aren’t confined to a specific lineage….

In other words: the secular increase in cranial capacity for our lineage extends millions of years back into the past, and also shifts laterally to “side-branches” (with our specific terminal node, H. sapiens sapiens, as a reference). This is why I often contend as an aside that humanity was to some extent inevitable. By humanity I do not mean H. sapiens sapiens, the descendants of a subset of African hominins who flourished ~100,000 years before the present, but intelligent and cultural hominins who would inevitably construct a technological civilization. The parallel trends across the different distinct branches of the hominin family tree which Luke Jostins observed indicated to me that our lineage was not special, but simply first. That is, if African hominins were exterminated by aliens ~100,000 years before the present, at some point something akin to H. sapiens sapiens in creativity and rapidity of cultural production would eventually arise (in all likelihood later, but possibly earlier!).

This does not mean that I think humanity was inevitable upon earth. For most of the history of this planet life was unicellular. I do not find it implausible that life on earth may have reached its “sell by” date due to astronomical events before the emergence of complex organisms (in fact, from what I have heard the end of life is going to occur ~1 billion years into the future due to the persistent increase in the energy output of Sol, not ~4 billion years in the future when Sol turns into a red giant). But, once complex organisms arose it does seem that further complexity was inevitable. This was Richard Dawkins’ case in The Ancestor’s Tale based simply on the descriptive record. But did the emergence of complex organisms necessarily entail the evolution of a technological species? I don’t think so. It took 500 million years for that to occur (it does not seem that coal resources formed hundreds of millions of years ago were tapped before humans). Given enough time obviously a technological species would evolve (e.g., extend the time of evaluation to 1 trillion years), but note that the earth has only ~5 billion years. Homo arrived on the scene in the last 20% of that interval.

Here I am positing at a minimum two not excessively likely or inevitable events over a 5 billion year time span which would lead to a hyper-technological and cultural species:

- The emergence of multicellular life

- The emergence of a lineage with the propensities of Homo

Once Homo evolved and expanded outside of Africa I suspect that something of the form of a technological civilization became inevitable on this planet. We see parallelism in our own short post-Pleistocene epoch. Multiple human societies shifted from hunter-gatherers to agriculturalists over the past 10,000 years. The experience of the New World civilizations in particular illustrates that human universal tendencies are real. Not only were “game changing” cultural forms such as agriculture and literacy invented independently during the Holocene, but they were not invented during earlier interglacials (at least in all likelihood).

Khufu, Necho, Augustus and Napoleon

Why not? Well, consider the cultural torpidity of Paleolithic toolkits, which might persist for hundreds of thousands of years! I suspect some of this is due to biology. But even over the Holocene we do perceive that cultural change has proceeded at a more rapid clip as time has progressed (i.e., at a minimum cultural change has been accelerating, and it may be that the rate of acceleration itself is increasing!). Consider that the civilization of ancient Egypt spanned at least 2,000 years. Though there are clear differences, the continuity between Old Kingdom Egypt and the last dynasties before the Assyrian and Persian conquests is very obvious to us, and would be obvious to ancient Egyptians. In contrast, 2,000 years separates us from Augustan Rome. The continuities here are clear as well (e.g., the Roman alphabet), but the cultural change is also clear (if you wish to argue that the early modern and modern period are sui generis, the 1,500 year interval from Augustan Rome to the Neo-Classical Renaissance would still be a stark contrast when compared against an ancient Egyptian reference*, despite the latter’s aping of the forms of the former).

So far I have focused on the vertical dimension of time. But there is also the lateral dimension, of cross-fertilization across the branches of the hominin family tree. The admixture of a Neanderthal element into non-Africans has started to become widely accepted recently, thanks to the confluence of archaeology and genomics in the field of ancient DNA. Even if one rejects the viability of Neanderthal admixture, the solution to the conundrum of these results must still entail stepping away from a simple model of recent exclusive origin of humans from a small African population. There are also hints of admixture with other archaic lineages on the Pacific fringe, and within Africa.

Until recently it was common to posit that modern humans, our own lineage, had some special genius which allowed it to sweep the field and extinguish our cousins. The qualitative result of Luke Jostins’ plot was known: other hominin lineages also exhibited encephalization. In fact, it was a curious fact that Neanderthals on average had larger cranial capacities than anatomically modern humans. But the reality remained that we replaced them, ergo, we must have a special genius. Until the lack of distinction between Neanderthals and modern humans on loci implicated in the necessary (if not sufficient) competency of language was discovered, that trait was a prime candidate for what made “us” special. But now I put “us” in quotation marks. The data do point to an overwhelming descent from an African or near-African population for non-Africans over the past 100,000 years. But the “archaic admixture” is not trivial. What was they are us, and we have become what they might have been.

For over two centuries there has been a debate in the West between monogenesis and polygenesis. The former is the position that humankind derives from one single pair or population (the former a straightforward recapitulation of the standard Abrahamic model). The latter is the position that different races of humans derive from different proto-humans, or, for the Christian polygenists, that only Europeans descend from Adam and Eve (the other races being “non-Adamic”). Echoes of this conflict persist down to the present era. Many of the earlier partisans of “Out of Africa” have claimed that the proponents of multiregionalism were latter-day polygenists (not totally without justification in some cases).

But the conflict between monogenism and polygenism is not the appropriate frame for what is being unveiled by reality before our eyes. What we see in the creation of modern humanity is a monogenic base inflected with the flavors of polygenism. Modern humans descend, by and large, from an expansion of an African population over the past 200,000 years. But on the margins there are other strands and filaments of ancestry which tie disparate populations back to lineages which branched off far earlier from the main trunk. At a minimum hundreds of thousands, and perhaps an order of 1 million years, before our own age. Today genomics avails us of the statistical power to extract out these discordant signals from the fluid “Out of Africa” narrative, but I would not be surprised if in the near future we stumble upon more and more “long branches” of less noteworthy quantity. Admixture is likely to be an old and persistent story in the hominin lineage, with only the most recent substantial bouts of separation and hybridization being of notice and curiosity at this moment in time.

What does all this mean? And why have I juxtaposed deep time natural history across the tree of life with inferences of relatively recent paleoanthropology? Let’s start with two propositions:

- Technological civilization, an outward manifestation of radically complex sentience, is not inevitable, though it is probable given certain preconditions (I believe that the existence of Homo increased its probability to ~1.0 over a reasonable time period)

- Radically complex sentience is not the monopoly of a particular exclusive lineage which accrues its genius from a particular specific forebear

John Farrell has pointed out the possible issues that the Roman Catholic church may have with the new model of human origins. But the Catholic church is only but a reflection of more general human strain of thought. Descent-groups, whether real or fictive, loom large in the human imagination. The evolutionary rationale for this is not too hard to explain, but we co-opt the importance of kinship in many different domains. Like evolution, human cultural forms simply take what is already present, and retrofit and modify elements to taste.

So why are humans special? And why do humans have inalienable rights? Many of us may not agree with the proposition that we are the descendants of Adam and Eve, and therefore we were granted the divine grace of eternal souls. But a hint of this logic can be found in the assumptions of many thinkers who do not agree with the propositions of the Roman Catholic church. Recently I listened to Sherry Turkle arguing against a reliance on “robot companions” which are able to exhibit the verisimilitude of human emotions for those who may be lacking in companionship (e.g., the aged and infirm). Though Turkle’s arguments were not without foundation, some of her arguments were of the form that “they are not us, they are not real, we are real. And that matters.” This is certainly true now, but will it always be? Who is this “they” and this “we”? And what does “real” mean? Are emotions a mysterious human quality, which will remain outside of the grasp of those who do not descend from Adam, literal or metaphorical?

If there arises a point where non-human sentience is a reality, do they have the same rights as we? Though the difference is radical in terms of quantity to some extent I think we know the answer: they are human by the way they are, not by the way their ancestors were. The “taint” of admixture with diverse lineages across the present human tree of life has not resulted in an updating of our understanding of human rights. That is because the idea that we are all the children of Adam, or the descendants of mitochondrial Eve, is a post facto justification for our understanding of what the rights of humanity are, and what humanity is. And what it is is a particular ecological niche, a way of being, not beings who descend in a line of biological relationship from a particular person or persons.

* The cultural fundamentals of Old Kingdom Egypt arguably persisted in a living fossil form in the temple at Philae down to the 6th century A.D.! Therefore, a 3,500 year lineage of literary continuity.

Image credits: all public domain images from Wikipedia

November 26, 2011

How the worm turns the genic world

In the middle years of the last decade there were many papers which came out which reported many ‘hard’ selective sweeps reshaping the human genome. By this, I mean that you had a novel mutation arise against the genetic background, and positive selection rapidly increased the frequency of that mutation. Because of the power and rapidity of the sweep many of the flanking regions of the genome would “hitchhike” along, generating long homogenized regions of linkage disequilibrium. If that’s a little dense for you, just understand that very strong selective events tend to result in disorder and distinctiveness in the local genomic region.

But the late aughts and the early years of the teens are shaping up to give us a more subtle picture. Instead of classic hard sweeps, researchers are suggesting that there may also be many ‘soft’ sweeps, where selection draws upon the well of standing genic variation. Instead of a novel trait becoming prominent, one tail of the distribution would rise in frequency. The ‘problem’ with this model is that it’s not as tractable as the earlier one of hard sweeps, and selection on quantitative traits with many loci of small effect is more difficult to detect. Its effect on the genome is more subtle and understated, which means that statistical tests often lack the power to grasp onto the underlying dynamics. Naturally this means that there is an extension of statistical techniques to ever greater degrees of sophistication. A new paper in PLoS Genetics attempting to tease apart the various potential selective pressures in the human genome is reflective of that tendency. Signatures of Environmental Genetic Adaptation Pinpoint Pathogens as the Main Selective Pressure through Human Evolution:

Previous genome-wide scans of positive natural selection in humans have identified a number of non-neutrally evolving genes that play important roles in skin pigmentation, metabolism, or immune function. Recent studies have also shown that a genome-wide pattern of local adaptation can be detected by identifying correlations between patterns of allele frequencies and environmental variables. Despite these observations, the degree to which natural selection is primarily driven by adaptation to local environments, and the role of pathogens or other ecological factors as selective agents, is still under debate. To address this issue, we correlated the spatial allele frequency distribution of a large sample of SNPs from 55 distinct human populations to a set of environmental factors that describe local geographical features such as climate, diet regimes, and pathogen loads. In concordance with previous studies, we detected a significant enrichment of genic SNPs, and particularly non-synonymous SNPs associated with local adaptation. Furthermore, we show that the diversity of the local pathogenic environment is the predominant driver of local adaptation, and that climate, at least as measured here, only plays a relatively minor role. While background demography by far makes the strongest contribution in explaining the genetic variance among populations, we detected about 100 genes which show an unexpectedly strong correlation between allele frequencies and pathogenic environment, after correcting for demography. Conversely, for diet regimes and climatic conditions, no genes show a similar correlation between the environmental factor and allele frequencies. This result is validated using low-coverage sequencing data for multiple populations. 
Among the loci targeted by pathogen-driven selection, we found an enrichment of genes associated to autoimmune diseases, such as celiac disease, type 1 diabetes, and multiple sclerosis, which lends credence to the hypothesis that some susceptibility alleles for autoimmune diseases may be maintained in human population due to past selective processes.

The authors utilized “Projection to Latent Structure multiple regression with an Uninformative Variable Elimination algorithm (UVE-PLS).” I know what multiple regression is, and the general logic which underpins the family of such methods. But I don’t really know what UVE-PLS is in its specifics, so I can’t speak with any intelligence on this technical issue. I assume that, as with most multiple regression, the authors are attempting to tease apart the predictive power of various independent variables upon a dependent variable. In this case, the dependent variable happens to be the pattern of genetic variation, the single nucleotide polymorphisms (SNPs). It isn’t surprising that the biggest predictor of variation happens to be demographic relationship. That is, adjacent populations with recent common ancestors are going to share more genetic variants than those which are distant. The key is to control for this confound, and then see how genes vary according to other factors.
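To make the logic of “control for the confound” concrete, here is a minimal sketch. It is emphatically not UVE-PLS and not the authors’ pipeline; it is plain least-squares residualization on made-up data. The variable names (demography, pathogen, freq), the effect sizes, and the sample size are all assumptions chosen for illustration, and the “allele frequency” is a stylized standardized quantity rather than a real frequency.

```python
import random

def ols_slope_intercept(x, y):
    """Simple one-predictor least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residuals(x, y):
    """What is left of y after removing its linear dependence on x."""
    slope, intercept = ols_slope_intercept(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 2000                # hypothetical number of (population, SNP) observations

# Hypothetical demographic axis (e.g. position along a shared-ancestry gradient).
demography = [rng.gauss(0, 1) for _ in range(n)]
# Hypothetical pathogen load that happens to covary with demography.
pathogen = [0.8 * d + rng.gauss(0, 0.6) for d in demography]
# Stylized allele-frequency signal: mostly demography, plus a small
# true pathogen effect of 0.2 (an assumption of this toy example).
freq = [1.0 * d + 0.2 * p + rng.gauss(0, 0.3)
        for d, p in zip(demography, pathogen)]

# Naive fit: ignores demography and badly overstates the pathogen effect.
naive_slope, _ = ols_slope_intercept(pathogen, freq)

# Adjusted fit: strip demography out of both variables, then regress
# what remains (the standard partialling-out trick).
adjusted_slope, _ = ols_slope_intercept(
    residuals(demography, pathogen),
    residuals(demography, freq),
)

print(f"naive slope:    {naive_slope:.2f}")
print(f"adjusted slope: {adjusted_slope:.2f}")
```

In this toy setup the naive slope lands near 1 while the adjusted slope recovers something close to the built-in 0.2, which is the sense in which demography has to be removed before any environmental signal can be read.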

In this analysis they found that diet and climate seem to be less important than genes relating to immune response to pathogens, in particular those implicated in response to parasitic worms. Why worms? The argument they give is that these organisms are not quite so protean as bacteria and viruses, and also tend to be somewhat localized. Their relative sluggishness in adaptation means that humans presumably have some fighting chance in developing defenses, and their spatial stability also implies that human adaptations can differentiate nicely as a function of geography, as may be the case in genes which are targets of local selection. I’m not quite sure about this idea that we’ve been able to adapt to parasitic worms better though. Rather, I just wonder if human adaptations to viruses and bacteria are simply not easily detectable by these methods. Or, as implied in the piece, it may be that these are less locally conditioned, so you see a whole host of generalized adaptations which aren’t geographically constrained.

This is obviously not going to be the last word by any means. They focused on the data sets that were available and computationally manageable in 2011. Over the next 10 years researchers will be combing whole genomes of many individuals in many populations. They’ll come back with gold. It seems a foregone conclusion that loci implicated in response to pathogens are going to be rich candidates for bouts of natural selection. What is perhaps going to be more interesting is the question of what other traits are shaped by natural selection. The unequivocal list is rather short right now. Lactose tolerance, pigmentation, malaria, etc. It’s bound to get longer. The question is now how much longer….

Citation: Fumagalli M, Sironi M, Pozzoli U, Ferrer-Admetlla A, Pattini L, et al. (2011) Signatures of Environmental Genetic Adaptation Pinpoint Pathogens as the Main Selective Pressure through Human Evolution. PLoS Genet 7(11): e1002355. doi:10.1371/journal.pgen.1002355

November 5, 2011

How Archimedes’ lever explains human evolution

Last August I had a post up, The point mutation which made humanity, which suggested that it may be wrong to conceive of the difference between Neanderthals and the African humans which absorbed and replaced them ~35,000 years ago as a matter of extreme differences at specific genes. I was prompted to this line of thinking by Svante Pääbo’s admission that he and his colleagues were searching for locations in the modern human genome which differed a great deal from Neanderthals as a way through which we might understand what makes us distinctively human. This sort of method has a long pedigree. Much of the past generation of chimpanzee genetics and now genomics has focused on finding the magic essence which differentiates us from our closest living relatives. Because of our perception of massive phenotypic differences between H. sapiens and Pan troglodytes the 95-99% sequence level identity is thought by some to be perplexing. Therefore models have emerged which appeal to gene regulation and expression, or perhaps other forms of variation such as copy number, to clear up how it can be that chimpanzees and humans differ so much. Setting aside that the perception of difference probably has some anthropocentric bias (i.e., would an alien think that chimpanzees and humans are actually surprisingly different in light of their phylogenetic similarities? I’m not so sure), it doesn’t seem to be unreasonable on the face of it to plumb the depths of the genomes of hominids so as to ascertain the source of their phenotypic differentiation.

But can this model work for differentiating different hominin lineages? Obviously there’s going to be a quantitative difference. The separation between chimpanzees and modern humans is on the order of 5 million years. The separation between Neanderthals and modern humans (or at least the African ancestors of modern humans ~50,000 years B.P.) is on the order of 500,000 years. An order of magnitude difference should make us reconsider, I think, the plausibility of fixed differences between two populations explaining phenotypic differences.


Backing up for a moment, why do we think there might be fixed differences between Neanderthals and modern humans? The argument, as outlined in books like The Dawn of Human Culture, is that H. sapiens sapiens is a very special lineage, whose protean cultural flexibility allowed it to sweep the field of all other hominin sister lineages. The likelihood of some admixture from these “dead end” lineages aside, this rough model seems to stand the test of time. Consider that the Mousterian technology persisted for nearly 300,000 years, while the Oldowan persisted for 1 million! In contrast, our own species seems to switch and improve cultural styles much, much faster. Behavioral modernity does point to a real phenomenon. The hypothesis of many scholars was that there was a genetic difference which allowed for modern humans to manifest language as we understand it in all its diversity and flexibility. The likelihood of this seems lower now that modern humans and Neanderthals are known to have the same variants of FOXP2, the locus which seems to be correlated with elevated vocal and auditory capabilities across many vertebrate lineages. And, if it is correct that ~2.5% or so of modern human ancestry in Eurasia, and nearly ~10% in Papua, comes from “archaic” lineages, then I think that should reduce our estimates of how different these humans were from the Africans.

Therefore you can posit two stylized scenarios of contrasts between Neanderthals/modern humans and chimpanzees/modern humans. In one model the difference between the two comparisons is fundamentally of degree. Neanderthals and chimpanzees are still disjoint from modern humans. That is, there’s no overlap in the traits. But Neanderthals are far closer to modern humans, as would be expected from the phylogenetic relationship. Another model though is that Neanderthals and modern humans did differ, but there was a great deal of overlap. This is a model with qualitative differences from that of chimpanzees vs. humans. If the second model is correct, and I think with all that we know from the Neanderthal genome project we should take it more seriously, then looking for disjoint pairwise differences in allele frequencies is not the way to go in understanding how the two human lineages diverged in phenotype.

In the second model, where there is a great deal of overlap, there is still a difference in the tails of the distribution. The idea I had in mind with my earlier post was that it is at these tails that the differences between Neanderthals and modern humans will be found when it comes to cultural differences. I think one might wonder where the Michelangelos or Bachs of the Neanderthals were, but then one has to observe that the vast majority of modern humans are not Michelangelos or Bachs! One of the primary indications of the transition to behavioral modernity is the proliferation of symbolism. But are we to presume that every member of an ancient Paleolithic tribe was equally capable of creative virtuosity? I think likely not. It could be that in fact the vast majority of “modern humans” are no different from all Neanderthals in the sort of things we might expect to be different across the two lineages. Rather, it may be that a small minority of modern humans crossed a particular threshold at the edge of the distribution of the phenotype, and when that transition was made the world was never the same.
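A back-of-the-envelope sketch shows how two heavily overlapping populations can still differ sharply at the tail. Everything in it is an assumption for illustration: the trait is treated as normally distributed, the two means are set half a standard deviation apart, and the threshold sits at three standard deviations. None of these numbers come from any data on Neanderthals or modern humans.

```python
from statistics import NormalDist

# Hypothetical trait distributions, in standard-deviation units.
pop_a = NormalDist(mu=0.0, sigma=1.0)   # stand-in for one lineage
pop_b = NormalDist(mu=0.5, sigma=1.0)   # stand-in for the other lineage

# Shared area under the two curves (1.0 would mean identical distributions).
overlap = pop_a.overlap(pop_b)

# Hypothetical "threshold" far out on the right tail.
threshold = 3.0
frac_a = 1.0 - pop_a.cdf(threshold)   # share of pop_a beyond the threshold
frac_b = 1.0 - pop_b.cdf(threshold)   # share of pop_b beyond the threshold
tail_ratio = frac_b / frac_a

print(f"overlap of the two distributions: {overlap:.0%}")
print(f"fraction beyond threshold, A: {frac_a:.4%}")
print(f"fraction beyond threshold, B: {frac_b:.4%}")
print(f"tail ratio B/A: {tail_ratio:.1f}")
```

Under these made-up parameters the two distributions share roughly four fifths of their area, yet the second population has several times as many individuals past the threshold. That is the shape of the argument here: the typical individuals are nearly indistinguishable, and the difference lives at the edge.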

Julius Caesar

I’m not proposing here that the victory of African humans ~50,000 years ago was due to artists. What I’m proposing is that at some point a critical mass of exceptional individuals arose. These individuals were possessed of peculiar characteristics, but instead of these characteristics making them outcasts, the qualities which they possessed were seen by their fellow humans as marks of greatness. In short, they were the children of gods among men.

Or perhaps demons. Men such as Alexander, Napoleon, and Hitler were possessed of peculiar charisma, but whether they were good or evil is a matter of dispute and perspective. The point is not that they achieved greatness, but that they were the catalysts for a great number of events. As charismatic leaders they took collections of human beings, and turned them to their purpose. Individual humans became more than the sum of their parts, and for moments exhibited almost organismic levels of cohesion. Though the number one predictive variable in who won wars in the pre-modern world is the simple one of numbers, organization and structure also mattered. The Roman legion operating in a testudo formation could beat off the attacks of more numerous barbarians who were physically more robust on a per person basis because the unit exhibited synergy, and translated cohesion into efficient collective action. This does not occur bottom up, but requires a personality type, a genius, to serve as the nexus or locus.

The model I have in mind then is one where the African humans faced up against their near relations, but not as one against one. Rather, under the guidance of charismatic leaders, Paleolithic megalomaniacs driven by fervid nightmares and irrational dreams, they ground through the many enemies who fought as sums of singulars as a cohesive social machine. It was not because they were superior on a per unit basis, but because they were superior on a per tribe basis, driven by individuals who turned the many to their own ambitions. With the lever of superior social organization the few moved the world, and swept over it. How many insane voyages were there east over the horizon from Sundaland before one tribe finally made landfall in Sahul? How many tribes perished in the ice of the far north, before some finally made it to Beringia? Why did humans look over the horizon, and venture out across the black waters? Perhaps just because they could. This answer is likely confusing and disquieting to many alive today, and perhaps it was disquieting to the more reasonable and level-headed “archaics” who were confronted with the zealous organizational insanity of the African humans who were rolling over all opposition. But these insane individuals still move among us today, and they are still the objects of curiosity, fear, and adulation.

Is this a crazy model? Yes, somewhat. But is it really any more crazy than the model that there is a mutation which can encapsulate all that differentiates man from beast-man? I think not.

October 25, 2011

The perils of human genomics

A friend pointed me to the heated comment section of this article in Nature, Rebuilding the genome of a hidden ethnicity. The issue is that Nature originally stated that the Taino, the native people of Puerto Rico, were extinct. That resulted in an avalanche of angry comments, which one of the researchers, Carlos Bustamante, felt he had to address. Eventually Nature updated their text:

CORRECTED: This article originally stated that the Taíno were extinct, which is incorrect. Nature apologizes for the offence caused, and has corrected the text to better explain the research project described.

Here’s Wikipedia on the Taino today:

Heritage groups, such as the Jatibonicu Taíno Tribal Nation of Boriken, Puerto Rico (1970), the Taíno Nation of the Antilles (1993), the United Confederation of Taíno People (1998) and El Pueblo Guatu Ma-Cu A Boriken Puerto Rico (2000), have been established to foster Taíno culture. However, it is controversial as to whether these Heritage Groups represent Taíno Culture accurately as some Taino groups are known to ‘adopt’ other native traditions (mainly North American Indian). Many aspects of Taino culture has been lost to time and or blended with Spaniard and African culture on the Caribbean Islands. Peoples who claim to be of native descent in the islands of Puerto Rico, Hispaniola and Eastern Cuba attempt to maintain some form of cultural connection with their historic identities. Antonio de Moya, a Dominican educator, wrote in 1993, “the [Indian] genocide is the big lie of our history… the Dominican Taínos continue to live, 500 years after European contact.”

One of the ways that Taino activists now use to strengthen interest and identity is by the creation of two unique scripts. The scripts are used to write Spanish, not a retained language from pre-Columbian ancestors. The organization Guaka-kú teaches and uses their script among their own members, but the LGTK (Liga Guakía Taína-ké) has promoted their script among elementary and middle school students to strengthen their interest in Taino identity.

It is undeniable that the Amerindian ancestry found in the Caribbean probably derives from that pre-Columbian population. And it may be that there are cultural forms which exhibit unbroken continuity. But it seems that the modern Taino are a re-precipitation out of a cultural milieu whose Amerindian self-identity had gone extinct. By analogy, Argentines have about the same proportion of Amerindian ancestry as Puerto Ricans on a population-wide basis. In fact, over 90% of the Amerindian distinctive ancestry in Argentina is not found in self-identified Amerindians (who do continue to exist as a minority, especially in the South). But to my knowledge for various cultural reasons there has not been a groundswell to shift the Argentine self-conception from being a European settler nation to a mestizo nation, let alone individuals declaring themselves Amerindian.

In comparison to the possibilities which are opened up in this case, the issue of Aboriginal genomics looks rather cut & dried. I suppose we would laugh if some people decided to “reclaim” their Neandertal heritage, but there’s a huge corpus of paleoanthropological scholarship which these individuals could draw upon to reconstruct their identities as Neandertals. It might sound ludicrous, but this is a world where a lot happens that you wouldn’t expect.
