Razib Khan One-stop-shopping for all of my content

June 12, 2011

You are a mutant!

The Pith: You are expected to have 30 new mutations which differentiate you from your parents. But, there is wiggle room around this number, and you may have more or fewer. This number may vary across siblings, and explain differences across siblings. Additionally, previously used estimates of mutation rates may have been too high by a factor of 2. This may push the “last common ancestor” of many human and human-related lineages back by a factor of 2 in terms of time.

There’s a new letter in Nature Genetics on de novo mutations in humans which is sending the headline writers in the press into a natural frenzy trying to “hook” the results into the X-Men franchise. I implicitly assume most people understand that they all have new genetic mutations specific and identifiable to them. The important issue in relation to “mutants” as commonly understood is that they have salient identifiable phenotypes, not that they have subtle genetic variants which are invisible to us. Another implicit aspect is that phenotypes are an accurate signal or representation of high underlying mutational load. In other words, if you can see that someone is weird in ...

March 9, 2011

Your genes, your rights – FDA’s Jeffrey Shuren misleading testimony under oath

Filed under: 23andMe,FDA,Genetics,Genomics,Jeffrey Shuren,Select Post — Razib Khan @ 12:05 pm

Over the past few days I’ve been very disturbed…and angry. The reason is that I’ve been reading Misha Angrist and Dr. Daniel MacArthur. First, watch this video:

In the very near future you may be forced to go through a “professional” to get access to your genetic information. Professionals who will be well paid to “interpret” a complex morass of statistical data which they barely comprehend. Let’s be real here: someone who regularly reads this blog (or Dr. Daniel MacArthur or Misha’s blog) knows much more about genomics than 99% of medical doctors. And yet someone reading this blog does not have the guild certification in the eyes of the government to “appropriately” understand their own genetic information. Someone reading this blog will have to pay, either out of pocket, or through insurance, someone else for access to their own information. Let me repeat: the government and professional guilds which exist to defend the financial interests of their members are proposing that they arbitrate what you can know about your genome. A friend with a background in genomics emailed me today: “If they succeed in ramming this through, then you will not be able to access your ...

February 23, 2011

Sweeping through a fly’s genome

Credit: Karl Magnacca

The Pith: In this post I review some findings of patterns of natural selection within the Drosophila fruit fly genome. I relate them to very similar findings, though in the opposite direction, in human genomics. Different forms of natural selection and their impact on the structure of the genome are also spotlighted over the course of the review. In particular, I explore how specific methods to detect adaptation on the genomic level may be biased by the assumptions of classical evolutionary genetic models. Finally, I try to place these details in the broader framework of how best to understand evolutionary process in the “big picture.”

A few days ago I titled a post “The evolution of man is no cartoon”. The reason I titled it such is that as the methods become more refined and our data sets more robust it seems that previously held models of how humans evolved, and evolution’s impact on our genomes, are being refined. Evolutionary genetics at its most elegantly spare can be reduced down to several general parameters. Drift, selection, migration, etc. Exogenous phenomena such as the flux in census size, or ...

January 24, 2011

The genomic heritage of French Canadians

Image Credit: Anirudh Koul

One of the great things about the mass personal genomic revolution is that it allows people to have direct access to their own information. This is important for the more than 90% of the human population which has sketchy genealogical records. But even with genealogical records there are often omissions and biases in transmission of information. This is one reason that HAP, Dodecad, and Eurogenes BGA are so interesting: they combine what people already know with scientific genealogy. This intersection can often be very inferentially fruitful.

But what about if you had a whole population with rich robust conventional genealogical records? Combined with the power of the new genomics you could really crank up the level of insight. Where to find these records? A reason that Jewish genetics is so useful and interesting is that there is often a relative dearth of records when it comes to the lineages of American Ashkenazi Jews. Many American Jews even today are often sketchy about the region of the “Old Country” from which their forebears arrived. Jews have been interesting from a genetic perspective ...

December 23, 2010

The paradigm is dead, long live the paradigm!

Mitochondrial DNA and human evolution:

Mitochondrial DNAs from 147 people, drawn from five geographic populations, have been analysed by restriction mapping. All these mitochondrial DNAs stem from one woman who is postulated to have lived about 200,000 years ago, probably in Africa. All the populations examined except the African population have multiple origins, implying that each area was colonised repeatedly.

And so was published in the year 1987 the paper which established in the public’s mind the idea of mitochondrial Eve, which gave rise to a famous cover photo in Newsweek. This also led to the Children of Eve episode on the PBS documentary NOVA. Here is the summary:

NOVA examines a controversial theory that traces our ancestry to a small group of women living in Africa 300,000 years ago.

As Milford Wolpoff has complained, it is probably accurate to characterize the documentary as not particularly “fair & balanced.” Mitochondrial Eve may have been controversial, and subsequently plagued by issues of molecular clock calibration as well as spurious interpretations of the cladograms, but the tide of history was on its side, and PBS was telling that story. And the story was not just the primary science, rather, one had to understand the controversy in light of the debates among paleontologists and between paleontologists and molecular biologists. A group of researchers, spearheaded by Chris Stringer, argued for the recent origin of modern humans from Africa on the basis of fossils alone. They were challenged by an established school of multiregionalists who argued for deeper roots of modern human populations, which derived from local hominins which diversified after the migration of H. erectus out of Africa. The argument of the multiregionalists was that selective sweeps across the full range of the human populations gave rise gradually to modern humanity as we know it, a compound of specific ancient local features and trans-population characters which unified us into a broader whole. Stringer and company presented a simpler model where anatomically modern human beings arose ~200,000 years ago in Africa, and subsequently expanded to other parts of the world, by and large replacing the local hominin populations. In the multiregionalist telling Neandertals became human beings, while Out of Africa would imply that Neandertals were replaced by human beings.

Into this tendentious landscape of bones stepped the molecular biologists. The critical figure here is Allan Wilson, who in the 1970s argued forcefully from molecular clock evidence for a more recent separation of the human and ape lineage than paleontologists had favored. By the 1980s the paleontologists had generally conceded that Wilson et al. were correct. After this victory he put forward the mitochondrial Eve theory with his student Rebecca Cann. Here Wilson was getting involved with an argument about paleontology. From all the material I’ve read Wilson and Cann were confident that their techniques were superior to old fashioned analysis of fossils, a method which Wolpoff defended vociferously on NOVA. People who were not invested in recent human origins often did not know what to make of the debate. To give you a flavor of what was going on in the late 1980s, here’s Richard Leakey in Origins Reconsidered: In Search of What Makes Us Human:

……In the 1970s, I had been more reluctant than most to accept Wilson and Sarich’s genetic evidence in favor of a recent (five million years ago) origin of hominids, so I thought this would be a chance to redress the balance. In the course of my talk I mentioned the mitochondrial DNA evidence and indicated that “I was ready to be persuaded by it.” Surrounded as I was by molecular biologists and geneticists, I imagined it would be a wise thing to do, and scientifically proper too.

I was therefore more than a little surprised when, in the bar after my talk, several participants, including the conference organizer, Stephen O’Brien, cornered me and said, “You don’t have to swallow the Mitochondrial Eve line. We don’t.” Steve and his friends proceeded to tell me why they thought the Eve hypothesis was incorrect…Wilson may have miscalculated the rate of the mitochondrial clock, older mitochondria may have been lost by chance, promoted perhaps by occasional crashes in local population size, natural selection may have favored some recently evolved mitochondrial variant, thus eliminating the older lineages. Any of these possibilities might erroneously leave the impression of a recently emerged population….

…In February 1990, Milford and a half a dozen like-minded colleagues organized a session at the annual gathering of the American Association for the Advancement of Science, in New Orleans, the goal of which was to “nail this Mitochondrial Eve nonsense.” Speaker after speaker argued for evidence in support of regional continuity and against localized speciation; for alternative interpretations…It was a powerful presentation, and gathered a lot of press, with headlines like “Scientists Attack ‘Eve’ Theory of Human Evolution” and “Man Does not Owe Everything to Eve, Latest Finding Says.” Chris Stringer, who was speaking at a different session of the meeting, described the anti-Eve seminar as “high-powered salesmanship.” One of Milford’s assault team, David Frayer of the University of Kansas, summarized the deep reaction to Wilson’s work: “Fossils are the real evidence.”

In the 1990s Wolpoff came out with a book, Race and Human Evolution: A Fatal Attraction. It outlined a multiregionalist framework for the origin of modern humans, and also presented a wide ranging review of human paleoanthropology past to present, and, to my eyes made the case that the multiregionalists were on the “right side of history.” I was, and remain, a natural history nerd. Especially a natural history nerd of the human species. I devoured books on the topic in the 1980s and 1990s, and saw the slow shift away from multiregionalism toward an Out of Africa model as the orthodoxy, as transmitted by scientific journalists. As I did not have any horse in the race, it was not a matter of concern either way for me, but, I did observe that the disagreements were personal and sometimes politicized. Race and Human Evolution seems to have been written in part to debunk the idea that multiregionalism gave succor to racism. Rather, Wolpoff inverted the narrative, presenting Out of Africa models as genocidal and exterminationist, in contrast to his model of human populations gliding toward sapiency together through gene flow.

The flip side of course is that many people presented Out of Africa as anti-racist par excellence. Anatomically modern humans were portrayed as the latter day Julius Caesars of the hominin world. They came, they saw, and they conquered. The chasm between humans and non-humans may have been wide, but the more appealing aspect of the Out of Africa model is that we were the new kids on the block. All non-African humans derived from Africans, who were the reservoirs of our species’ genetic diversity. The dovetailing of implications of the model with the egalitarian ethos of the age was natural. Here is Pat Shipman in 2003, We Are All Africans:

I don’t expect that the subscribers of the Multiregional hypothesis will be waving a white flag of surrender, although they have lost the great majority of their supporters. At least one of the theory’s most ardent proponents, Wolpoff, is still steadfast in defense of the hypothesis he has so long espoused. While it remains possible that new findings will shift the balance in favor of the Multiregional viewpoint, the consilience of such evidence creates a powerful testament. It would take many new fossils and many new genetic studies to resculpt this intellectual landscape.

By and large the arguments which Shipman lays out were persuasive to someone like me who didn’t know much about bones & stones. Though even I knew of some instances of possible continuity, the mtDNA, Y chromosomal lineages, and autosomal results did seem to roughly line up appropriately. In the battle between paleoanthropologists who saw continuity in the fossils and those who did not, it seemed reasonable at the time to give the “tiebreaker” to the geneticists who were generating inferences consistent with Out of Africa.


With all that said, it has to be stated that paleoanthropologists such as Chris Stringer did not hold necessarily to total replacement of non-Africans. Total replacement may have been the case, but quite often they did qualify that there may have been some admixture and assimilation with the pre-modern substrate. But the paucity of the genetic data pointing to interbreeding between distant lineages (as opposed to a very recent exclusive common ancestry), especially once the Neandertal mtDNA was shown to be an outgroup, seems to have pushed people to the model where modern humans were an entirely different beast which simply wouldn’t have deigned to have intercourse with the creatures of yore. In The Dawn of Human Culture the paleoanthropologist Richard Klein lays out a scholarly and measured argument for what is close to a maximalist case for the unique and distinctive nature of modern neo-African humanity:

……the simplest and most economic explanation for the “dawn” is that it stemmed from a fortuitous mutation that promoted the fully modern human brain….an acknowledged genetic link between anatomy and behavior in yet earlier people persisted until the emergence of fully modern ones and that the postulated genetic change 50,000 years ago fostered the uniquely modern ability to adapt to a remarkable range of natural and social circumstances with little or no physiological change.

Arguably, the last key neural change promoted the modern capacity for rapidly spoken phonemic language, or for what anthropologists Duane Quiatt and Richard Milo have called “a fully vocal language, phonemicized, syntactical, and infinitely open and productive.”

Wolpoff was on to something. Even if the original Out of Africa proponents did not mean to do so, there was a tendency to remove “higher faculties” from the suite of capabilities of the evolutionary “dead ends.” We were H. sapiens sapiens. If we deigned to allow Neandertals to be a branch of our own species, their subspecies was distinctive. They were less than we in the ways in which modern humans were exceptional, and universal.

This orthodoxy probably resulted in a positive feedback loop for the educated public, in which I include myself. The more the Out of Africa model of neo-African human exceptionalism settled into the received wisdom, the more animalized Neandertals and other human lineages became. Naturally a multiregionalist model of continuity became distasteful, because continuity implied a connection between modern humans and subhumans. The fact that the largest cranial capacities in the whole human lineage were sported by Neandertals became a counterintuitive fact, which just went to show that it was quality, not quantity.

When I was a freshman at university I took a biological anthropology course. The instructor threw out a question to the class. He noted that some paleoanthropologists observed a continuity between the skulls of Australian Aborigines and some Southeast Asian erectine populations. Australian Aborigines are a very robust people, and have been less affected by the trend toward gracility which has been the norm over the past 10,000 years for most human populations. In any case, the instructor asked for a show of hands whether such a possibility should even be discussed openly. The solid majority of the class rejected an open discussion. When asked by the instructor why, many of the students who rejected an examination of the thesis argued that such a possibility opened the path to de-humanization, oppression, and was politically too sensitive. Milford Wolpoff had obviously lost the propaganda war. The students did not consider the possibility of multiregionalism where all human populations exhibited continuity, rather, they assumed that continuity hypothesized for Australian Aborigines was specific to them, and so would associate that population with the less human branches of the hominin tree.

Science is a human cultural endeavour. It is about something real, something objective, but we do look through the glass somewhat darkly. The acceptance or rejection of models is contingent upon correspondence to reality and precision of prediction. But the rise and fall of models, and the rate of their rise and fall, may be subject to cultural dynamics. In The Price of Altruism Oren Harman shows how the cultures of Russia and Britain shaped how they viewed the social implications of evolutionary biology. Similarly, Newtonian mechanics and Darwinian evolution may have been retarded in their initial acceptance in France due to reasons of language and national chauvinism.

Not only do scientific theories have to swim through the waters of suspicion and incomprehension across societies, but they also have to overcome the inevitable confounding of their natural inferences with normative ones. Newtonian mechanics, relativity, and quantum mechanics have all had many peculiar and surprising downstream social consequences. The lines drawn between these physical theories and models and sociology, epistemology, and spirituality would likely have surprised their originators (OK, perhaps not Isaac Newton). But the human imagination is fertile, and many cognitive anthropologists argue that the connections and analogies that we make, in addition to our promiscuous pattern recognition, give rise to the baroque and baffling complexity that is culture.

By the mid-2000s the paradigm of Out of Africa had crystallized to such a point that even the fossils purportedly betrayed the multiregionalists. In Bones, Stones and Molecules: “Out of Africa” and Human Origins the authors made the case that the fossil record, and its pattern of variation, complemented the molecular record. That is, Chris Stringer was right. Other more computationally intensive analyses of morphological variation reportedly tended to support an Out of Africa model.

And yet just as Out of Africa seemed to have cleared the field, pointers in the other direction were bubbling up out of genomics and genetics. In 2006 Bruce Lahn at the University of Chicago published Evidence that the adaptive allele of the brain size gene microcephalin introgressed into Homo sapiens from an archaic Homo lineage. Nevertheless several years later there seems to have been no wide support for this hypothesis. For example, No evidence of a Neanderthal contribution to modern human diversity. But there were other papers nonetheless. Deep Haplotype Divergence and Long-Range Linkage Disequilibrium at Xp21.1 Provide Evidence That Humans Descend From a Structured Ancestral Population. Genomics refutes an exclusively African origin of humans. Granted, this was a minority perspective. For the first few years the Neandertal genome project did not seem to support any admixture either. I saw Svante Paabo speak in late 2008, and he was absolutely unequivocal. No sign of admixture. Period.

But the equilibrium of scientific orthodoxy is not eternally robust to a hard exogenous shock of falsification. Yes, some scientists remain obstinate in the face of overwhelming evidence. One could argue Milford Wolpoff could be numbered amongst these. Fred Hoyle certainly was. But the tide turns. In the fall of 2009 Svante Paabo seemed to be far less unequivocal about the issue of admixture. Then, in the spring of 2010:

A test of the New Mexico team’s proposals may come soon. Svante Pääbo and colleagues at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany, announced early last year that they had finished sequencing a first draft of the Neanderthal genome, and they are expected to publish their work in the near future. Pääbo’s earlier studies on components of Neanderthal genomes largely ruled out interbreeding, but they were not based on more comprehensive analyses of the complete genome.

Linda Vigilant, an anthropologist at the Planck Institute, found Joyce’s talk a convincing answer to “subtle deviations” noticed in genetic variation in the Pacific region.

“This information is really helpful,” says Vigilant. “And it’s cool.”

By this point, in April of 2010, some graduate students who were not involved in the project itself had seen hard copy drafts of the Neandertal admixture paper. Word was spreading. I already knew of its likely probability, which resulted in me turning on Google Alerts (which got me in trouble for “breaking embargo” on an embargo which I was never privy to). The hammer-blows against the old tried & true orthodoxy in 2010 were ripening throughout the year, and many people were “in the know.” In the age of transparency it is interesting that science naturally has a culture of some secrecy. Who wants to be scooped? But how sustainable is this really over the long term?

To use a religious analogy which some may find offensive, this was an instance where the heretics were once the high priests of the faith. The media reports from last spring made it clear that most of the principals involved did not initially believe that admixture had occurred. Rather, they assumed that the results they were getting were anomalies. Science is influenced by culture, but ultimately nature remains the final arbiter. The truth is what it is, and honest men and women give it its due.

At this point you presumably know the score. Ancient DNA is a powerful judge and jury. It seems that the evidence for Neandertal admixture is already modifying the conventional Out of Africa narrative. But, it has to be admitted that Out of Africa is predominantly correct. The vast majority of our total genome content seems to be traceable to African populations within the last ~100,000 years. An older model of deep rooted lineages only periodically punctuated by selective sweeps which maintain species cohesiveness is not tenable. Phyletic gradualism seems implausible in light of the genetic evidence. Here is Wolpoff (and his wife, Rachel Caspari) in Race and Human Evolution:

We agree a punctuated evolutionary pattern best describes the evolutionary histories of many phyletic groups, including, we think, the earlier and much longer part of human prehistory when humans were only another African primate species. But we believe punctuated equilibrium does not reflect what happened to humans in the later part of human evolution as they became successful colonizers and when there was no macroevolutionary change. As we read the fossil record, there is no evidence of speciation events in the recent past; in fact, there is strong evidence against them. But the Eve interpretation promised to support a punctuated model for later human evolution that was denied by interpretations of the fossil evidence such as ours.

I’m not knowledgeable enough to know what would qualify as a “macroevolutionary change.” But the ‘Great Leap Forward’ seems a plausible candidate. Whatever the details, between 200 and 10 thousand years ago, there does seem to have been a series of rapid expansions of the human range and capacity for innovation. Something different was in the air. I do not know the nuance of Milford Wolpoff’s thinking. The most recent data do seem to refute the contention that all ancestry but the Out of Africa is trivial. But, they also seem to be broadly in line with the peculiarity, almost revolutionary character, of the changes in the human lineage over the past 200,000 years. Convergent patterns of morphological and genetic variation which seem to root back to an African base indicate that Chris Stringer and Allan Wilson had properly characterized a major first order dynamic in recent human prehistory. But now we move into the second and third orders. The rough paradigm is getting sculpted into something with more verisimilitude when judged against the diversity and peculiarity of nature.

Let’s jump to the paper. The main course. Genetic history of an archaic hominin group from Denisova Cave in Siberia:

Using DNA extracted from a finger bone found in Denisova Cave in southern Siberia, we have sequenced the genome of an archaic hominin to about 1.9-fold coverage. This individual is from a group that shares a common origin with Neanderthals. This population was not involved in the putative gene flow from Neanderthals into Eurasians; however, the data suggest that it contributed 4–6% of its genetic material to the genomes of present-day Melanesians. We designate this hominin population ‘Denisovans’ and suggest that it may have been widespread in Asia during the Late Pleistocene epoch. A tooth found in Denisova Cave carries a mitochondrial genome highly similar to that of the finger bone. This tooth shares no derived morphological features with Neanderthals or modern humans, further indicating that Denisovans have an evolutionary history distinct from Neanderthals and modern humans.

John Hawks has covered a great deal of ground in his FAQ. In particular, he has a gestalt understanding of the fossil record so he can run “quick & dirty” checks on some of their assertions. He notes:

What the paper doesn’t point out is that there are Upper Paleolithic specimens that equal or exceed this tooth in size. For example, the measured length and breadth of an upper second molar from Oase, Romania, are larger than this specimen, and the third molar (in the crypt) of that specimen is yet larger. There is an Upper Paleolithic-associated molar from Turkey which is also exceedingly large.

I don’t take that as a sign of relationship between this specimen and early Upper Paleolithic people — even though these are some of the earliest. It is another sign of how non-diagnostic this tooth actually is. I would say that in the absence of genetic information, we’d be looking at these remains as likely early Upper Paleolithic people, and accentuating these similarities.

People interpret information in light of their background priors. Now that we know what we did not, it may behoove us to go back and double check what we may once have dismissed. Consider this paper from 2006, Archaic admixture in the human genome:

One of the enduring questions in the evolution of our species surrounds the fate of ‘archaic’ forms of Homo. Did Neanderthals go extinct without interbreeding with modern humans 25–40 thousand years ago or are their genes present among modern-day Europeans? Recent work suggests that Neanderthals and an as yet unidentified archaic African population contributed to at least 5% of the modern European and West African gene pools, respectively. Extensive sequencing of Neanderthal and other archaic human nuclear DNA has the potential to answer this question definitively within the next few years.

5% is a nice round number. They could have lucked upon it, but the first author continued to plunge onward in 2009, generating models of archaic admixture. How fruitful would this be? Here is Sarah Tishkoff in December of 2009:

…Sarah Tishkoff, a geneticist at the University of Pennsylvania, agrees, adding that, after all, every population has a strong selective pressure for intelligence, the better to succeed in its respective environment. As far as consorting with Neanderthals, Tishkoff dismisses that notion as pure speculation: “I don’t know of any evidence for that.”

I suspect that Sarah Tishkoff’s opinion would have been common among most scholars of human evolution in late 2009 (though I suspect those who were Facebook friends with people in Svante Paabo’s lab perhaps not). To be fair to Tishkoff, she had no compunction about accepting Neandertal admixture six months later when presented with evidence. She even added that “…it is possible that interbreeding introduced traits into a few human populations.”

In regards to the paper, the top line is rather clear in the three figures in the article proper. I’ve reformatted them a bit below:

Top left: a phylogenetic tree which shows the total genome relationship of various human lineages. Extant modern humans represent one clade. The Denisovans and Neandertals another. In other words, the last common ancestral population of Denisovans and Neandertals is shallower in time than the last common ancestral population of neo-Africans and the Denisovans and Neandertals. All the Neandertals also are very closely related, at least when graded on this particular curve. The Denisovans are outgroups to them, just as the San are outgroups to other humans. The French are an outgroup to the Han and Papuans, though just barely. This sort of relationship is naturally why I cast a skeptical eye to arguments of the common ancestry of French and Han 20,000 years ago when we know that the Papuans settled their island 45,000 years ago.

Top right: a PCA where HGDP populations are projected onto the two largest components of variation which shake out of a data set of a chimpanzee, Denisovan, and Neandertal. In other words, the ones deciding the rules of the game here are chimps and the two archaic Eurasian populations. Humans are constrained onto the genetic variation space of non-/pre-humans. So the position of the humans tells you how they relate to the genetic variation of the Denisovans, Neandertals, and chimpanzees. The Eurasicans, Eurasians + Amerindians, form a relatively tight cluster, apart from Africans. If non-Africans have some Neandertal admixture, this is reasonable. But interestingly the Melanesian groups stand apart as well. And, Papuans and Bougainville Islanders are also distinctive. The latter are shifted toward Eurasicans. Why? Probably because they have a minor, but significant, Austronesian ancestral component which the Papuans lack.

They estimate that 2.5% of the genes of Eurasicans and Oceanians are of Neandertal origin. And, a further 5% of the Melanesian genome is of Denisovan origin. So Melanesians are 92.5% neo-African. Eurasicans are 97.5% neo-African. At most.

Bottom: the last shows a stylized demographic model. Step 1, humans leave Africa. The neo-Africans interbreed with southwest Asian Neandertals. Step 2, the paleo-Eurasians push east, and some encounter the Denisovans, eventually reaching Sahul ~45,000 years ago.

Some people have asked me about the Denisovan in Polynesians and Australian Aborigines. Since Polynesians are ~20% Melanesian, they should have a fraction diluted appropriately. As for Australians, if they are only recently distinguished from the peoples of Papua because of rising sea levels I assume that they should carry the same fraction of Denisovan. Bougainville has always been isolated from Papua by water from what I know. A final question is in regards to Andaman Islanders and other isolated Asian peoples who seem to be hunter-gatherer relics such as the Ainu. Since the Pakistani HGDP populations share a large minor component of ancestry with the Andaman Islanders my assumption is that they should be somewhat deviated toward the Papuans. As the populations are not labeled I do not know if those groups are skewed toward the direction of the Papuans. In the supplements individual outcomes are given for the Han and French, and the Han seem somewhat shifted toward the Bougainville Islanders, though trivially. Additionally, some of the authors of this paper were involved in Reconstructing Indian History, and so I assume had access to Andaman Islander data. I would be curious if they ran some quick checks and decided to stick with the HGDP because there was unlikely to be anything there.
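The percentages above are back-of-the-envelope arithmetic, and a few lines of Python make the bookkeeping explicit. This is only a sketch: it assumes that archaic contributions simply add up and that admixture dilutes linearly with the share of Melanesian ancestry, and the function names are illustrative rather than anything taken from the paper.

```python
# Point estimates quoted in the post (fractions of the genome).
NEANDERTAL = 0.025            # Neandertal share in non-Africans
DENISOVAN_MELANESIAN = 0.05   # additional Denisovan share in Melanesians
MELANESIAN_IN_POLYNESIANS = 0.20  # rough Melanesian ancestry of Polynesians


def neo_african_fraction(*archaic_fractions):
    """Whatever is left after subtracting the archaic contributions.

    Assumption: the archaic components are simply additive.
    """
    return 1.0 - sum(archaic_fractions)


def diluted_fraction(fraction_in_source, source_share):
    """Expected archaic share after linear dilution through a source population."""
    return fraction_in_source * source_share


eurasian = neo_african_fraction(NEANDERTAL)
melanesian = neo_african_fraction(NEANDERTAL, DENISOVAN_MELANESIAN)
polynesian_denisovan = diluted_fraction(DENISOVAN_MELANESIAN,
                                        MELANESIAN_IN_POLYNESIANS)

print(f"Eurasian neo-African share:   {eurasian:.1%}")
print(f"Melanesian neo-African share: {melanesian:.1%}")
print(f"Polynesian Denisovan share:   {polynesian_denisovan:.1%}")
```

Under these assumptions a population that is ~20% Melanesian would be expected to carry roughly a fifth of the Melanesian Denisovan fraction, which is all that "diluted appropriately" means here.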

The main body of the paper is tightly and elegantly written. But there is so much more in the supplements. I have read through them at least once, but I can’t say I understand them very well. They are written with the tight economy of a mathematically minded individual, despite the fact that they run to 90 pages. But much of the text alludes to a “D-statistic” which actually goes back to the earlier Neandertal admixture paper, and its supplement. So let’s go back to that, and review the D-statistic at least cursorily. One might not gain a deep knowledge, but even a superficial knowledge of the technical arcana of these sorts of papers is often useful in my experience. To page 130:

To test whether Neandertals share more alleles with some present-day human populations than with others, we compared the Neandertal sequence that we generated to sequence from present-day human samples of diverse ancestry. Specifically, we discovered single nucleotide polymorphisms (SNPs) by comparing exactly two chromosomes from different individuals (H1 and H2). We then assessed whether a test individual (H3, e.g. Neandertal) tended to match either H1 or H2 more often at sites where H3 has the derived allele relative to chimpanzee. Under the null hypothesis that H3 belongs to an outgroup population, it should match H1 and H2 equally often. In contrast, if gene flow has occurred, H3 may match one more than the other.

Here’s a graphical illustration:

The ancestral state is A, which the chimpanzee (H4, not shown) presumably has. B represents the derived state. That means it has changed via mutation from the ancestral state at some point since the last common ancestor with the outgroup. To calculate the D-statistic you are looking at cases where H3 is B and H4 is naturally A. So you have two sets: BABA and ABBA. You are comparing the counts between these two combinations. If H3 is a clean outgroup to the H1H2 clade, D will be ~0, as BABA and ABBA counts will be approximately equal. In contrast, if there is gene flow to H1 or H2 from H3, D will deviate from ~0. The Z-score is the number of standard deviations away from 0. The table below is from the current paper under consideration. I have highlighted and reformatted:

The D-statistics make sense of what you know verbally. There is some admixture from Neandertals to Eurasicans + Oceanians. Therefore when paired with each other as H1 and H2 they do not deviate from 0 as much as they do when paired with Africans. There is a deviation away from equal ratios of ABBA and BABA because there is putative gene flow from H3 to H1 or H2. Notice the Denisovans. Because they’re like Neandertals they produce some elevated deviation of D from 0, though not as much. Interestingly the maximum Z-scores occur when comparing Denisovans, Melanesians, and Africans. Finally, Melanesians and Eurasicans also result in a deviation from 0 when paired with Denisovans in the H3 position.
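To make the ABBA/BABA bookkeeping concrete, here is a minimal Python sketch. It assumes the common convention D = (nABBA - nBABA) / (nABBA + nBABA); the function names, the toy sites, and the unweighted delete-one block jackknife used for the Z-score are illustrative assumptions, not the authors' actual pipeline.

```python
from math import sqrt

ABBA = ("A", "B", "B", "A")
BABA = ("B", "A", "B", "A")

def d_statistic(sites):
    """D from a list of (H1, H2, H3, outgroup) allele tuples.

    'A' is the ancestral allele, 'B' the derived one. Only ABBA and
    BABA sites are informative; everything else is ignored.
    """
    abba = sum(1 for s in sites if tuple(s) == ABBA)
    baba = sum(1 for s in sites if tuple(s) == BABA)
    if abba + baba == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return (abba - baba) / (abba + baba)

def jackknife_z(blocks):
    """Return (D, Z) using a simple delete-one block jackknife.

    `blocks` is a list of site lists, one per genomic block. The
    unweighted jackknife is an assumption made for brevity.
    """
    n = len(blocks)
    d_full = d_statistic([s for b in blocks for s in b])
    leave_one_out = [
        d_statistic([s for j, b in enumerate(blocks) if j != i for s in b])
        for i in range(n)
    ]
    mean = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((d - mean) ** 2 for d in leave_one_out)
    se = sqrt(variance)
    z = d_full / se if se > 0 else float("nan")
    return d_full, z

# Toy data: three blocks with an excess of ABBA sites (H2 matches H3 more often).
toy_blocks = [
    [ABBA] * 3 + [BABA] * 1,
    [ABBA] * 2 + [BABA] * 2,
    [ABBA] * 4 + [BABA] * 1,
]
d, z = jackknife_z(toy_blocks)
print(f"D = {d:.3f}, Z = {z:.2f}")
```

With equal ABBA and BABA counts the function returns exactly 0, which is the clean-outgroup case described above.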

A quick note from the supplements on ancient population structure. Dienekes does not believe that there was Neandertal admixture necessarily among Eurasicans and Oceanians. From what I can gather he believes that there was population structure within Africa, which is preserved in non-African populations. Rather than exogenous admixture between geographically separated lineages which had only recently met, what one is presumably arguing for here is that there were long term barriers between more closely placed populations in Africa. The authors do not find it parsimonious, though they can not reject it as totally without foundation. Below is a graphical representation of their two models:

So where does this leave us? Yesterday when I said something big was going to drop Ed Brayton expressed some frustration that paleoanthropologists tend to hype stories too much. The reality is everything doesn’t change. The Hobbits, the Darwinius fiasco, and the persistent controversy over Ida, can give anyone fits of human evolution fatigue. But there is a difference here. You don’t need to take their total word for it. At some point you will be able to go to the UCSC genome browser and poke around yourself. Or, you can pull down a 153 MB file with SNPs and indels.

This is a great time to be alive if you’re a hominin natural history nerd. You never know what surprise will greet you when you wake up in the morning. You never know how you’ll have to rearrange your conception of the world. Earlier in the post I mentioned that an instructor once asked a class where I was a student whether scientists should be allowed to talk about the erectine features of Aborigines, if they believed such features existed. You probably won’t be surprised that I said that such things shouldn’t be off limits if they seemed true. Obviously science has political implications. It is idealistic and philosophically consistent to say that it is value-free, but it is also naive. Rather, we need to think hard about how our values relate to the world around us. Or at least some of us need to think hard about that sort of thing.

We shouldn’t take for granted that we all have exactly the same moral intuitions. But on the margin some of our fears are I think overwrought. I know of an individual who admits frankly that they are a “blank slate” maximalist because they don’t know how they could sleep or live if many traits had some hereditary component. Similarly, I have met many conservative Christians and Muslims who admit that they would rape, murder and steal if they didn’t believe in God. In other words, if God doesn’t exist they would become psychopaths, because “why not.” This is ludicrous. God doesn’t exist, and they aren’t psychopaths. They may believe that they aren’t sodomizing their sister because the Lord God declared from On High that such behavior is forbidden, but I think that’s ridiculous on the face of it (on the margin there may be some effect of belief in God on behavior by the way, but that’s not what I’m getting at here obviously). Everything may be possible, but everything is not palatable. As for the possibility that humans may differ substantially from individual to individual and group to group, if you acknowledge this one day will you then as a matter of course raise your arms in salute? If so, it is true that humans differ profoundly in matters of moral sense, because I could not comprehend such behavior.

So Papuans, and likely Aborigines, are likely ~7.5% non-neo-African. Does that matter? Do they bleed today where yesterday they did not? In deep matters of substance nothing is different from this moment than before. Let me quote John Hawks:

Our common ancestry as humans goes back to the Early and Middle Pleistocene. The (now multiple) Neandertal genomes and the Denisova genome share genes with some people and not others because of this common ancestry.

In addition, some living people carry even more genes from Neandertals because they have an appreciable fraction of Neandertal ancestry. That makes it nonsensical to talk about “Neandertals and the ancestors of modern humans”. Neandertals are among the ancestors of modern humans.

Just so with Denisova. It’s nonsensical to talk about a three-way split between Neandertals, Denisova and modern humans. We can talk about a population model with a clade separating an ancestral Neandertal-Denisova population from contemporary Africans.

I have to remind myself again and again when I talk to people about these issues that “modern human ancestors” is not a group that excludes these Pleistocene people.

Once we put ourselves into the mode where we are referring to a population model, it is important to recognize the limitations of those models. For example, we cannot presently exclude many kinds of gene flow among these Pleistocene populations. We can understand some limits to the level of gene flow — these populations were highly structured, it wasn’t Pleistocene panmixia. But it is premature to talk about isolation without recognizing the limits of our ability to test these population models.

The difficulty with terminology tells us something very important. A large-scale reorganization of the science of human origins is upon us. The terms we are used to using will, many of them, become obsolete. Some now-obscure terms will become very important.

What we know to be good and true is still good and true. It is a small soul who is so moved by matters of terminology; we should be cautious of allowing that to happen to ourselves. I think now of the fact that both the Romans and Muslims abhorred the idea of the king. The Romans overthrew their monarchy, established a republic, and replaced it with a despotism which was a monarchy in all but name. The Muslims had caliphs, vice-regents of God, and sultans and emirs, who were vice-regents of the caliphs. Despite the glory which is given over to their God the Muslim despotisms were things of men. Domination of the many by one is a matter of substance, not style. Human dignity should not be contingent on details of ancestry. Isn’t that obvious? I thought that was what the 20th century was to some extent all about.

Back to the science. I began with a long historical sketch, viewed through my own personal lens, because probabilities are always filtered through a glass of accreted priors. I was not as shocked as many at the idea of introgression and admixture because Greg Cochran, Henry Harpending, and John Hawks had already predisposed me to think about the plausibility of such phenomena. Additionally, I have always had an interest in conservation genetics, as well as modeling cultural evolution. Such lateral flows are not unknown in those domains. When I first discussed the Neandertal admixture results with Oren Harman last spring he reminded me that one should be cautious of such things; many splashy science stories often don’t pan out. And yet with all due respect to Oren, in this case we do need to observe that there has been a veritable mob of scholars poring over these data. Additionally, this is something old, not something new.

These results will not remain isolated findings with only parochial relevance. I believe these two papers will probably shift the equilibrium orthodoxy in a new direction. Old models and genetic studies will be seen in a new light. Anomalies unconsidered will get a second look. In The New York Times Stanford geneticist Carlos Bustamante seemed to indicate to Carl Zimmer that the hunt was on. Perhaps the human genome is more of a mosaic than we thought?

Finally, one wonders how this was missed. 7.5% is not trivial. And yet a generation of mtDNA and NRY studies have seemingly missed this. I presume that the archaic admixture didn’t show up in STRUCTURE because it’s a stabilized part of the genetic background of Eurasicans and Oceanians. It reminds us of the limitations of interpretation. We know what we know contingent on what we already know. Since we know more, a different set of inferences may now be generated. Though with due humility. Not quite time yet for the hardening of a new orthodoxy.

Personal note: Merry Christmas! Obviously it is time for me to take a break. Best wishes, and let’s make 2011 more informative and data rich. Hopefully we won’t have to wait too long for Otzi’s genome.

Citation: Reich, David, Green, Richard E., Kircher, Martin, Krause, Johannes, Patterson, Nick, Durand, Eric Y., Viola, Bence, Briggs, Adrian W., Stenzel, Udo, Johnson, Philip L. F., Maricic, Tomislav, Good, Jeffrey M., Marques-Bonet, Tomas, Alkan, Can, Fu, Qiaomei, Mallick, Swapan, Li, Heng, Meyer, Matthias, Eichler, Evan E., Stoneking, Mark, Richards, Michael, Talamo, Sahra, Shunkov, Michael V., Derevianko, Anatoli P., Hublin, Jean-Jacques, Kelso, Janet, Slatkin, Montgomery, & Paabo, Svante (2010). Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature. DOI: 10.1038/nature09710

November 17, 2010

Homozygosity runs in the family (or not)

The number 1 gets a lot more press than -1, and the concept of heterozygosity gets more attention than homozygosity. Concretely the difference between the latter two is rather straightforward. In diploid organisms the genes come in duplicates. If the alleles are the same, then they’re homozygous. If they’re different, then they’re heterozygous. Sex chromosomes can be an exception to this because in the heterogametic sex you generally have only one copy of a gene as one of the chromosomes is sharply truncated. This is why human males are subject to X-linked recessive traits at such a great frequency in comparison to females; recessive expression is irrelevant when you don’t have a compensatory X chromosome to mask the malfunction of one allele.

Of course recessive traits are not simply a function of sex-linked traits. Consider microcephaly, an autosomal recessive disease. To manifest the trait you need two malfunctioning copies of the gene, one from each parent. In other words, you exhibit a homozygous genotype with two mutant copies. I suspect that this particularly common context of homozygosity, recessive autosomal diseases, is one reason why it is less commonly discussed outside of specialist circles: there is a whole cluster of medical and social factors which lead to homozygosity which are already the focus of attention. The genetic architecture of the trait is of less note than the etiology of the disease and the possible reasons in the family’s background which might have increased the risk probability, especially inbreeding. In contrast heterozygosity is generally not so disastrous. Even if functionality is not 100%, it is close enough for “government work.” The deleterious consequences of a malfunctioning allele are masked by the “wild type” good copy. The exceptions are in areas such as breeding for hybrid vigor, when heterozygote advantage may be coming to the fore. The details of complementation of two alleles matter a great deal to the bottom line, and the concept of hybrid vigor has percolated out to the general public, with the more informed being cognizant of heterozygosity.

But homozygosity is of interest beyond the unfortunate instances when it is connected to a recessive disease. Like heterozygosity, homozygosity exists in spades across our genome. My 23andMe sample comes up as 67.6% homozygous on my SNPs (which are biased toward ~500,000 base pairs which tend to have population wide variation), while Dr. Daniel MacArthur’s results show him to be 68.1% homozygous across his SNPs. This is not atypical for outbred individuals. In contrast someone whose parents were first cousins can come up as ~72% homozygous. This is important: zygosity is not telling you simply about the state of two alleles, in this case base pairs, it may also be telling you about the descent of two alleles. Obviously this is not always clear on the base pair level; mutations happen frequently enough that even if you carry two minor alleles it is not necessarily evidence that they’re identical by descent (IBD), or autozygous (just a term which denotes ancestry of the alleles from the same original copy). What you need to look for are genome-wide patterns of homozygosity, in particular “runs of homozygosity” (ROH). These are long sequences biased toward homozygous genotypes.
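The percent-homozygous figure quoted above is just a count over called genotypes. A minimal sketch, assuming genotypes arrive as two-letter strings such as "AA" or "AG" and that anything else (no-calls such as "--", indel codes, single-letter hemizygous calls) should be skipped; the function name and toy data are hypothetical:

```python
def homozygous_fraction(genotypes):
    """Fraction of called biallelic SNP genotypes that are homozygous."""
    called = [
        g for g in genotypes
        if len(g) == 2 and all(base in "ACGT" for base in g)
    ]
    if not called:
        raise ValueError("no called genotypes")
    homozygous = sum(1 for g in called if g[0] == g[1])
    return homozygous / len(called)

# Toy example: five called genotypes, three of them homozygous.
toy = ["AA", "AG", "CC", "--", "TT", "CT"]
print(f"{homozygous_fraction(toy):.1%}")  # 60.0%
```

A single genome-wide percentage like this says nothing about where the homozygous sites sit, which is why runs matter more than the raw fraction.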

What ROH can tell you about an individual, and perhaps a population, becomes more clear when you conceptualize in your mind’s eye the basic dynamics which occur in the course of biological replication in diploid sexual organisms. Each individual receives half their autosomal genome from each parent. Though genes are abstractions, individual units at the root of a complex causal sequence which maps to a phenotype, a trait, they’re also physical entities embedded within the structure of DNA. This structure is a physical sequence, whereby you have adjacent base pairs, clusters of which define genes, intergenic regions, exons, introns, promoters, etc. In other words, the whole alphabet soup of molecular genetics. The spatial relationship of genes to each other along the chromosome allowed for linkage mapping decades before the biophysical substrate of DNA was known to be critical to the whole process. Particular sequences of alleles may therefore be inherited together, and form a haplotype. Over the generations the associations of these distinctive alleles in haplotypes dissolve through recombination, a physical process which erodes the structural integrity of chromosomal sequences.

With these basics in mind, let’s move to a specific repulsive example. Imagine a father who impregnates his daughter. Why is this repulsive to us? From a consequential “gene’s eye” perspective the father is suborning the beauty of sexual reproduction whereby genetic variation is mixed & matched across individuals. Colloquially, where the daughter would be 50% of the father genetically, the child of the daughter and her father would be 75% of the father genetically. From a gene-only perspective this may be favorable, as the father is coming closer to cloning himself, but we all know that the rate of breakdown of the “vehicle” in these individuals is high. Why? Inbreeding leads to a relatively massive increase in homozygosity as chromosomal segments identical by descent are paired off against each other. We know that the problem is that a host of nasty recessive diseases are highly likely in inbred individuals.

All humans carry a large load of deleterious alleles. Some of these may be potentially lethal. But like bombs without the trigger a functional copy of the alleles complements and masks the mutant variety and we carry on. Many of these mutants are particular to our family, and some of them are private even to ourselves, the outcome of de novo mutations which make each human a distinctive genetic island (at least until they reproduce and pass on their mutational distinctions). Therefore a man who mixes his own genes together in the act of incest is potentially lighting the fuse whereby these hidden malevolent mutants will explode from being cryptic genetic abnormalities toward full-blown disease monstrosities.

One statistic which would register incest would be ROH; naturally when you have long regions of recently IBD chromosomal segments adjacent to each other you’ll have a lot of homozygosity, since the paired alleles are replica copies. Assuming that an individual with many long ROH can survive and reproduce, over time these massive swaths of homogeneity will be wiped away by mutation and recombination as well as outbreeding. Incest is still arguably a health disaster, but one can imagine the motive genetic engines of evolutionary variation healing the damage over time.
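How quickly recombination chops up these segments can be sketched with a rough rule of thumb, assumed here rather than taken from the papers under discussion: if the two copies of a segment are separated by m meioses in total, the expected autozygous segment length is about 100/m centimorgans. The function name and the example pedigrees are illustrative only.

```python
def expected_segment_cm(meioses):
    """Approximate mean length (in cM) of an autozygous segment whose two
    copies are separated by `meioses` meioses in total.

    Assumes roughly one crossover per Morgan per meiosis and ignores
    crossover interference and mutation.
    """
    if meioses < 1:
        raise ValueError("need at least one meiosis")
    return 100.0 / meioses

examples = [
    ("child of a father-daughter mating", 3),      # 1 direct + 2 via the mother
    ("child of first cousins", 6),                 # 3 up + 3 down
    ("shared ancestor 50 generations back", 100),  # 50 up + 50 down
]
for label, meioses in examples:
    print(f"{label}: ~{expected_segment_cm(meioses):.1f} cM")
```

The point of the sketch is only the trend: recent inbreeding leaves long runs, and old shared ancestry leaves short ones.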

And it doesn’t have to be so extreme. Father-daughter or sibling incest is only a boundary condition. First cousin marriages aren’t nearly as disastrous, the fecundity of British Pakistanis despite higher rates of genetic abnormalities being clear evidence of this. They are certainly more evolutionarily fit than non-Pakistani Brits, who do not reproduce at the clip of 4 children per family. These clans will exhibit more modest levels of ROH because the coefficient of relationship between cousins is only 1/8, as opposed to 1/2 between parents and children or full siblings.
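The fractions above fall out of simple path counting: each connecting path through a common ancestor contributes (1/2) raised to the number of parent-child links on it. A minimal sketch, assuming the common ancestors are not themselves inbred; the function names are hypothetical.

```python
from fractions import Fraction

def relationship(path_lengths):
    """Coefficient of relationship by path counting.

    `path_lengths` holds, for each distinct connecting path between two
    individuals, the number of parent-child links on that path.
    """
    return sum((Fraction(1, 2) ** links for links in path_lengths), Fraction(0))

def offspring_inbreeding(parent_relationship):
    """Inbreeding coefficient of a child of two non-inbred relatives."""
    return parent_relationship / 2

parent_child = relationship([1])       # 1/2
full_siblings = relationship([2, 2])   # one path via each parent: 1/2
first_cousins = relationship([4, 4])   # one path via each shared grandparent: 1/8

# Child of a father-daughter mating: one direct link to the father, plus a
# two-link path through the mother. This is the colloquial "75%" figure and
# ignores the correction for the child's own inbreeding.
incest_child_vs_father = relationship([1, 2])  # 3/4

print(parent_child, full_siblings, first_cousins, incest_child_vs_father)
print(offspring_inbreeding(first_cousins))  # 1/16
```

Using `Fraction` keeps the results exact, so 1/8 and 1/16 print as fractions rather than rounded decimals.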

The figure to the left is from a 2008 paper on ROH in Europeans. Specifically these are Orcadians or part-Orcadians. A population you should be familiar with from the HGDP panel. Orcadians are natives of the Orkney islands just off the north coast of Scotland. Though of somewhat diverse origins, Viking, Scot and Pict, being islanders they’ve developed their own genetic peculiarities because of their isolation. A good rule of thumb is that any body of water is a fearsome barrier to casual gene flow. On the y-axis you see the total number of ROH in the genome of a given individual. I point you to the methods if you are curious as to the exact parameters they specified in their calculation. ROH is assessed over a window of the genome, and naturally one can vary its width, as well as the stringency in registering a particular region as a run or not a run. On the x-axis are the total lengths in terms of base pairs. What you see is a positive correlation between the number of ROH, and the total genomic length of the sequences. Those Orcadians who are genetically more diverse because of non-Orcadian parentage have the least homozygosity in their genomes. Those who are products of recent cousin marriage have the most. But notice a peculiar pattern: there’s a curvilinear trend to the values. In those individuals who presumably have very high inbreeding coefficients the total length of ROH seems to exceed one’s expectation based on just the total number of ROHs. Why? Because they have very long runs of homozygosity indeed. This is just what we’d expect from the sort of process I described earlier, where it takes many generations for the long chromosomal sequences to be broken apart by recombination.
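The two axes of that figure, number of ROH and total ROH length, can be sketched with a deliberately naive run finder. This is not the paper's method: the thresholds `min_snps` and `min_length_bp` are made-up parameters, and unlike real window-based callers this version tolerates no heterozygous or missing calls inside a run.

```python
def find_roh(snps, min_snps=3, min_length_bp=1000):
    """Return (start, end) positions of runs of consecutive homozygous SNPs.

    `snps` is a position-sorted list of (position_bp, genotype) pairs for
    one chromosome, with genotypes given as two-letter strings.
    """
    runs, current = [], []

    def close_run():
        # Keep the run only if it passes both thresholds, then reset.
        if len(current) >= min_snps and current[-1] - current[0] >= min_length_bp:
            runs.append((current[0], current[-1]))
        current.clear()

    for position, genotype in snps:
        if genotype[0] == genotype[1]:
            current.append(position)
        else:
            close_run()
    close_run()
    return runs

def summarize(runs):
    """Number of runs and their summed length in base pairs."""
    return len(runs), sum(end - start for start, end in runs)

toy = [
    (100, "AA"), (600, "GG"), (1500, "TT"), (2000, "AG"),
    (2500, "CC"), (2700, "CC"), (5000, "AT"),
    (6000, "GG"), (7000, "AA"), (9000, "CC"),
]
print(summarize(find_roh(toy)))  # (2, 4400)
```

Tightening or loosening the two thresholds changes both counts, which is why the exact parameters in the methods matter when comparing studies.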

Before I get you too excited about the genetics of European homozygosity, let’s take a wider view. Some of the same researchers who published the paper above have come out with a set of results which survey the world. Genomic Runs of Homozygosity Record Population History and Consanguinity:

The human genome is characterised by many runs of homozygous genotypes, where identical haplotypes were inherited from each parent. The length of each run is determined partly by the number of generations since the common ancestor: offspring of cousin marriages have long runs of homozygosity (ROH), while the numerous shorter tracts relate to shared ancestry tens and hundreds of generations ago. Human populations have experienced a wide range of demographic histories and hold diverse cultural attitudes to consanguinity. In a global population dataset, genome-wide analysis of long and shorter ROH allows categorisation of the mainly indigenous populations sampled here into four major groups in which the majority of the population are inferred to have: (a) recent parental relatedness (south and west Asians); (b) shared parental ancestry arising hundreds to thousands of years ago through long term isolation and restricted effective population size (Ne), but little recent inbreeding (Oceanians); (c) both ancient and recent parental relatedness (Native Americans); and (d) only the background level of shared ancestry relating to continental Ne(predominantly urban Europeans and East Asians; lowest of all in sub-Saharan African agriculturalists), and the occasional cryptically inbred individual. Moreover, individuals can be positioned along axes representing this demographic historic space. Long runs of homozygosity are therefore a globally widespread and under-appreciated characteristic of our genomes, which record past consanguinity and population isolation and provide a distinctive record of the demographic history of an individual’s ancestors. Individual ROH measures will also allow quantification of the disease risk arising from polygenic recessive effects.

Their data set consists of the HGDP sample populations, so you naturally have the broad geographic clusters such as Africa, Europe, West Asia, Central/South Asia, East Asia, Oceania, and the New World. Two big dynamics are superimposed upon each other in the patterns of ROH: “deep history” demographic processes such as bottlenecks and population expansions, and cultural anthropological patterns which we see around us such as cousin marriage within inbred clans. To find the former you need to survey the genome finely. In contrast the latter leaves pretty obvious signs genomically in the form of very long ROH, as well as clusters of recessive diseases.

The first figure shows the distribution of different lengths of ROH by population:


Here’s the take away:

- Oceanians have many short ROH, but as you increase the length of ROH threshold they are not exceptional at all

- The New World samples persist in having a disproportionate number of ROH no matter the length, though the number does drop as you increase length threshold. This makes sense, the human genome is of finite length and you can only have so many very long ROHs

- The West Asian and Central/South Asian populations seem to have more long ROHs than the other Eurasian or African groups, though they’re not exceptional in the lowest category

- The Africans have the least ROH, especially in the category of very short runs

Before I comment on these patterns in detail, let’s quickly check out the next figure. It looks at Africans only, but divides the sample into those which are hunter-gatherers and those which are agriculturalists.


The hunter-gatherers have more, and longer, ROH than the agriculturists. Why? The answer in large part explains the geographical patterns as well: the agriculturists have a larger long term effective population. Effective population just refers to the proportion of the population which contributes genetically to the next generation. Small effective populations mean a lot of genetic drift because of increased sample variance, and tend to converge upon consanguinity. If your tribe is small enough the only people you may find to marry are your cousins. As I noted above, this will produce long ROH as individuals will have descent through multiple lines from the same ancestor, increasing the probability of autozygosity greatly. The same process explains why West Asians and Central/South Asians are enriched for long ROH relative to other groups excepting Amerindians. Here’s a map from Consang.net:


Many Muslim societies practice cousin marriage, and many Muslims even argue that it is the Islamic practice (Muhammad married one of his cousins among his many wives. Strangely somehow these Muslims don’t argue that it is also the Muslim custom to marry old rich widows, though some do argue for the importance of marrying barely pubescent girls). Additionally, in India many Hindu groups in the South practice consanguineous marriages, including uncle-niece marriage. This is all occurring now, and so produces signatures of long ROH in many families. The final figure breaks down the individuals from selected populations, with again the y-axis being the number of ROH and x-axis being total length of the ROH:


The population sets are representative of broader geographic clusters. The Karitiana are from the Amazon, the Mandenka from Senegal, and the Balochi from Pakistan. If you don’t know where the French and Japanese are from, I would ask you never leave a comment on this weblog. Notice a few French, Mandenka, and Japanese individuals deviated away from their main clusters. These are cryptically inbred, perhaps their parents were cousins, or some of their grandparents were cousins. In contrast the Baloch have a wide range in terms of length of ROH; this is typical of populations where a large proportion of individuals are the products of cousin marriage, but many are not. The fact that individuals would exhibit a large variance of expected relatedness between their parents means that their own inbreeding coefficients and the genomic correlates (in this case ROH) would also vary greatly. The same parameter is operative among the Karitiana, an endangered ethnic group which presumably has a small “mate market” available to each individual.

So what about the Papuans? Their cluster is tight, and they don’t have nearly the total length of ROH as the Amazonian tribe. But remember that in the first figure they had many short ROH. A plausible explanation for this is that the Papuans went through an ancient bottleneck, from which they have expanded. The bottleneck increased genetic drift and so generated highly common haplotype blocks which combined to produce runs of homozygosity. But over time these blocks would have disintegrated through mutation and recombination. ROH in the Papuans then is simply a shadow of demographic events past, while ROH in Baloch is evidence of demographic events present.
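The link between a bottleneck and background autozygosity can be sketched with the textbook idealized-population recursion, in which the chance that two alleles are not yet identical by descent shrinks by a factor of (1 - 1/(2Ne)) each generation. The population sizes and generation counts below are invented for illustration and are not estimates for any real group.

```python
def autozygosity_after(ne_by_generation):
    """Expected autozygosity accumulated by drift in an idealized
    (Wright-Fisher) population, given its effective size per generation.

    Ignores mutation, migration and recombination.
    """
    not_ibd = 1.0
    for ne in ne_by_generation:
        not_ibd *= 1.0 - 1.0 / (2.0 * ne)
    return 1.0 - not_ibd

# Hypothetical histories over 120 generations.
constant_large = [10_000] * 120
with_bottleneck = [10_000] * 50 + [100] * 20 + [10_000] * 50

print(f"constant:   {autozygosity_after(constant_large):.4f}")
print(f"bottleneck: {autozygosity_after(with_bottleneck):.4f}")
```

Twenty generations at a small size dominate the total, even though the population is large before and after, which is the sense in which short ROH can be a shadow of demographic events past.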

These two balancing realities are starkly illustrated in the supplements when you drill down to the South and Central Asian groups. In the figure it is clear that the group with the consistently highest number of ROH are the Kalash. This makes sense. The Kalash are a genetic isolate because they’re traditionally a pagan non-Muslim group isolated in the remote Chitral region of Pakistan. Because Muslims can not join their tribe, for over a thousand years the gene flow has been unidirectional, as the Kalash convert to Islam and so assimilate into the broader Pakistani society. In contrast the other Pakistani groups have a huge variance in the total amount of ROH. The individuals with the least ROH in both total length and number in the sample are Baloch, Brahui and Makrani, as are some of the individuals with the highest values on these statistics! While the Kalash have been slowly and consistently ground down by the pressure of small population size, the Baloch, Brahui, and Makrani, are subject to the hammer-blows of several generations of first cousin marriages in inbred clans. These repeated marriages across the generations rapidly increase the ROH as first cousins may be more closely related to each other genetically than they are anthropologically.

In the pre-genomic era it was simple to calculate inbreeding. Just look at pedigrees. From this you derived the inbreeding coefficient. The key is to remember that the relationship of one’s sum totality of ancestors was critical in this calculation. In the USA marriages between first cousins occur between individuals whose grandparents are not usually related. But in other societies the generation of the grandparents, and perhaps great-grandparents, may also have been cousins. But pedigrees have limits, and may miss deep ancestry. The figure to the left, from the first paper I referenced, shows the relationship of the proportion of an individual’s ancestry which is identical by descent as calculated by genomic (ROH) methods on the y-axis and conventional ones on the x-axis (pedigree). There’s an obvious correlation, but observe the slight bias toward values above the line of best fit, and the fact that the y values are higher than the x. Genomic estimates capture common ancestry which lay outside the purview of conventional genealogy!
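The two estimates being compared can be sketched side by side. The pedigree version uses Wright's path formula; the genomic version is assumed here to be total ROH length divided by the length of the autosomal genome surveyed. The ROH lengths and genome length in the example are round placeholder numbers, not values from the paper.

```python
def f_pedigree(loops):
    """Wright's inbreeding coefficient from pedigree loops.

    Each loop is (n1, n2, f_ancestor): generations from each parent back
    to a common ancestor, and that ancestor's own inbreeding coefficient.
    """
    return sum(0.5 ** (n1 + n2 + 1) * (1.0 + f_a) for n1, n2, f_a in loops)

def f_roh(roh_lengths_bp, autosome_length_bp):
    """Genomic estimate: fraction of the surveyed autosomes in ROH."""
    return sum(roh_lengths_bp) / autosome_length_bp

# Child of first cousins: two shared grandparents of the parents,
# each two generations above both parents.
first_cousin_child = f_pedigree([(2, 2, 0.0), (2, 2, 0.0)])
print(first_cousin_child)  # 0.0625

# Placeholder genomic data for a hypothetical individual.
print(f_roh([5_000_000, 15_000_000], 2_000_000_000))
```

A pedigree only goes back as far as the records do, so `f_pedigree` sees nothing of loops that close further back, while `f_roh` picks them up as additional short runs.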

The implications of these patterns are two-fold: first, looking backward toward human history, and second, forward toward biomedical science. Patterns of ROH here are roughly in line with a serial bottleneck model Out of Africa; the further populations are from Africa the more short ROH they have. African populations have the least of these because of their larger long term effective population size, and relative insulation from the bottlenecking process. A shorter term phenomenon is that of consanguineous marriage patterns, whether conscious and culturally normative (as in the Muslim world and parts of South Asia), or due to demographic constraint, as is the case among hunter-gatherers. These two processes together are relevant because of the prominence of recessive diseases within the domain of medical genetics. Clearly very long ROH is a sign of inbreeding, and so a likely higher susceptibility of an individual to a host of ailments. But the authors note that the sum effect of many short ROH may also be problematic, especially due to the fact that these together may form the preponderance of the ROH within the genomes of many populations.

So far I’ve basically alluded to demographic history, and how it shapes the genome through processes which are fundamentally neutral and stochastic. Inbreeding itself can be thought of as a form of super-charged drift, as the long term effective population of a breeding group collapses in on itself. But what about natural selection? I decided to take a closer look at the ROH of Dr. Daniel MacArthur of Genomes Unzipped. One of his longest regions is on Chromosome 2; it is ~2 Mb in length, and runs from position 134606441 to position 136593184. In 23andMe there’s a position which I think might explain this: 136325116. That’s the number for rs4988235 in the 23andMe data file. Variation on this SNP tracks lactase persistence in Europeans. Dr. Daniel MacArthur has the genotype for lactase persistence in the homozygote form. Are we seeing the long haplotype associated with lactase persistence here in this long ROH which rose rapidly in frequency in the last 10,000 years because of natural selection? In general the parameters outlined in the paper satisfy the broad sketch of human history, but there may be interesting detail on the margins left out of the picture.
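The coordinate check is easy to reproduce. A small sketch using only the positions quoted above; the helper name and the tuple layout are hypothetical.

```python
def runs_containing(position, runs):
    """Return the (start, end) runs that contain `position` (inclusive)."""
    return [(start, end) for start, end in runs if start <= position <= end]

# Chromosome 2 coordinates as quoted in the text.
roh = (134_606_441, 136_593_184)
rs4988235_position = 136_325_116

length_bp = roh[1] - roh[0]
print(f"ROH length: {length_bp:,} bp (~{length_bp / 1e6:.2f} Mb)")
print("SNP inside ROH:", bool(runs_containing(rs4988235_position, [roh])))
```

The run works out to 1,986,743 bp, and the SNP sits a little under 270 kb from its right-hand edge.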

Finally, let’s go back to heterozygosity vs. homozygosity. I recently watched the documentary “Is it Better to be Mixed Race?” Setting aside the obvious reality that this sort of program reflects the Zeitgeist of the era (it is rather obvious that a Victorian scientist could have produced a different documentary, even with the same evidence), near the end there is a comparison of ROH across populations and individuals. The comparison was actually done by the research group which published the paper I just reviewed. If you jump to 38 minutes into the film and just watch they’ll lay out the results, but I’ll tell you what they found. They compared two European men, a South Indian woman, and a man whose father was English and mother Nigerian. The European men had expected levels of homozygosity; on the higher end. The South Indian woman had lower levels of aggregate homozygosity. This should be expected, as India is relatively genetically diverse on a pan-Eurasian scale. Finally, the mixed race male had almost no homozygosity to speak of. The principal investigator admitted that out of 5,000 individuals whom he had tested and analyzed this was the most extreme result, and he had to recheck it. Why? Three factors:

- The mother is Nigerian, which is a population which is relatively genetically diverse

- The genetic distance between the father and mother is rather high

- Finally, because the man is a first generation hybrid on all the loci where Africans and Europeans tend to differ he’ll be much more likely to be heterozygous
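The third factor can be made concrete with a toy calculation. This is only a sketch under textbook assumptions (one biallelic locus, random mating within each parental population), and the allele frequencies are hypothetical numbers I picked for illustration, not values from the documentary or the paper:

```python
def het_within(p):
    """Chance of being heterozygous at a biallelic locus when both parents
    come from one randomly mating population with allele frequency p."""
    return 2 * p * (1 - p)


def het_first_generation(p_a, p_b):
    """Chance of being heterozygous when one parent comes from population A
    (allele frequency p_a) and the other from population B (frequency p_b)."""
    return p_a * (1 - p_b) + p_b * (1 - p_a)


# Hypothetical locus where the two populations differ sharply in frequency.
p_a, p_b = 0.9, 0.1
print("within A:", het_within(p_a))
print("within B:", het_within(p_b))
print("first generation A x B:", het_first_generation(p_a, p_b))
```

When the two frequencies are equal the second function reduces to the first, so the gain in heterozygosity comes entirely from loci where the parental populations differ.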

I’ll let the authors have the last word:

Long ROH are a neglected feature of our genome, which we have shown here to be universally common in human populations and to correlate well with demographic history. ROH are, however, only partially predictable from an individual’s background (due to the stochastic nature of inheritance). As well as conferring susceptibility to recessive Mendelian diseases, ROH are also potentially an underappreciated risk factor for common complex diseases, given the evidence for a recessive component in many complex disease traits…they will allow quantification of the risk arising from recessive genetic variants in different populations.

Citation: Mirna Kirin, Ruth McQuillan, Christopher S. Franklin, Harry Campbell, Paul M. McKeigue, & James F. Wilson (2010). Genomic Runs of Homozygosity Record Population History and Consanguinity PLoS ONE : 10.1371/journal.pone.0013996

Image Credit: Allison Stillwell

November 8, 2010

What intra- & inter- population genetic variance tells us

Filed under: Admixture,Dodecad,Genetic History,Genetics,Genomics,History,Select Post — Razib Khan @ 2:30 pm

The figure to the left is a composite merged from two different papers. One analyzes the patterns of genetic variation within African Americans, and the other the patterns within the East Turkic ethnic group, the Uyghurs. The bar plots show the ancestral element which is similar to two parent populations which resemble Europeans and Africans or East Asians. Looking at total aggregate ancestral quanta we infer that African Americans are on the order of 15-25% European in ancestry, and 75-85% African. Uyghurs seem to be a composite in even measure of a European-like group, and an East Asian-like group. This makes total sense phenotypically; most African Americans look more African, while Uyghurs seem to exhibit a phenotype on average which spans the middle-range between West and East Eurasians.

But we’re clearly missing something when we focus purely on a population level statistic. Each “slice” of the bar plot actually represents an individual. Note the contrast between African Americans and Uyghurs. There is relatively little inter-individual variation among Uyghurs, while there is a great deal of such variation among African Americans. Why? Population geneticists have looked at linkage disequilibrium in both African Americans and Uyghurs, and inferred that the former went through an admixture phase much more recently than the latter. Though you don’t really have to be a population geneticist to have known that about African Americans. The ethnogenesis of the group African Americans as a cultural entity occurred in the period between 1650 and 1850. Genetically they are a compound of African, European, and to some extent Native American, ancestry. For the Uyghurs we have thinner textual evidence, but the visual and genetic data point to a “western” Indo-European speaking population in the Tarim basin before the arrival of the Turks sometime in the second half of the first millennium A.D. The assumption is that after the initial admixture event and the absorption of the pre-Turkic substrate there was no population substructure. Over time the two components distributed themselves evenly across the population over a period of 1,000-1,500 years.
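To get a feel for why time evens things out, here is a deliberately crude toy model. It is not the linkage disequilibrium method the papers used. I am assuming a single pulse of admixture, random mating afterward, and that a child's ancestry fraction is simply the average of its two parents' fractions, in which case the variance across individuals halves every generation. The function names and the 25-year generation time are my own assumptions:

```python
def ancestry_variance(m, generations):
    """Variance of individual ancestry fraction across the population,
    'generations' after a single admixture pulse with mixing proportion m.
    Generation 0 is the founders, who are either 0 or 1 for the component."""
    return m * (1 - m) / (2 ** generations)


def generations_elapsed(years, years_per_generation=25):
    """Whole generations in a span of years (generation time is an assumption)."""
    return years // years_per_generation


# An even mix about 1,000 years ago versus a 20% component mixed in 150 years ago.
old_mix = ancestry_variance(0.5, generations_elapsed(1000))
recent_mix = ancestry_variance(0.2, generations_elapsed(150))
print("old, even mix:", old_mix)
print("recent mix:", recent_mix)
```

Real genomes never get quite that uniform, since ancestry is inherited in finite chunks and admixture is rarely a single pulse, but the direction of the effect is the point: old admixture leaves individuals looking alike, recent admixture does not.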

From this we can infer that patterns of individual variation within populations, as well as between closely related populations, can tell us a great deal. Today the Dodecad Ancestry Project posted a file with the population ancestries broken down by individuals. Looking at this sort of fine-grained data, patterns can jump out based on what you already know. Below is a slide show I created which highlights some patterns of interest.

The first slide is confirmation of what we already know, or should expect. The Burusho are a linguistic isolate in the mountains of northern Pakistan. Their lack of inter-individual variation within the population is suggestive of long term isolation, as is common in mountainous regions. The very fact that they speak a linguistic isolate should lead us to expect this, as the flow of culture and genes often correlate. The Sindhi are the dominant Indo-Aryan speaking ethnic group of the lower Indus watershed. Because of their geographic position they have been conquered many times, being under Persian, Arab, and Turkic rule. Genetically they’re very similar to the Burusho, but observe that there are two individuals with substantial West African ancestry. The presence of black Africans in the armies of the Muslims who conquered the subcontinent is well known, and the origin of the Indian Siddi community. Some of the Sindhis also have appreciable ancestral components which are probably derived from Muslims from West Asia, the “Southern European” and “Southwest Asian” ancestral element which the Burusho lack.

Next you see a comparison between Assyrians and Armenians. These two groups seem very similar, and both have deep textually attested roots in the Middle East. The Armenians date to the Persian Empire, at least, while the Assyrians are clearly the descendants of the indigenous Semitic population of Mesopotamia before the arrival of the Arabs. In the Muslim period many of them retreated to mountainous areas of northern Iraq, before emigrating to the cities of modern Iraq with the relaxation of their status as marginalized dhimmis. Today the Assyrian community is scattered across the world. The portion which adheres to the Church of the East was nearly totally extirpated from Iraq early in the 20th century, while that which is in union with the Roman Catholic Church, the Chaldeans, is currently leaving Iraq en masse.

But the Armenians are a far different case in terms of their interactions with the rest of the world. They have been present as “middlemen minorities” as far east as Southeast Asia, and north into the Russian Empire, and south into the Muslim world. The most parsimonious explanation for the individuals with Northern European ancestry is that like Kim Kardashian they are products of mixed-marriages, but I wonder if the centuries of the Armenian Diaspora have resulted in a change in the gene frequencies in the Armenian homeland in part because of back-migration. With larger data sets this will be testable, as well as the hypothesis that Diaspora communities are admixed while the Armenians in Armenia proper are not.

The third slide compares Scandinavians, Finns, and Lithuanians. Scandinavia refers to the Germanic speaking lands of Norden. Lithuania has historically been just outside the arc of Nordic influence (in contrast to Estonia and Latvia), so it can serve as a Northern European control. I believe some of the Finnish samples in Dodecad are related, so one shouldn’t make too much of them. But, contrast the relatively constant level of Southern European in the Scandinavian samples, and their variance in the Finnish ones. Inversely, the Finns show relative constancy of the “Northeast Asian” proportion, while the Scandinavians vary, with some lacking it. This is likely evidence of recent population exchange, and cultural switching. Finland was under Swedish rule for most of the past 1,000 years, and there still remains a large ethnic Swedish population in Finland, and an ethnic Finnish population in Sweden. Some families in Finland likely switched from Finnish to Swedish to Finnish within the last 500 years. The Southern European and West Asian elements more prominent in the Scandinavians tend to increase as one goes south in Europe, with the former modal in Sardinia (in fact, Sardinians are nearly fixed for the Southern European component), and the latter more prominent among southeast European groups. Geography may then explain why the Lithuanians have similar amounts of the West Asian, but less of the Southern European.

Finally we compare Turks, Greeks, and Cypriots. The historical ethnography strongly implies that the major component of Anatolian Turkish ancestry is Greek and Armenian. A broad similarity to the Greeks here is rather clear (with an elevated West Asian component probably from the Armenian ancestry). But notice the differences. There is a consistent East and Northeast Asian component of ancestry among Turks which is lacking in the Greeks. Since the origin of the Turks is in what would today be termed Greater Mongolia, this makes sense. What surprised me though is the presence of a South Asian component among the Turks. This is where looking at individual level results yields results; I’d assumed that like the Romanians the South Asian element was due to a few assimilated Roma. That seems unlikely now, it’s too evenly distributed. So what then? I think here looking at the Uyghur plot illuminates this for us. I don’t know what to make of the South Asian component which you can find in the Uyghur, and even to a trace extent, but again consistent, among the Chuvash, who inhabit the South Urals. Some readers have long claimed that some of the West Eurasian Uyghur ancestry was somehow connected to South Asia, and to be honest I’ve kind of seen that in other HGDP bar plots, but ignored it as of secondary importance. The Turkic group to the north and east of the Uyghurs, the Yakut, totally lack it. From what little we know it seems that the Turks pushed west to Europe, the Middle East, and South Asia, via what is today Xinjiang and Kazakhstan. The existence of this South Asian element in the Turks of Anatolia may be because of their sojourn in this region. There were Iranian speaking Indo-Europeans in Xinjiang, and certainly in Central Asia. Additionally we know historically that northwest India was connected to Xinjiang culturally, as some Indians arrived in China after a period of residence in Xinjiang. But instead of an “Out of South Asia” event I think what we may be looking at is part of the old “Ancient North Indian” genetic variation which pushed into South Asia from the north, and was eventually overlain in Central Asia with other components. I had assumed that the South Asian component among the Finns was noise or Kale, but perhaps it could be that.

Then there is Cyprus. Today the island is ethnically divided between Greek Cypriots and Turkish Cypriots. But in the Bronze Age Cyprus seems to have had a civilization with a close connection with the Near East, in particular Egypt. Sometime between the Bronze Age and the Classical Era it became an outpost of Greece. But notice the near total absence of Northern European among the Cypriots. Like the people of Sardinia, but unlike Sicily, Cyprus is relatively far from the Eurasian mainland. So how did Cyprus become Greek? If the Greeks always had a noticeable Northern European component, or at least during the Bronze Age, that would indicate that the Cypriots are a case of cultural diffusion and emulation of a small Greek elite which arrived during the migrations of the Sea Peoples. Or, the Northern European element could be due to admixture with the Slavic peoples who arrived in Greece after the collapse of the East Roman frontier in the 6th century. Or it could be a combination of both. In any case, the Cypriots look most like the Syrians genetically, though the Syrians seem to have a lot more trace exogenous components.

There’s a lot more one could say. I invite readers to download the RAR file with the bar plots. I will leave you with one last comparison, without comment:


Image Credit: Tocharian Buddhist monk of European appearance, and Kim Kardashian, by Luke Ford

October 18, 2010

Sex with thee and the last woman

A quintessentially sexy topic in biology is the origin of sex. Not only are biologists interested in it, but so is the public. Of Matt Ridley’s older books it is predictable that The Red Queen has the highest rank on Amazon. We humans have a fixation on sex, both in our public norms and our private actions. Why?

Because without a fixation on sex we would not be here. Celibates do not inherit the earth biologically. This answer emerges naturally from a Darwinian framework. And yet more deeply still: why sex for reproduction? Here I allude to the famous two-fold cost of sex. In dioecious species you have males and females, and males do not directly produce offspring. The increase of the population is constrained by the number of females in such lineages (male gametes are cheap). There is no such limitation in asexual lineages, where every individual can contribute to reproductive “primary production.” Additionally, the mating dance is another cost of sex. Individuals expend time and energy seeking out mates, and may have to compete and display for the attention of all. Why bother?

The answer on the broadest-scale seems to be variation. Variation in selective pressures, and variation in genes. Sex famously results in the shuffling of genetic permutations through recombination and segregation. In a world of protean change where one’s genes are critical to giving one the edge of fitness this constant flux of combinations results in more long term robusticity. What clones gain in proximate perfection, they lose when judged by the vicissitudes of the pressures of adaptation. In the present they flourish, but in the future they perish. Sex is the tortoise, clonal reproduction is the hare.

And yet science is more than just coarse generalities; biology especially so. The details of how sex emerges and persists still remain to be fleshed out. The second volume of W. D. Hamilton’s collected papers, Narrow Roads of Gene Land, is the largest. Mostly because it was not edited appropriately (he died before it could be). But also perhaps because it is the volume most fixated upon the origin and persistence of sex, which is a broad and expansive topic.

A new paper in Nature tackles sex through experimental evolution. In many ways the answer it offers to the question of sex is old-fashioned and straightforward. Higher rates of sex evolve in spatially heterogeneous environments:

The evolution and maintenance of sexual reproduction has puzzled biologists for decades…Although this field is rich in hypotheses…experimental evidence is scarce. Some important experiments have demonstrated differences in evolutionary rates between sexual and asexual populations…other experiments have documented evolutionary changes in phenomena related to genetic mixing, such as recombination…and selfing…However, direct experiments of the evolution of sex within populations are extremely rare…Here we use the rotifer, Brachionus calyciflorus, which is capable of both sexual and asexual reproduction, to test recent theory…predicting that there is more opportunity for sex to evolve in spatially heterogeneous environments. Replicated experimental populations of rotifers were maintained in homogeneous environments, composed of either high- or low-quality food habitats, or in heterogeneous environments that consisted of a mix of the two habitats. For populations maintained in either type of homogeneous environment, the rate of sex evolves rapidly towards zero. In contrast, higher rates of sex evolve in populations experiencing spatially heterogeneous environments. The data indicate that the higher level of sex observed under heterogeneity is not due to sex being less costly or selection against sex being less efficient; rather sex is sufficiently advantageous in heterogeneous environments to overwhelm its inherent costs…Counter to some alternative theories…for the evolution of sex, there is no evidence that genetic drift plays any part in the evolution of sex in these populations.

I’m not too familiar with B. calyciflorus, but it seems that it is facultatively sexual. Given the appropriate environmental cues (high densities, quorum sensing) some females can produce offspring which can have sex. The image to the left is from the supplements, and shows the potential life cycles of this organism. Amictic in this context means individuals who produce diploid eggs which can not be fertilized. These eggs give rise to females parthenogenetically. The divergence between the two is when amictic females produce mictic females. These females produce eggs which are haploid, and can be fertilized. Those which are fertilized produce amictic females. Those which are not fertilized produce males.

Apparently in this species a propensity toward producing mictic females under stress conditions is heritable. Therefore, a propensity toward greater or less sexuality is heritable. There are within a given population both sexually and asexually reproducing individuals. Unlike humans, or bdelloid rotifers, B. calyciflorus is not locked into a particular style of reproduction, but can shift its strategy conditionally upon changes in the environment. Therefore it is an ideal organism upon which to test theories of the origin and maintenance of sex. For them sexual reproduction is an option, and insight can be gained by exploring the conditions under which that option is exercised.

The two parameters they shifted in this experiment were the quality of nutrition (high vs. low) and the rate of migration within a set of populations (~1% vs. ~10%), for which the N was ~10,000. There were two treatments:

- Homogeneous environments of high-quality and low-quality food

- Heterogeneous environments where high and low-quality food zones existed adjacent to each other with two populations

The populations within these treatments were derived from wild lineages with a relatively high proportion of sexually reproducing individuals. Previous work confirmed that sexual reproduction, or propensity to reproduce sexually, was heritable. So if the environment favored sexuality or asexuality the frequencies should change over time as there is heritable variation for the trait within the rotifer populations. In other words, sex could be a target of natural selection.

In the figure below you see two panels. The first, a, shows populations subject to 10% transfer per generation. The second, b, 1% transfer per generation. This is the migration parameter, which is an order of magnitude higher in the first than the second panel. Triangles are heterogeneous environments, while circles represent homogeneous ones. The x-axis is the time parameter. At week 14, the vertical line, all populations were mixed together and reassigned.


It’s immediately obvious that the proportion of sexually reproducing organisms is dropping rapidly in the homogeneous environments vis-a-vis the heterogeneous environments. Interestingly the shift in the migration parameter does not have much of an effect. In the first 14 weeks the propensity for sex drops even in the heterogeneous environment from the wild-type baseline. But once the lineages are mixed together and allowed to evolve from their laboratory baseline you see that sex has a positive benefit in the heterogeneous environment, shifting back up to an equilibrium state.

The authors note that the equilibrium propensity for sexual reproduction of rotifers seems higher in the wild than in the laboratory. That doesn’t seem so surprising, presumably there are many more variables which shift in the wild than in the laboratory, where conditions are consciously controlled to tease out independent predictors. The most common model for the maintenance of sex today in terms of the ultimate driver is host-pathogen co-evolution. Sex being the only way that slow-reproducing complex organisms can keep up with prolific asexual pathogens. The rotifers may be subject to this dynamic, as well as spatial heterogeneity. It does not seem to me that nature should be in the business of enforcing a monopoly on the supply of proteanism.

What does this mean in the long-term? Well, it may be that sex, and males, are adaptations to an unpredictable and wild world whose caprice we can not account for. As humanity, or perhaps more generally sentient beings, begin to control nature and buffer themselves artificially from the volatile fluctuations, will we need sex and males? At the end of history when conditions are stable, and all that is before us is the terminus of heat death, perhaps what awaits us are a series of mindless and boring clonal lineages, perfectly adapted to turning nutrients into flesh, generation to generation.

Citation: Becks L, & Agrawal AF (2010). Higher rates of sex evolve in spatially heterogeneous environments. Nature PMID: 20944628

Image Credit: ChrisO, Wikimedia Commons

October 4, 2010

The adaptive space of complexity

Evolution means many things to many people. On the one hand some scholars focus on time scales of “billions and billions,” and can ruminate upon the radical variation in body plans across the tree of life. Others put the spotlight on the change in gene frequencies on the scale of years, of Ph.D. programs. While one group must glean insight from the fossil remains of trilobites and ammonites, others toil away in dimly lit laboratories breeding nematodes and fruit flies, generations upon generations. More recently a new domain of study has been focusing specifically on the arc of animal development as a window onto the process of evolution. And so forth. Evolution has long been dissected by an army of many specialized parts.

And yet the core truth which binds science is that nature is one. No matter the disciplinary lens which we put on at any given moment we’re plumbing the same depths on some fundamental level. But what are the abstract structures of those depths? Can we project a tentative map of the fundamentals before we go exploring through observation and experiment? That’s the role of theoreticians. Charles Darwin, R. A. Fisher, and Sewall Wright. Evolution is a phenomenon which is on a deep level an abstraction, though through objectification we speak of it as if it was as concrete as the frills of the Triceratops. As an abstraction it is open to mathematical formalization. Models of evolution may purport to tell us how change over time occurs in specific instances, but the ultimate aim is to capture the maximum level of generality possible.

Though the original mathematical theoreticians of evolution, in particular R. A. Fisher and Sewall Wright, were critical in the formation of the Modern Neo-Darwinian Synthesis, their formal frameworks were not without critics from within the mainstream. Ernst W. Mayr famously rejected “beanbag genetics,” the view propounded specifically by R. A. Fisher and J.B. S. Haldane in England that a model of evolution could be constructed from singular genetic elements operating independently upon traits. Mayr, as an ecologist and naturalist, believed that this framework lacked the essential integrative or holistic aspect of biology as it manifested in the real world. Selection after all operated proximately on the fitness of the whole organism. We’ve come a long way since those debates. One of the problems with the earlier disputes is that they were not sufficiently informed by the empirical evidence because of the primitive nature of experimental and observational evolutionary biology. Molecular biology changed that, and now the rise of genomics has also become a game changer. Genomics gets at the concrete embodiment of evolutionary change at its root, the structure and variation of the genomes of organisms.

A new paper in PNAS is a nice “mash-up” of the old and the new, Genomic patterns of pleiotropy and the evolution of complexity:

Pleiotropy refers to the phenomenon of a single mutation or gene affecting multiple distinct phenotypic traits and has broad implications in many areas of biology. Due to its central importance, pleiotropy has also been extensively modeled, albeit with virtually no empirical basis. Analyzing phenotypes of large numbers of yeast, nematode, and mouse mutants, we here describe the genomic patterns of pleiotropy. We show that the fraction of traits altered appreciably by the deletion of a gene is minute for most genes and the gene–trait relationship is highly modular. The standardized size of the phenotypic effect of a gene on a trait is approximately normally distributed with variable SDs for different genes, which gives rise to the surprising observation of a larger per-trait effect for genes affecting more traits. This scaling property counteracts the pleiotropy-associated reduction in adaptation rate (i.e., the “cost of complexity”) in a nonlinear fashion, resulting in the highest adaptation rate for organisms of intermediate complexity rather than low complexity. Intriguingly, the observed scaling exponent falls in a narrow range that maximizes the optimal complexity. Together, the genome-wide observations of overall low pleiotropy, high modularity, and larger per-trait effects from genes of higher pleiotropy necessitate major revisions of theoretical models of pleiotropy and suggest that pleiotropy has not only allowed but also promoted the evolution of complexity.

The basic thrust of this paper is to test older theoretical models of evolutionary genetics and their relationship and dependence on pleiotropy against new genomic data sets. In The Genetical Theory of Natural Selection R. A. Fisher proposed a model whereby all mutations affect every trait, and the effect size of the mutations exhibited a uniform distribution. Following in Fisher’s wake the evolutionary geneticist H. Allen Orr published a paper ten years ago, Adaptation and the cost of complexity, which argued that “…the rate of adaptation declines at least as fast as n^-1, where n is the number of independent characters or dimensions comprising an organism.” This is the “cost of complexity,” which lies at the heart of this paper in PNAS.

To explore these questions empirically the authors looked at five data sets:

- yeast morphological pleiotropy, is based on the measures of 279 morphological traits in haploid wild-type cells and 4,718 haploid mutant strains that each lack a different nonessential gene (this also yielded quantitative measures)

- yeast environmental pleiotropy, is based on the growth rates of the same collection of yeast mutants relative to the wild type in 22 different environments

- yeast physiological pleiotropy, is based on 120 literature-curated physiological functions of genes recorded in the Comprehensive Yeast Genome Database (CYGD)

- nematode pleiotropy, is based on the phenotypes of 44 early embryogenesis traits in C. elegans treated with genome-wide RNA-mediated interference

- mouse pleiotropy, is based on the phenotypes of 308 morphological and physiological traits in gene-knockout mice recorded in Mouse Genome Informatics (MGI)

The first figure shows the results of the survey. You see in each data set the mean and median number of traits affected by mutations on a given gene, as well as the distribution of effects. Two conclusions are immediately evident, 1) most genes have a relationship only to a small number of traits, 2) very few genes have a relationship to many traits. You also see the percentages of genes impacted by pleiotropy are rather small. This seems to immediately take off the table simplifying assumptions of a mutant variant producing changes across the full range of traits in a complex organism. Additionally the effects do not seem to exhibit a uniform distribution; rather, they’re skewed toward genes which are minimally or trivially pleiotropic. From the text:

Our genome-wide results echo recent small-scale observations from fish and mouse quantitative trait locus (QTL) studies…and an inference from protein sequence evolution…and reveal a general pattern of low pleiotropy in eukaryotes, which is in sharp contrast to some commonly used theoretical models…that assume universal pleiotropy (i.e., every gene affects every trait)

So if the theoretical models are wrong, what’s right? In this paper the authors argue that it seems as if pleiotropy has a modular structure. That is, mutations tend to have impacts across sets of correlated traits, not across a random distribution of traits. This is important when we consider the fitness implications of mutations, for if the impacts were not modular but randomly distributed, the putative genetic correlations would more likely serve as dampeners on directional change in trait value.

Figure 2 shows the high degree of modularity in their data sets:


Now that we’ve established that mutations tend to have clustered effects, what about their distribution? Fisher’s original model postulated a uniform distribution. The first data set, the morphological characteristics of baker’s yeast, had quantitative metrics. Using the results from 279 morphological traits they rejected the assumption of a uniform distribution. In fact the distribution was closer to normal, with a central tendency and a variance about the mode. Second, they found that standard deviations of effect sizes varied quite a bit as well. Many statistical models assume invariant standard deviations, so it is not surprising that that was the initial assumption, but I doubt many will be that surprised that the assumption turns out not to be valid. The question is: does this matter?

Yes. Within the parameter space being explored one can calculate distances which we can use to measure the effect of mutations. Panels C to F show the distances as a function of pleiotropic effect. The left panels are Euclidean distances while the right panels are Manhattan distances. The first two panels show the outcomes from the parameter values generated from their data sets. The second two panels use randomly generated effect sizes assuming a normal distribution. The last two panels use randomly generated effect sizes, and, assume a constant standard deviation (as opposed to the empirical distribution of standard deviations which varied).
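If the two distance measures are unfamiliar, here is a minimal sketch of what each one does with a single gene's vector of per-trait effects. The function names and the effect values are hypothetical, made up for illustration, and are not taken from the paper's data:

```python
import math


def euclidean_effect(effects):
    """Total effect as straight-line distance from the origin in trait space."""
    return math.sqrt(sum(e * e for e in effects))


def manhattan_effect(effects):
    """Total effect as the sum of absolute per-trait effects."""
    return sum(abs(e) for e in effects)


# Hypothetical standardized effects of one gene deletion on two traits.
effects = [3.0, -4.0]
print("Euclidean:", euclidean_effect(effects))
print("Manhattan:", manhattan_effect(effects))
```

The Manhattan total is never smaller than the Euclidean one, and the gap widens as a gene spreads its effect over more traits, which is why it is worth looking at both.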

To connect these empirical results back to the theoretical models: there are particular scaling parameters, the values of which the earlier models assumed, but which can now be calculated from the real data sets. It turns out that the empirical scaling parameter values differ rather significantly from the assumed parameter values, and this changes the inferences one generates from the theoretical models. The empirically calculated value is b = 0.612, where b is an exponent on the right hand side of the equation which generates the distances within the parameter space. From the text: “the invariant total effect model…assumes a constant total effect size (b = 0), whereas the Euclidean superposition model…assumes a constant effect size per affected trait (b = 0.5).” Instead of looking at the number value, note what each value means verbally. What they found in the empirical data was that there was a variable effect size per affected trait. In this paper the authors found larger per-trait effects for genes affecting more traits, and this seems to be a function of the fact that b > 0.5; with a normal distribution of effect sizes and a variance in the standard deviation of effect sizes.
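Here is how I read those three values of b, as a rough sketch rather than the paper's actual fitting procedure. I am assuming the total Euclidean effect of a gene scales as a times n to the power b, where n is the number of traits it affects, and that the effect is spread equally over those traits so the per-trait effect is the total divided by the square root of n. The constant a and the equal-spread simplification are my assumptions:

```python
def total_effect(n, b, a=1.0):
    """Assumed scaling: total Euclidean effect of a gene affecting n traits."""
    return a * n ** b


def per_trait_effect(n, b, a=1.0):
    """Per-trait effect if the total is spread equally over n traits,
    which under Euclidean distance means dividing by sqrt(n)."""
    return total_effect(n, b, a) / n ** 0.5


models = [
    ("invariant total effect", 0.0),
    ("Euclidean superposition", 0.5),
    ("empirical estimate", 0.612),
]
for label, b in models:
    row = [round(per_trait_effect(n, b), 3) for n in (1, 10, 100)]
    print(f"{label} (b = {b}): per-trait effect at n = 1, 10, 100 -> {row}")
```

With b = 0 the per-trait effect shrinks as more traits are affected, with b = 0.5 it stays flat, and with b = 0.612 it grows, to roughly 1.7 times the single-trait value at n = 100 under these assumptions.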

This all leads us back to the big picture question: is there a cost of complexity? Substituting the real parameters back into the theoretical framework originated by Fisher, and extended by H. Allen Orr and others, they find that the cost of complexity disappears. Mutations do not affect all traits, so more complex organisms are not disproportionately impacted by pleiotropic mutations. Not only that, the modularity of pleiotropy likely decreases the risk of opposing fitness implications due to a mutation, since similar traits are more likely to be similarly affected in fitness. These insights are summarized in the last figure:


The one to really focus on is panel A. As you can see there is a sweet spot in complexity when it comes to the rate of adaptation. Contra earlier models there isn’t a monotonic decrease in the rate of adaptation as a function of complexity, but rather an increase up to an equipoise, before a subsequent decrease. At least within the empirically validated range of the scaling exponent. This is important because we see complex organisms all around us. When theory is at variance with the observational reality we are left to wonder what the utility of theory is (here’s looking at you, economists!). By plugging empirical results back into the theory we now have a richer and more robust model. I will let the authors finish:

First, the generally low pleiotropy means that even mutations in organisms as complex as mammals do not normally affect many traits simultaneously. Second, high modularity reduces the probability that a random mutation is deleterious, because the mutation is likely to affect a set of related traits in the same direction rather than a set of unrelated traits in random directions…These two properties substantially lower the effective complexity of an organism. Third, the greater per-trait effect size for more pleiotropic mutations (i.e., b > 0.5) causes a greater probability of fixation and a larger amount of fitness gain when a beneficial mutation occurs in a more complex organism than in a less complex organism. These effects, counteracting lower frequencies of beneficial mutations in more complex organisms…result in intermediate levels of effective complexity having the highest rate of adaptation. Together, they explain why complex organisms could have evolved despite the cost of complexity. Because organisms of intermediate levels of effective complexity have greater adaptation rates than organisms of low levels of effective complexity due to the scaling property of pleiotropy, pleiotropy may have promoted the evolution of complexity. Whether the intriguing finding that the empirically observed scaling exponent b falls in a narrow range that offers the maximal optimal complexity is the result of natural selection for evolvability or a by-product of other evolutionary processes…requires further exploration.

Citation: Wang Z, Liao BY, & Zhang J (2010). Genomic patterns of pleiotropy and the evolution of complexity. Proceedings of the National Academy of Sciences of the United States of America PMID: 20876104

Image credit: Moussa Direct Ltd., http://evolutionarysystemsbiology.org

September 28, 2010

To gain pallor is easier than losing it


John Hawks illustrates what can be gained at the intersection of old data and analysis and new knowledge in Quote: Boyd on New World pigmentation clines:

I’m using some statistics out of William Boyd’s 1956 printing of Genetics and the Races of Man[1]. It gives a good accounting of blood group data known more than fifty years ago, which I’m using to illustrate my intro lectures. Meanwhile, there are some interesting passages, from the standpoint of today’s knowledge of the human genome and its variation.

On skin pigmentation – this is the earliest statement I’ve run across of the argument that the New World pigmentation cline is shallower than the Old World cline because of the relative recency of occupation….

Looking at what was said about pigmentation generations ago is of interest because it’s a trait which in many ways we have pegged. See Molecular genetics of human pigmentation diversity. Why humans vary in pigmentation in a deep ultimate sense is still an issue of some contention, but how they do so, and when the differences came about, are questions which are now modestly well understood. We know most of the genetic variants which produce between population variation. We also know that East and West Eurasians seem to have been subject to independent depigmentation events. We also know that some of the depigmentation was relatively recent, probably after the Last Glacial Maximum, and possibly as late as the advent of agriculture.

As for the New World cline, which is clearly shallower than that of the Old World, the chart below from Signatures of positive selection in genes associated with human skin pigmentation as revealed from analyses of single nucleotide polymorphisms is useful:

What you’re seeing here are patterns of relationships by population when it comes to the select subset of genes which we know are implicated in between population variation in pigmentation. The peoples of Melanesia are arguably the darkest skinned peoples outside of Africa (and perhaps India), and interestingly they are closer to Africans than any other non-African population. But in total genome content they’re more distant from Africans than other non-African populations, excluding the peoples of the New World.

This disjunction between phylogenetic relationships when looking at broad swaths of the genome, as opposed to constraining the analysis to the half a dozen or so genes which specifically encode between population differences on a specific trait, is indicative of selection. In this case, probably functional constraint on the genetic architecture. From the reading I’ve done on skin pigmentation genetics there is an ancestral “consensus sequence” on these genes which results in dark complexions. In contrast, as has been extensively documented over the last few years, there are different ways to be light skinned. In fact, the Neandertals which have been sequenced at those loci of interest also turn out to have a different genetic variant than modern humans.

How to explain this? I think here we can go back to our first course in genetics in undergrad: it is easier to lose function than gain function. The best current estimate is that on the order of one million years ago our species lost its fur, and developed dark skin. And it doesn’t look like we’ve reinvented the wheel since that time. All of the peoples termed “black” across the world, from India, to Australasia, to Africa, are dark because of that ancestral genetic innovation. In contrast, deleterious mutations which “break” the function of the genes which gave some of us an ebony complexion occur relatively frequently, and seem to have resulted in lighter skinned groups in more northerly climes. It turns out that some of the pigmentation genes which are implicated in between population variance in complexion were actually originally discovered because of their role in albinism.

So how does this relate to the New World? I think the difficulty in gaining function once it has been lost explains why the people of Peru or the Amazon are not as dark skinned as those of Africa, Melanesia, or South Asia. They haven’t had enough time to regain function which they lost as H. sapiens traversed northern Eurasia. So there you have it. A nice little illustration of how the genetics taught to 18 year olds can be leveraged by the insights of modern genomics and biological anthropology! In the end, nature is one.

Image Credit: Dennis O’Neil

September 23, 2010

The city that kills you makes you strong!

Over the past day I’ve seen reports in the media of a new paper which claims that long-term urbanization in a region is strongly correlated with genetic variants for disease resistance. I managed to find the paper on Evolution’s website as an accepted manuscript, ANCIENT URBANISATION PREDICTS GENETIC RESISTANCE TO TUBERCULOSIS:

A link between urban living and disease is seen in recent and historical records, but the presence of this association in prehistory has been difficult to assess. If the transition to urbanisation does result in an increase in disease-based mortality, we might expect to see evidence of increased disease resistance in longer-term urbanised populations, as the result of natural selection. To test this, we determined the frequency of an allele (SLC11A1 1729 + 55del4) associated with natural resistance to intra-cellular pathogens such as tuberculosis and leprosy. We found a highly significant correlation with duration of urban settlement – populations with a long history of living in towns are better adapted to resisting these infections. This correlation remains strong when we correct for auto-correlation in allele frequencies due to shared population history. Our results therefore support the interpretation that infectious disease loads became an increasingly important cause of human mortality after the advent of urbanisation, highlighting the importance of population density in determining human health and the genetic structure of human populations.

In some ways this seems plausible. There are a priori reasons why we’d expect to see a great deal of evolutionary change in regions of the genome correlated with variations in immune response. Diseases are one of the most likely reasons for why sex exists in complex multicellular species; sex allows a slow-reproducing population to bend with the rapid-fire punches of their pathogens by shuffling their defenses constantly. The results from recent work mapping patterns of variation in relation to natural selection generally indicate that immune related regions show plenty of signs of adaptation. No surprise: a “Red Queen” model whereby pathogens and their hosts constantly co-evolve would imply that immunologically relevant genes would never be at equilibrium frequencies for long, so we’d have a good shot at catching “selective sweeps” on some of the immune loci.

So how do cities play into this picture? I suspect that the picture is more complicated than the presentation in the paper, though I believe that the authors were constrained by considerations of space from evaluating all possibilities in full depth. There are two facts which I think are critical to understanding the pattern of variation here:

- All pre-modern societies were predominantly rural demographically. The difference between an “urban civilization” like Rome and a non-urban one such as Dark Age Ireland was that ~25% of the residents of the Roman Empire lived in urban areas (generously defined) while ~0% of Dark Age Irish lived in urban areas. Rome is generally considered to be a very urban pre-modern society, perhaps the most urban large-scale society before the 17th century.

- I also believe that ancient cities were population sinks. People simply did not replace themselves and cities only perpetuated their massive scale by serving as magnets for excess population from the rural hinterlands. Without appropriate political structures to maintain the population and generate incentives for an inflow migration, ancient cities withered away very fast (Rome’s population went from hundreds of thousands to tens of thousands in the 100 years from the early 6th to early 7th century because of political instability).

Before I go on much, let’s address the results presented in the paper. Below you see the frequencies of the allele which is more protective against tuberculosis in tabular form and on a map, as well as the logistic regressions which show the relationship between time since urbanization and the allele frequencies. Please note that they corrected for genetic relatedness in their regression, so the correlation isn’t just due to population stratification on a world wide scale.

Since the allele which confers resistance is at a high frequency everywhere the difference is between those populations where the genotypes are predominantly in a homozygote state (e.g., Iranians), and those where only around half are resistant homozygotes (e.g., Sami). The authors note that because of the high frequency everywhere, including populations with no history of agriculture such as the Sami, one can’t posit a model where positive selection drove the disease resistant alleles from 0 to fixation. Rather, it perturbed the equilibrium frequency. Using the Tajima’s D statistic they do find evidence of balancing selection in both East Asians and Europeans. This would be in keeping with frequency-dependent models of pathogen-host co-evolution.
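A quick back-of-the-envelope calculation shows why "only around half are resistant homozygotes" is still consistent with the allele being at high frequency. The sketch below assumes Hardy-Weinberg proportions, which is my simplification (balancing selection would distort them somewhat), and the inputs 0.5 and 0.9 are round illustrative numbers, not frequencies reported in the paper.

```python
import math


def genotype_frequencies(p):
    """Hardy-Weinberg genotype proportions for a resistance allele at frequency p."""
    q = 1.0 - p
    return {
        "resistant_homozygote": p * p,
        "heterozygote": 2.0 * p * q,
        "susceptible_homozygote": q * q,
    }


def allele_frequency_from_homozygotes(homozygote_fraction):
    """Allele frequency implied by a given fraction of resistant homozygotes,
    assuming Hardy-Weinberg proportions."""
    return math.sqrt(homozygote_fraction)


if __name__ == "__main__":
    # Illustrative inputs only: about half homozygous vs. mostly homozygous.
    for fraction in (0.5, 0.9):
        p = allele_frequency_from_homozygotes(fraction)
        print(f"homozygotes={fraction:.2f} -> allele frequency ~ {p:.2f}")
```

Under these assumptions a population with half its members homozygous still carries the allele at a frequency of roughly 0.7, so the contrast between populations is one of degree around a high baseline rather than presence versus absence.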

As I said before there are strong reasons to assume that natural selection reshaped the genomes of populations over the past 10,000 years. It really isn’t if, it’s how and what. The authors present some evidence for a particular variant of the gene SLC11A1 being the target of natural selection. To really accept this specific case I think we’ll need some follow up research. Rather, I want to focus on the narrative which is being pushed in the media that cities were the adaptive environments which really drove the shift in allele frequencies. I don’t think this was the case. I think the cities were essential, but I don’t think ancient urbanites left many descendants. Instead, I think cities, or urbanization, is first and foremost a critical gauge of population density and social complexity. Second, I believe that cities serve as facilitators and incubators for plague. In other words both urbanization and disease adaptation are derived from greater population density, while urbanization also serves a catalytic role in the spread of disease. This could explain the strong correlation we see.

I believe that the Eurasians who may have been subject to natural selection due to the rise of infectious disease are almost all the descendants by and large of ancient rural peasants, or, their rentier elites. These peasants were subject to much greater disease stress even without living in urban areas than hunter-gatherers and pastoralists because their population densities were higher, and quite often they were living a greater proportion of their lives snugly against the Malthusian lid. Hunter-gatherers may have been healthier on average because of a more diversified diet as well as lower population densities due to endemic warfare. In contrast, agriculturalists lived closely packed together and were far more numerous than hunter-gatherers, and, their immune systems were probably less robust because of the shift away from a mix of meat, nuts and vegetables, to mostly grains.

A downstream consequence of agriculture was the rise of cities through the intermediate result of much higher population densities. I accept the literary depiction of ancient cities as filthy and unhealthful. There’s almost certainly a reason that pre-modern elites idealized rustic life, and had country villas. Additionally, though I assume that both the rural peasantry and urban proletariat led miserable lives, I believe that in terms of reproductive fitness the former were superior to the latter. From what I have read city life only became healthier than rural life in the United States in 1900, in large part due to a massive public health campaign triggered by fear of immigrant contagion. The high mortality rates and low reproductive fitness of urbanites implies that evolutionarily the more important role of cities was as nexus points for trade and the spread of disease. The book Justinian’s Flea chronicles the pandemic in the Roman Empire in the 6th century, in particular its origin in Constantinople from points east. We’re well aware today that a globalized world means that there’s an interconnectedness which can bring us strength through comparative advantage, but also catastrophe through contagion. This is a general dynamic, not simply one applicable to disease, but in the world before modern medicine the utility of trade networks for pathogens would have been of great importance.

One can imagine societies through the organismic lens as if they were cyclical wind-up toys. In the initial stage of expansion and integration political stability and concentration of power results in a peace which allows for the increase in population as more land inputs are thrown into primary production. Eventually diminishing returns kick in and there’s no more land, so the labor squeezes itself more tightly on fixed land endowments. Their median physiological fitness declines as the pie gets cut into more and more pieces. All the while these massive numbers of peasants serve as the source of revenue for extractive elites, who found and patronize cities where they can signal their status and concentrate their wealth. Most pre-modern cities, like Rome and Constantinople, would have been economic parasites, depending on rents and plunder. As a sidelight cities such as Constantinople which were placed at transportation hubs would also become the focal points for trade, especially if they could be termini themselves for the luxury good trade which was dependent on the demand from rentier elites in residence in the metropolis. Finally, these cities would also be magnets for masses of armies because of the inevitability of sieges.

Eventually the combination of factors would result in the outbreak of plague. Social order would collapse, people would flee the cities, and populations would drop as the tightly run ship on the Malthusian margin ran aground. As the population dropped median health and wealth would return, and susceptibility to plague would decrease. And then the cycle of expansion and integration would start anew.

This is I believe the story of the rise and fall of urban societies which reshaped the genomes of people who lived across much of Eurasia. It isn’t a tale of urbanites, rather, urbanites for most of history have almost certainly been epiphenomena in a genetic sense. They’re the excess rural population which finds its way to the polis. Because of the squalor and lack of public health the lot of the urbanite was to consign their genes to oblivion. But for this deal with the devil the urban man had an opportunity to become immortal, and live on in human memory. It is their names which echo down through history, and roll off the tongues of the descendants of the peasants who have long ago forgotten their own genetic forebears.

Citation: Barnes, I., Duda, A., Pybus, O., & Thomas, M. G. (2010). ANCIENT URBANISATION PREDICTS GENETIC RESISTANCE TO TUBERCULOSIS. Evolution DOI: 10.1111/j.1558-5646.2010.01132.x

Image Credit: Marie-Lan Nguyen, Nikater

Simple rules for inclusive fitness

With the recent huge furor over the utility of kin selection I’ve been keeping a closer eye on the literature on inclusive fitness. The reason W. D. Hamilton’s original papers in The Journal of Theoretical Biology are highly cited is not some conspiracy, rather, they’re a powerful framework in which one can understand the evolution of social behavior. They are a logic whose basis is firmly rooted in the world of how inheritance and behavior play out concretely. But because of its formality and spareness inclusive fitness has also given rise to a large literature derived from simulations “in silico,” that is, evolutionary experiments in the digital domain.

One can elucidate inclusive fitness through Hamilton’s Rule, but it is also rather easy to exposit verbally via a “gene’s eye view.” Imagine for example a dominant mutation in a diploid organism which produces the behavior of altruism toward near kin. Initially the altruist will have offspring whose probability of carrying the dominant mutation is 50%, because there is also the probability that they will carry the ancestral non-altruistic variant. Imagine an altruistic behavior which incurs a small, but not trivial, cost to the individual performing the behavior, and a large gain to the individual who is on the receiving end of the altruism. The logic of favoring near kin is such that in the initial generation the parent which behaves altruistically toward near kin is increasing their own “inclusive fitness” because their offspring share 50% of their genes identical-by-descent (in the case of a diploid sexually reproducing organism). But from a gene’s eye perspective what is really occurring is that there is a 50% chance that the gene which fosters altruism is promoting the fitness of a copy of itself. So inclusive fitness operates by modulating the parameters of costs and gains to focal individuals as a function of their relatedness, but it is the genes, the “replicators,” which persist immortally across the generations. We “vehicles” are just the ocean through which genes sail.
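For readers who prefer the rule itself, Hamilton’s Rule says an altruistic act is favored when relatedness times the benefit to the recipient exceeds the cost to the actor (r × b > c). Here is a minimal sketch; the benefit and cost values are arbitrary numbers I picked to stand in for the "large gain" and "small cost" of the toy example above.

```python
def altruism_favoured(relatedness, benefit, cost):
    """Hamilton's Rule: altruism is favoured when r * b > c."""
    return relatedness * benefit > cost


if __name__ == "__main__":
    # Parent helping offspring (r = 0.5): large gain to recipient, small cost to actor.
    print(altruism_favoured(relatedness=0.5, benefit=10.0, cost=1.0))    # True
    # Same act directed at a first cousin (r = 0.125) with a more modest gain.
    print(altruism_favoured(relatedness=0.125, benefit=4.0, cost=1.0))   # False
```

The gene’s eye reading is the same inequality: r is the chance that the recipient carries a copy of the altruism gene, so r × b is the expected benefit to copies of that gene.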

But like Darwin’s theory of evolution through natural selection the fruit of these logics is in the details. A new paper in The Proceedings of the Royal Society puts the focus on different means by which inclusive fitness may be maximized. In particular, the paper offers up a reason for why what Richard Dawkins termed the “green-beard effect” is not more common. Selective pressures for accurate altruism targeting: evidence from digital evolution for difficult-to-test aspects of inclusive fitness theory:

Inclusive fitness theory predicts that natural selection will favour altruist genes that are more accurate in targeting altruism only to copies of themselves. In this paper, we provide evidence from digital evolution in support of this prediction by competing multiple altruist-targeting mechanisms that vary in their accuracy in determining whether a potential target for altruism carries a copy of the altruist gene. We compete altruism-targeting mechanisms based on (i) kinship (kin targeting), (ii) genetic similarity at a level greater than that expected of kin (similarity targeting), and (iii) perfect knowledge of the presence of an altruist gene (green beard targeting). Natural selection always favoured the most accurate targeting mechanism available. Our investigations also revealed that evolution did not increase the altruism level when all green beard altruists used the same phenotypic marker. The green beard altruism levels stably increased only when mutations that changed the altruism level also changed the marker (e.g. beard colour), such that beard colour reliably indicated the altruism level. For kin- and similarity-targeting mechanisms, we found that evolution was able to stably adjust altruism levels. Our results confirm that natural selection favours altruist genes that are increasingly accurate in targeting altruism to only their copies. Our work also emphasizes that the concept of targeting accuracy must include both the presence of an altruist gene and the level of altruism it produces.

Using the Avida software platform the researchers ran trials of the evolution of populations of artificial life which varied in fitness, coefficient of relatedness, as well as their phenotypes. In one set of trials the organisms operated through conventional means of kin selection, whereby the heuristic was to favor those to whom an individual was closely related. This will result in a fair amount of “false positives,” as everyone knows that near kin can be selfish and “cheat.” Remember that in the toy example above 50% of the offspring who will gain from altruism will themselves lack the altruism gene. A second set of organisms look to total genetic similarity. This is the sort of thing which humans could engage in if they had immediate knowledge of the genomic sequences of those around them. Even among near relatives genetic similarity is only correlated with, not perfectly correspondent with, coefficients of relatedness. Some full siblings may share more identity-by-descent than others. This is trivially obvious in the initial illustration, as there will be a great deal of intra-familial variance on the gene which produces altruism. To focus on the dynamics of the specific gene, the authors also looked at a green-beard effect, whereby there is a correlation between altruism, a gene, and a visible phenotype. In other words, you know altruists by a correlated physical trait. If the correlation between a phenotype and a genotype is close enough you don’t need to do a typing of their genome because you know the state of their genotype, and so have expectations as to whether they’re truly altruists or not. Presumably using the green-beard effect one could side-step the usage of kinship or relatedness as a proxy. In many cases those more distantly related could be more phenotypically similar on the traits of interest than those who are genetically closer.
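The "false positive" point can be illustrated with a toy calculation. This is not the Avida setup used in the paper; it is a hypothetical sketch of the earlier toy example, in which a heterozygous altruist’s offspring each have a 50% chance of carrying the altruism allele. The function name and the two strategy labels are mine.

```python
import random


def fraction_of_aid_reaching_carriers(strategy, n_offspring=10000, seed=1):
    """Share of altruistic acts that land on carriers of the altruism allele.

    Toy model (an assumption, not the paper's simulation): each offspring of
    a heterozygous altruist carries the allele with probability 0.5.
    'kin' aids every offspring; 'green_beard' aids only visible carriers.
    """
    rng = random.Random(seed)
    carries_allele = [rng.random() < 0.5 for _ in range(n_offspring)]
    if strategy == "kin":
        recipients = carries_allele
    elif strategy == "green_beard":
        recipients = [c for c in carries_allele if c]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return sum(recipients) / len(recipients)


if __name__ == "__main__":
    for strategy in ("kin", "green_beard"):
        share = fraction_of_aid_reaching_carriers(strategy)
        print(f"{strategy:12s} share of aid reaching carriers: {share:.2f}")
```

Kin targeting wastes about half its aid on non-carriers in this toy case, while a perfectly reliable green beard wastes none, which is the intuition behind the researchers’ initial expectations.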

What did they find? Figure 1 shows the outcomes of various sets of trials:


Their expectations were that in regards to the evolution of altruism kin selection should be inferior to genetic similarity which should be inferior to the green-beard effect. The reasoning is straightforward: as you progress across this sequence of dynamics the false positive rate of aiding those without the altruism conferring gene should decrease. That is not what they found, at least not initially.

What was happening is that they were focusing on the wrong parameters in framing their expectations. That’s why you run the model: human intuition often fails. Green-bearding is very precise as a dichotomous indicator of whether an individual carries a particular gene identical-by-descent, but mutation could produce variation in levels of altruism. What they found was that when green-bearding was dichotomous the levels of altruism tended to converge upon a lower equilibrium as individuals were focused on being just altruistic enough to count as real altruists and so gain advantages from those who were more generous. A concrete example of this would be an “affinity con”. An individual is a member of a group, and they leverage the trust which comes from being a member of the group to exploit the group. Baked into the cake of the original model is that altruists who also had a green-beard had to have donated at least once, and that is the target which green-beards converged upon. In contrast the strategy of genetic similarity resulted in greater donations, and because the model had non-zero sum dynamics (altruism increased everyone’s fitness greatly, though cheaters could exploit this to “free-ride”) the strategy which maximized donations was more successful. The researchers made green-bearding more competitive by simply increasing the donation threshold to match the equilibrium which emerged with the other strategies. So making all things equal the intuition about green-bearding was then vindicated.

Instead of setting a specific threshold there was another way that green-bearding could beat the other strategies to maximize inclusive fitness: vary the green-bearding trait and altruism continuously in a correlated fashion. In other words, the greener the beard, the more altruistic. This is a classic way that one could beat the cheaters: develop detection and discernment mechanisms. Why doesn’t this matter for the two other more “primitive” techniques? Kin selection and genetic similarity are more robust because they’re not fine-tuned: organisms with similar genome content are likely to have similar altruism levels. The genetic relatedness of altruists in green-bearding populations is going to be lower because they’re looking for a very specific genotype and its correlation with a phenotype. Green-bearding is more precise, but it’s also somewhat more complicated, and as a more precisely engineered solution it may not always be as robust.

And that necessity of fine-tuned intelligence in design may be why green-bearding is not more common. The authors note that in theory one could imagine mutations leading to concomitant variations of the magnitude of green-bearding and altruism in the same direction, but in a real evolutionary genetic context with normal parameters of mutation and effective population sizes this may not be plausible. Many people would argue that evolution is littered with kludges because natural selection makes recourse to “quick and dirty” solutions which are simple but effective, and kin selection and genetic similarity are closer to that than green-bearding. In theory selection may lead to a world of green-beards with infinite population sizes and generations, and persistent and consistent selection, but the world may be too protean for this optimal equilibrium to ever arise. So until then, we’ll make do with social evolution’s duct-tape: “I against my brother; my brother and I against my cousin; I, my brother, and my cousin against the stranger.”

Citation: Clune J, Goldsby HJ, Ofria C, & Pennock RT (2010). Selective pressures for accurate altruism targeting: evidence from digital evolution for difficult-to-test aspects of inclusive fitness theory. Proceedings. Biological sciences / The Royal Society PMID: 20843843

Image Credit: Burningrey

September 17, 2010

Of Iran, Turan, and Turks

There’s a new paper out in The European Journal of Human Genetics which is of great interest because it surveys the genetic and linguistic affinities of two dozen ethno-linguistic groups from the three Central Asian nations of Uzbekistan, Kyrgyzstan, and Tajikistan. This is what the Greeks referred to as Transoxiana, and the Persians as Turan. Originally inhabited by peoples with close cultural affinities with those of Persia, indeed, likely the root of the peoples of Persia, by the historical period Turan developed a distinctive identity as a frontier or march. It was in Turan where the Turk met the Iranian (a class which included non-Persian groups, such as the Sogdians), from the pre-Islamic Sassanians down to the present day. It is a region of the world which has a very ancient urban culture, cities such as Merv, as well as peoples that were only recently nomads, forcibly made sedentary by the Soviet regime.

To add another twist to the picture many of the ethno-linguistic groups which we are familiar with today and which serve as the cores of the new Central Asian nations only came into being within the last few centuries, with a particular “push” from Russian Imperial and Soviet ethnologists who were tasked with fleshing out national identities with which the center could negotiate. A “Tajik” is after all simply part of the Persian-speaking residual population of Central Asia, spreading down into Afghanistan. The carving out of an independent Tajikistan out of the Central Asian landscape is as much a creation of the modern age as the state of Israel. The “Uzbek” identity was once simply that of the ruling caste of Transoxiana who came to power after the decline of the Timurids. Today it is an appellation which brackets the settled Turkic speaking peoples of Uzbekistan and beyond.

Into this near Gordian knot of history and ideology walk the naive and well-meaning geneticists. There is no great objection one can make to the genetics within the paper, but the historical framework and some of the assertions are peculiar and tendentious indeed. It’s a problem which starts within the abstract. In the heartland of Eurasia: the multilocus genetic landscape of Central Asian populations:

Located in the Eurasian heartland, Central Asia has played a major role in both the early spread of modern humans out of Africa and the more recent settlements of differentiated populations across Eurasia. A detailed knowledge of the peopling in this vast region would therefore greatly improve our understanding of range expansions, colonizations and recurrent migrations, including the impact of the historical expansion of eastern nomadic groups that occurred in Central Asia. However, despite its presumable importance, little is known about the level and the distribution of genetic variation in this region. We genotyped 26 Indo-Iranian- and Turkic-speaking populations, belonging to six different ethnic groups, at 27 autosomal microsatellite loci. The analysis of genetic variation reveals that Central Asian diversity is mainly shaped by linguistic affiliation, with Turkic-speaking populations forming a cluster more closely related to East-Asian populations and Indo-Iranian speakers forming a cluster closer to Western Eurasians. The scattered position of Uzbeks across Turkic- and Indo-Iranian-speaking populations may reflect their origins from the union of different tribes. We propose that the complex genetic landscape of Central Asian populations results from the movements of eastern, Turkic-speaking groups during historical times, into a long-lasting group of settled populations, which may be represented nowadays by Tajiks and Turkmen. Contrary to what is generally thought, our results suggest that the recurrent expansions of eastern nomadic groups did not result in the complete replacement of local populations, but rather into partial admixture.

In my initial comment on this paper in a link round-up I wondered what the authors were thinking making such a comment: anyone who knows Central Asians would see on their faces that the Turks did not completely replace the local populations. The image above is of an Uzbek man, who does not exhibit any visible “Mongolian” features. This is not the norm, but is not unheard of. Even populations which are presumed to have less Iranian admixture, such as the Kazakhs, exhibit a range of physical types. It would be one thing if this reference was an isolated peculiarity, but there are other comments within the paper which indicate to me that the research group’s familiarity with the non-genetic literature is cursory at best. They refer to Huns as having “brought the East-Asian anthropological phenotype to Central Asia.” There is no clear foundation for this assertion. Unfortunately historians do not have a clear idea what the ethno-linguistic character of the Huns was. By the time Roman observers encountered them the Hunnic horde seems to have been predominantly German, with an Iranian (Alan) secondary component, the Huns themselves being a small elite (Attila’s name itself may be Gothic). In light of subsequent eruptions into Europe of Turkic and Ugric nomads it is easy to slot the Huns into this exotic category, but the primary literature makes it clear that you can’t ascertain their ethnic character from the contemporary sources (the “White Huns” of Central and South Asia had no real connection to the Huns of Europe).

Near the end of the paper they say something really peculiar: “The Westernized view of westward invasions usually emphasizes the extreme violence and cruelty of the hordes led by Attila the Hun (AD 406–453), or that from the Mongolian empire led by Genghis Khan. However, our results somehow challenge this view and rather suggest that these more recent expansions did not lead to the massacre and complete replacement of the locally settled populations….” It is true that European observers of the Mongol expansion did not have a sanguine attitude. But the idea that Mongols were genocidal exterminationists really comes to us via the Islamic historians, for whom the Mongol conquests were totally shocking and a literal world-turned-upside-down moment. The Mongol conquests did seem to result in a decline in population between Mesopotamia and Transoxiana. Whole cities in Central Asia were depopulated. There is an assumption that the Mongol conquests mark the turning point where Central Asia passed from being a predominantly Iranian world with a Turkic military elite (which was to be the nature of Iran proper until the 20th century) to a Turkic world with a large Persian minority. Though the military conquests of the Mongols were important punctuating events, I do not believe that scholars today would assume that they produced an ethnic shift in toto. On the contrary, the null hypothesis is generally against migrationism.

With those preliminaries out of the way, what’s going on with the genetics? Below are the less interesting tables & figures. The first is important because it has the abbreviations which they use. Basically anything that starts with a “T” is an Indo-Iranian Tajik group, and everything else is Turkic, except LUzn and LUza, who are Indo-Iranian Uzbek nationals, but I presume would be ethnic Tajiks in Uzbekistan (this stuff is really confusing in regards to labels, because as I said the national categories are to some extent ad hoc impositions on more ancient identities which don’t always follow the European language = nation formula). The second image is a figure which shows the sampling of locations, as well as pie charts with ancestral quanta. The third image is a table which shows that Indo-Iranians are genetically more varied than Turks. The fourth is a STRUCTURE plot which I reedited to zoom in on peoples of interest for this study, as well as removing some of the lower K’s. Remember that each K is a putative ancestral population. As Dienekes notes, since they used only 27 microsatellite markers across their 26 populations, the plot may inflate minor ancestral contributions.

Of more interest is the correspondence analysis, which is conceptually similar to principal component analysis. The variate inputs are allele counts. I’ve obviously reedited the figure a bit, and added some labels (yeah, I ended up thinking that rotating after I’d added some labels was best, sorry). Note the clear color-coding of Turkic vs. Iranian Central Asian groups.


There’s a clear separation linguistically between Iranian speaking and Turkic speaking groups in Central Asia. Some of the Turkic groups are close to Iranian groups, closer than to other Turkic groups, but still the two broad sets have a coherent identity. Undergirding the linguistic variation is classical geographic variation. The eastern Turkic groups seem the least impacted by the Iranian substrate which was dominant before the arrival of Turks, while the Turcoman group sampled from western Uzbekistan seems to have been the most genetically “Iranized.” In a world wide context the central position of Central Asians is not surprising. Interestingly the Iranian groups of Central Asia seem to overlap rather well with the Indo-Iranian groups from the HGDP data set. In contrast, the Turkic groups are distributed along a linear axis from East Asians to the Iranian cluster. This is the same pattern evident among African Americans as individuals. It’s a two-way admixture, with different dosage degrees by population as a function of history and geography (I presume you’d see the same pattern if it was broken down on individuals with a SNP-chip).
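The "dosage" idea behind that linear axis can be made concrete with a toy calculation. This is a minimal sketch, assuming the simplest two-source model, in which an admixed population's allele frequency at each locus is a weighted average of the two source frequencies. The frequencies and the helper names (`admixture_fraction`, `mix`) are invented for illustration and are not taken from the paper, which used its own estimation method.

```python
def admixture_fraction(admixed, source_a, source_b):
    """Least-squares estimate of the ancestry fraction from source_a.

    Assumes p_admixed = m * p_a + (1 - m) * p_b at every locus.
    """
    num = 0.0
    den = 0.0
    for p_mix, p_a, p_b in zip(admixed, source_a, source_b):
        diff = p_a - p_b
        num += (p_mix - p_b) * diff
        den += diff * diff
    if den == 0:
        raise ValueError("sources are identical; fraction is undefined")
    return num / den


# Hypothetical allele frequencies at four loci (made up for illustration).
east_asian = [0.90, 0.10, 0.70, 0.20]
iranian = [0.30, 0.60, 0.20, 0.80]


def mix(m):
    """Build an admixed population with fraction m of eastern ancestry."""
    return [m * a + (1 - m) * b for a, b in zip(east_asian, iranian)]


for label, m in [("mostly eastern", 0.7), ("mostly western", 0.2)]:
    estimate = admixture_fraction(mix(m), east_asian, iranian)
    print(label, round(estimate, 3))
```

Populations built this way differ only in where they sit on the line between the two sources, which is the pattern the correspondence analysis shows for the Turkic groups.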

Moving to the explicit admixture estimates, the labels leave something to be desired. The shaded area is for Turkic speakers. The very last group, TJY, indicates the Yagnobis of Dushanbe. I happen to know offhand that the Yagnobis are reputed to be descendants of the Sogdians, having preserved their language and Zoroastrian religion relatively late in history before switching to Tajik and Islam. Like many ethno-linguistic relics these people preserved their independent identity after the Arab conquest, which saw the decline of Sogdian influence on the Silk Road, by taking refuge in isolated regions. It is no surprise then that this group shows the least East Asian admixture of all the Iranian samples, as they were isolated from many of the social and historical processes which were operative in Transoxiana after the conquest by the Arabs, and the later pushing in of the zone of Turkic hegemony after the fall of the Samanids.

These admixture estimates definitely put the spotlight on the role of Central Asia as a nexus of sorts. In the archaeology and history it is clear that Central Asia has been affected by peoples of European, South Asian, Middle Eastern, and East Asian origin. Central Asia itself has been the mother of empires, famously the seat of Timur, but also the original base of what later became the Abbasid dynasty. At one point the Caliphate was split between western and eastern factions and there was a possibility that the capital would be relocated from Baghdad to the Central Asian city of Merv! I do not believe that the Arabs had a strong genetic impact, nor was there a large South Asian migration in recent periods into Central Asia. So the admixture estimates adduced for these groups may be due to the natural cline in allele frequencies which are found in different peripheral Eurasian populations. Frequencies which are naturally intermediate in Central Asia. The main caveat is that it is probable that local conditions will vary a great deal. In contrast we have strong reason to suspect that the East Asian component arrived relatively recently with the Turks, and we see that its aspect is most evident among the groups which were nomadic within living memory, the Kazakhs and Kyrgyz. These two ethnicities, which are really compounds of several tribes or “hordes,” were only marginally integrated into sedentary Islamic society where the Tajik element would be prominent (shamanism among many of these tribes only disappeared under the influence of the Islamic missionaries sponsored by the Russian Empire). I think this pattern is reinforced by what we saw in the correspondence analysis, where the Turkic groups exhibited a linear distribution toward East Asia, while the Iranian ones were placed where you’d expect them geographically.

Finally, I want to note that Dienekes observes that using South Asians as a Central Asian population source is strange since South Asia is more appropriately thought of as a demographic sink for Turan. True, but the HGDP populations are strongly biased toward groups with relatively little indigenous South Asian ancestry, with the Sindhi being the only Indo-Aryan speakers within the set. So I think that objection is mitigated by these factors. Rather, the Iranian-speaking Pakistani groups serve as proxies for the original Central Asian Iranian substrate, from which both they and the Tajiks presumably derive.

Moving back to the Turk vs. Iranian distinction, the authors note that the Turkic groups exhibit a strong degree of genetic homogeneity on the Y chromosomal lineages. This points to the possible manner in which the East Asian genetic element spread in Central Asia, not necessarily just through population displacement, but also through polygamy and the high reproductive fitness of particular “super-male” lineages. The children of elite Turkic men who took Iranian wives presumably adopted the culture of their fathers, including the linguistic identity. This may have been particularly easy in Central Asia because they did not have to repudiate their maternal heritage in totality, as Persian culture still had great status and currency. If we partition the ancestry into “East Eurasian” and “West Eurasian” components the Turkic groups have much more of the latter than the Iranian ones have of the former. That stands to reason as the Turks were newcomers, and an elite which the locals would wish to assimilate to if they had the opportunity. In contrast, the shift from Turk to Iranian may have been rarer, and a switch which individuals would wish to avoid since the latter did not have the same level of temporal power. Over ~1,500 years gene flow does occur between the groups, and even the Yagnobis have appreciable East Asian ancestry. Eventually the linguistic differences would probably be dwarfed by the geographical ones, but currently we’re taking a snapshot of a “transient.”

It’s complicated. And one has to be very careful about using terms like “Turk” in a localized context, vs. a more international one. The Turks of Turkey are overwhelmingly derived from the same source populations as their Balkan (because of Rumelian Turks), Iranian, and Armenian neighbors. The decline in East Asian fraction is evident even in this sample, as the Turcomans from western Uzbekistan have the least eastern ancestry of any of the groups. But this paper is an excellent window into a critical geographical hinge of genetic variation and historical tumult (though one must set aside some of their tacked-on historical speculations).

Citation: Martínez-Cruz B, Vitalis R, Ségurel L, Austerlitz F, Georges M, Théry S, Quintana-Murci L, Hegay T, Aldashev A, Nasyrova F, & Heyer E (2010). In the heartland of Eurasia: the multilocus genetic landscape of Central Asian populations. European journal of human genetics : EJHG PMID: 20823912

Image Credit: Wikimedia

September 16, 2010

A fly’s life: adventures in experimental evolution

Natural selection happens. It was hypothesized in copious detail by Charles Darwin, and has been confirmed in the laboratory, through observation, and also by inference via the methods of modern genomics. But science is more than broad brushes. We need to drill down to a more fine-grained level to understand the dynamics with precision and detail, and so generate novel inferences which may then be tested. For example, there are various flavors of natural selection: stabilizing selection, negative selection, and positive directional selection. In the first case natural selection buffets the phenotype about an ideal mean, in the second case deleterious phenotypes and their associated alleles are purged from the genome, and finally, natural selection can also drive a novel trait toward greater prominence, and concomitantly the allelic variants which are associated with the fitter phenotype.

The last case is of particular interest to many because it is often through positive natural selection that evolution as descent with modification occurs. Over time trait values and the nature of traits themselves shift such that a lineage changes its character beyond recognition. This phyletic gradualism and the scale independence of evolutionary process has been challenged, in particular from the domain of developmental biology (albeit not by all, or even most, developmental biologists). But ultimately no one doubts that a classical understanding of evolution as change in allele frequency, often driven by natural selection, is part of the larger puzzle of how the tree of life came to be.

One of the phenomena associated with positive directional evolution is the selective sweep. How a selective sweep occurs, and its consequences, are rather straightforward. A genome consists of a sequence of base pairs (e.g., we have 3 billion base pairs). If a new mutation emerges at a particular base pair, a novel single nucleotide polymorphism (SNP), and that allelic variant is ~10% fitter than the ancestral variant, natural selection could drive up its frequency (the conditionality is due to the fact that in all likelihood it would still go extinct because of the power of stochastic forces when a mutant is at low frequency). So the variant could in theory shift from ~0% (1 out of N, N being the number of individuals in a population, 2N if diploid, and so forth) to ~100%. This would be the fixation of the novel variant, driven by selective dynamics. So what’s the sweep aspect? The sweep in this case refers to the effect of the very rapid rise in frequency of the SNP in question on the adjacent genomic region. What is termed a genetic hitchhiking dynamic results if the sweep occurs rapidly, so that nearby regions of the genome also move to fixation along with the favored SNP. But in a diploid organism with sexual reproduction genetic recombination persistently breaks apart associations across the physical genome. Therefore the span of the sequence of genetic markers nearby a favored SNP which form a haplotype is dependent on the rate of recombination as well as the rate of the rise in frequency of the allele, which is contingent on the strength of selection. A powerful selective sweep has the effect of homogenizing wide regions of the genome flanking the favored mutant; in other words the sweep “cleans” the gene pool of variation as one very long haplotype replaces many shorter haplotypes.

As an example, in the genomes of Northern Europeans the locus LCT is characterized by a very long haplotype, which itself seems to correlate well with the trait of lactase persistence. The implication here is that the lactase persistence conferring variant arose relatively recently, and was swept up to near fixation by positive directional natural selection.
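The caveat that a new mutant will "in all likelihood" still go extinct can be put in rough numbers. The sketch below uses Kimura's diffusion approximation for the fixation probability of a single new copy, which is a standard textbook result rather than anything from the paper under discussion. It assumes additive effects, reads the ~10% advantage as the heterozygote's advantage, and takes the effective population size to equal the number of individuals. The approximation is derived for small selection coefficients, so the figure for s = 0.10 should be treated as a ballpark, and the population size is a made-up value.

```python
import math


def fixation_probability(s, n):
    """Kimura's diffusion approximation for one new mutant copy.

    s: selective advantage of the heterozygote (additive effects assumed).
    n: number of diploid individuals, so the starting frequency is 1/(2n).
    """
    if s == 0:
        return 1.0 / (2 * n)  # neutral case: pure drift
    return (1 - math.exp(-2 * s)) / (1 - math.exp(-4 * n * s))


n = 10_000  # hypothetical population size
print(round(fixation_probability(0.10, n), 3))  # strongly favored mutant
print(fixation_probability(0.0, n))             # neutral mutant
```

Under these assumptions even a variant with a 10% advantage is lost more often than not, though its chances are vastly better than those of a neutral one.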

That’s the broad theory. But as you know, evolution and its subcomponents are more than “just a theory,” they’re a set of models which are amenable to testing, whether through observation, or via controlled laboratory experiments. A new letter to Nature elaborates how exactly selective sweeps play out in Drosophila melanogaster, a classic “model organism.” Interestingly, this is a case of experimental evolution, something we are more familiar with from Richard Lenski’s E. coli work. Genome-wide analysis of a long-term evolution experiment with Drosophila:

Experimental evolution systems allow the genomic study of adaptation, and so far this has been done primarily in asexual systems with small genomes, such as bacteria and yeast…Here we present whole-genome resequencing data from Drosophila melanogaster populations that have experienced over 600 generations of laboratory selection for accelerated development. Flies in these selected populations develop from egg to adult ~20% faster than flies of ancestral control populations, and have evolved a number of other correlated phenotypes. On the basis of 688,520 intermediate-frequency, high-quality single nucleotide polymorphisms, we identify several dozen genomic regions that show strong allele frequency differentiation between a pooled sample of five replicate populations selected for accelerated development and pooled controls. On the basis of resequencing data from a single replicate population with accelerated development, as well as single nucleotide polymorphism data from individual flies from each replicate population, we infer little allele frequency differentiation between replicate populations within a selection treatment. Signatures of selection are qualitatively different than what has been observed in asexual species; in our sexual populations, adaptation is not associated with ‘classic’ sweeps whereby newly arising, unconditionally advantageous mutations become fixed. More parsimonious explanations include ‘incomplete’ sweep models, in which mutations have not had enough time to fix, and ‘soft’ sweep models, in which selection acts on pre-existing, common genetic variants. We conclude that, at least for life history characters such as development time, unconditionally advantageous alleles rarely arise, are associated with small net fitness gains or cannot fix because selection coefficients change over time

Critical to understanding what’s going on here is the distinction they make between ‘classic’ ‘hard sweeps’ and ‘soft sweeps.’ Hard sweeps follow the spare description I outlined above:

1) A new mutant arises in the genetic background

2) Selection favors the mutant

3) The mutant rises in frequency and sweeps to fixation, 0% → 100%, replacing the ancestral variants

In contrast, for a soft sweep:

1) Selection favors a set of minor polymorphisms already segregating in the gene pool

2) These polymorphisms rise in frequency

3) But they may not sweep to fixation

In the first case the signature of natural selection will be clear, distinct, and indubitable. A novel haplotype which has replaced the ancestral variants and produced a wide region of genetic homogeneity as all other allele states are expunged by the sweep will have resulted. That isn’t what they saw at the genomic level.
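The difference in signature can be illustrated with a toy measure of haplotype homozygosity, the chance that two randomly drawn chromosomes carry the same haplotype around the selected site. This is only a sketch: the haplotype frequencies are invented, and the function name is a hypothetical helper, not a statistic used in the paper.

```python
def haplotype_homozygosity(freqs):
    """Chance that two random chromosomes carry the same haplotype."""
    total = sum(freqs)
    return sum((f / total) ** 2 for f in freqs)


# Hypothetical haplotype frequencies around the selected site after a sweep.
hard_sweep = [1.0]            # one new mutant background replaced everything
soft_sweep = [0.5, 0.3, 0.2]  # favored variant already sat on three backgrounds

print(haplotype_homozygosity(hard_sweep))
print(round(haplotype_homozygosity(soft_sweep), 2))
```

A completed hard sweep leaves a single haplotype, so homozygosity is total. When selection picks up a variant that already exists on several backgrounds, a good deal of the flanking variation survives.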

But first, what did they do? The flies used in this experiment derive from a 30 year old lineage, and they selected them for 600 generations in the case of the treatments which were being driven to new phenotype values. 600 generations for humans would be about 15,000 years assuming 25 years per generation. If a trait is heritable, and you select offspring deviated away from the mean, over time you will see a shift in the trait value. This is classic quantitative genetics, and that’s what they saw. They had five lineages which exhibited accelerated development (ACO), and five which were controls which exhibited the ancestral phenotypes (CO). “Eclosion” refers to the fly’s emergence from the pupa. The lineages which were subject to selection had very different life histories from the control groups. The cluster of traits here shouldn’t be too surprising, we know from other taxa that short-lived fast-developing species tend to be smaller and metabolically more under-the-gun than the inverse.
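The quantitative genetic logic here is usually summarized by the breeder's equation, R = h²S: the response to selection equals the heritability times the selection differential. Below is a minimal sketch with made-up numbers. The starting development time, heritability and selection differential are all hypothetical, and holding them constant is only reasonable over a handful of generations, not the 600 of the experiment.

```python
def response_to_selection(h2, selection_differential):
    """Breeder's equation: R = h^2 * S."""
    return h2 * selection_differential


def mean_after(generations, start_mean, h2, selection_differential):
    """Trait mean after repeated selection with constant h^2 and S."""
    mean = start_mean
    for _ in range(generations):
        mean += response_to_selection(h2, selection_differential)
    return mean


# Hypothetical values: development time in hours, heritability 0.25,
# and selected parents developing 4 hours faster than the population mean.
print(response_to_selection(0.25, -4.0))
print(mean_after(10, 240.0, 0.25, -4.0))
```

In this toy setup each generation shifts the mean by one hour, so ten generations shave ten hours off development time.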

But the real interesting aspects of this study are not the phenotypes. Who hasn’t seen weird things among the Drosophila? That’s one of the reasons they were chosen as model organisms in the first place! Rather, they explored the patterns of genomic variation within and across the lineages, and integrated the results into a broader theoretical framework of how evolutionary processes occur, and their implications for the genome-wide structure one should see. Below I’ve stitched together figure 2 & 3, which illustrate particular patterns of genomic variation.


The left figure shows differences in allele frequencies between the ACO and CO pooled lineages. The spikes indicate large differences, with the dotted line representing the threshold where there’s a 0.1% random chance of such a between-population frequency difference. The vertical axis is log-scaled. The grey line at the bottom indicates the differences between one particular ACO lineage and the pooled ACO sample. In the right panel you see heterozygosities, with blue denoting the CO lineages, and red the selected ACO lineages which have shortened life histories. The grey again is a particular ACO lineage. Each vertical panel corresponds to a chromosomal arm of the Drosophila melanogaster genome.

First, note the widespread distribution of allele frequency differences between ACO and CO. Additionally, there’s little difference between the specific ACO lineage and the pooled sample. Despite their independent histories they seem to exhibit the same allelic configuration. Second, note that the heterozygosities in the ACO pooled sample are lower than in the CO ancestral phenotype lineages. Why? Remember that selective sweeps should expunge genomic variation. But the sweeps do not seem to have gone to fixation, otherwise we’d see many more inverted peaks converging to heterozygosity of ~0, as the selected variant replaces all others in the population.
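Why fixation should show up as heterozygosity near zero follows from the standard expectation for a site with two alleles under random mating, H = 2p(1 − p). The sketch below assumes that simple model; the frequencies are illustrative values, not numbers read off the figure.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity at a biallelic site under random mating."""
    return 2 * p * (1 - p)


# Favored allele at intermediate frequency, partway through a sweep, and fixed.
for p in (0.5, 0.8, 0.95, 1.0):
    print(p, round(expected_heterozygosity(p), 3))
```

Heterozygosity only collapses once the favored allele gets very close to 100%. A sweep stalled at 80% still leaves a site looking moderately variable, which fits reduced but non-zero heterozygosity in the selected lines.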

What’s going on in the regions which exhibit differences between the controls and selected lineages? They looked at the ~650 non-synonymous SNPs on ~500 genes which were most differentiated between ACO and CO (L10FET score > 4) and found the following categories of genes enriched: imaginal disc development, smoothened signalling pathway, larval development, wing disc development, larval development (sensu Amphibia), metamorphosis, organ morphogenesis, imaginal disc morphogenesis, organ development and regionalization. Life history is complex. Combine the wide class of genes with the dispersed genomic impact of selection as evident in figures 2 and 3, and you get a good sense of the sort of consequences on the substrate level which quantitative genetic evolutionary dynamics have. Also of interest, they found that the X chromosome seemed enriched for signatures of selection and evolution. Why? They note that this chromosome would be more subject to selection for recessive or partially recessive expressing SNPs.
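For readers wondering what a differentiation score of this kind looks like in practice, here is a small sketch. It assumes that L10FET stands for the negative log10 of a Fisher's exact test p-value computed on allele counts in the two pools, which is a reading of the label rather than something spelled out in this post. The counts are invented, and the test is written out with the standard library so the block stays self-contained.

```python
import math


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum every table at least as unlikely as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(total, 1.0)


def l10fet(a, b, c, d):
    """Assumed meaning of the score: -log10 of the Fisher p-value."""
    return -math.log10(fisher_exact_two_sided(a, b, c, d))


# Hypothetical allele counts at one SNP: (allele A, allele B) in each pool.
aco = (45, 5)
co = (15, 35)
print(l10fet(*aco, *co) > 4)
```

A SNP at 90% in one pool and 30% in the other, with 50 chromosomes sampled from each, clears a threshold of 4 easily under this reading of the score.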

Clearly this study did not find the clean hard sweeps which theory may have predicted. Rather, the researchers found a lot of partially completed sweeps distributed all across the genome. Sound familiar? Before we move on to broader considerations, here are their explanations:

- The sweeps are hard, but haven’t reached fixation. So the selection coefficients have to be rather small for them to still be in transient

- Selection is operating on “standing variation.” That is, the genetic variation extant naturally within a given population, and which may be operated upon by natural selection to change the population trait value mean through classical breeding techniques

- And finally, selection coefficients (the greater fitness of positively selected variants against the population mean) may not be static parameters, but change over time as a function of allele frequency. This shouldn’t be that surprising. Frequency dependence and epistasis can impact on linear assumptions within a statistical genetic model. The authors refer to deleterious alleles or antagonistic pleiotropy as possible genetic level forces which also prevent fixation
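The first explanation lends itself to a rough check: how weak would selection need to be for a hard sweep to remain unfinished after 600 generations? The sketch below uses a deterministic haploid-style model in which the odds of the favored allele grow by a factor of (1 + s) each generation. Drift is ignored, and the population size is a hypothetical value, since the post does not give the size of the fly populations.

```python
import math


def generations_to_reach(p_start, p_end, s):
    """Generations for a favored allele to go from p_start to p_end.

    Deterministic haploid-style model: each generation the odds p/(1-p)
    are multiplied by (1 + s). Drift is ignored.
    """
    odds_start = p_start / (1 - p_start)
    odds_end = p_end / (1 - p_end)
    return math.log(odds_end / odds_start) / math.log(1 + s)


n = 1_000                # hypothetical number of diploid individuals
p_new = 1 / (2 * n)      # a single new mutant copy
fast = generations_to_reach(p_new, 0.99, 0.10)  # 10% advantage
slow = generations_to_reach(p_new, 0.99, 0.01)  # 1% advantage
print(round(fast), round(slow))
```

With these assumed values the strongly favored variant gets to 99% well inside 600 generations, while the weakly favored one needs roughly twice the length of the experiment. So the first explanation does require selection coefficients on the order of a percent or two at most.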

I personally lean against the first option, because it seems like we see a similar pattern in human evolutionary genomics, lots of partial sweeps and incomplete fixation. How much time does a brother need? In the long run we’re dead, and heat death swallows the universe. In the short run evolutionary pressures are always shifting. Fix now, or forget it say I! The wide distribution of allelic differences as well as moderate heterozygosities seems to be an indication that a quantitative trait, life history, is being modified through mass action on genetic variation. Interestingly, there’s also the parallel to humans insofar as the X chromosome seems to have more signatures of selection and variation in this evolutionary experiment. Next question: who’s working on experimental evolution of 600 generations in mice?

Citation: Burke, Molly K., Dunham, Joseph P., Shahrestani, Parvin, Thornton, Kevin R., Rose, Michael R., & Long, Anthony D. (2010). Genome-wide analysis of a long-term evolution experiment with Drosophila Nature : 10.1038/nature09352

Image Credit: Karl Magnacca

September 8, 2010

Sexual selection: lowered expectations edition

Sexual selection is, for lack of a better term, a sexy concept. Charles Darwin elaborated on the specific phenomenon of sexual selection in The Descent of Man, and Selection in Relation to Sex. In The Third Chimpanzee Jared Diamond endorsed Darwin’s thesis that sexual selection could explain the origin of human races, as each isolated population extended their own particular aesthetic preferences. More recently the evolutionary psychologist Geoffrey Miller put forward an entertaining, if speculative, battery of arguments in The Mating Mind: How Sexual Choice Shaped the Evolution of Human Nature. It’s clearly the stuff of science that can sell.

Sexual selection itself comes in a variety of flavors. Perhaps the most counterintuitive one on first blush is the idea that many traits, such as antlers, are positively costly and exist only to signal robust health which can incur the cost without debility. The idea was outlined by Amotz Zahavi in The Handicap Principle in the 1970s. Initially dismissed by Richard Dawkins in the original edition of The Selfish Gene, Zahavi’s ideas have come into modest mainstream acceptance, and the second edition of Dawkins’ seminal work reflects a revised appraisal. This is really a subset of a “good genes” model of sexual selection, whereby females select from a range of males which would exhibit variance in mutational load. A more capricious and erratic form of sexual selection is “runaway,” which like genetic drift needs no rhyme or reason. Rather, arbitrary initial preferences can become coupled with heritable preference in a positive feedback loop which drives the mean phenotypic value of a population off the previous median, until natural selection enforces a countervailing pressure once the trait starts to become excessively maladaptive (e.g., imagine selection for longer and longer tail feathers until the ability of a bird to fly is inhibited).

But notwithstanding the inevitable press which the theory gets, and its centrality to several popular science books, the main action in the area of sexual selection is in the academic literature (contrast this with the aquatic ape hypothesis). Many of the verbal outlines of sexual selection are highly stylized, as economists might say. We are treated to images of stags with massive antlers facing off, elephant seals strutting their stuff, and beautifully plumaged birds gathering for a lek. Set next to this is a body of mathematically oriented models, short on color, long on Greek symbols. But these formal models are valuable. Obviously there is a wide range of variation across species in terms of how sexual selection plays out (if it does so at all within a given species, sexual or asexual). The sexual dimorphism of elephant seals is not the norm against which all species are judged. To explore the variables which produce this pattern of difference one must analyze them in an algebraic fashion, where each can be manipulated in isolation so as to properly characterize its impact. So with that, a paper from The American Naturalist which purports to show how assortative mating could emerge in a sexual selective framework, Make love not war: when should less competitive males choose low-quality but defendable females?:

Male choosiness for mates is an underexplored mechanism of sexual selection. A few theoretical studies suggest that males may exhibit—but only under rare circumstances—a reversed male mate choice (RMMC; i.e., highly competitive males focus on the most fecund females, while the low‐quality males exclusively pair with less fecund mates to avoid being outcompeted by stronger rivals). Here we propose a new model to explore RMMC by relaxing some of the restrictive assumptions of the previous models and by considering an extended range of factors known to alter the strength of sexual selection (males’ investment in reproduction, difference of quality between females, operational sex ratio). Unexpectedly, we found that males exhibited a reversed mate choice under a wide range of circumstances. RMMC mostly occurs when the female encounter rate is high and males devote much of their time to breeding. This condition‐dependent strategy occurs even if there is no risk of injury during the male‐male contest or when the difference in quality between females is small. RMMC should thus be a widespread yet underestimated component of sexual selection and should largely contribute to the assortative pairing patterns observed in numerous taxa.

The title is accessible and charming, but the paper is dense with mathematical formulae and computational esoterica. It screams “trust me with my parameters!” But reality is a complex and manifold thing, and it may be that to model it one must go beyond elegant simplicity. As noted in the above abstract sexual selection models are often spare. That’s the beauty of a model, you remove all you can from the reconstruction of reality until you start losing the aspects of reality which you’re trying to understand and predict. I am not totally familiar with the sexual selection literature, so the first table is helpful insofar as it gives a sense of the scope of previous models which this paper is an extension of, and to some extent rejoinder to.


The main parameters to focus on in this study are the quality of the males and females, the competition between males, and the cost of mating. All the parameters checked off for the current study relate to these broad classes; density for example would increase competition, as would shifting the sex ratio. This being a model of the “mating game” rather than all the phenomena which might occur in the life history of individuals in a species, it is constrained in a somewhat peculiar manner. Males have a specific finite lifetime, and can enter into a serial set of relationships. These relationships are of finite length naturally, and a particular fraction of the lifetime of a given male, though that fraction may vary within the model. Additionally, males have to engage in “pre-copulatory guarding” before gaining a reproductive payoff. Basically, the male can not mate for a period of time after pairing up with a female. During this guarding period the male may have to fend off suitors, so there is a risk that the investment is all for naught. This is the dimension where the quality of both male and female come into play. For example, low quality males are not good defenders, and high quality females will attract a lot of attention. There are also factors such as predation risk while seeking a partner, which one must do if one loses one’s current partner to a superior male, or if one is initially unpaired and is deciding whether to reject or accept the offers of pairing up with a female.

Frankly, the model outlined in the paper is convoluted, and it probably says something that they have to nest a lot of the details into the supplements. Table 2 has all the parameters of interest.


As you can see some of the parameters have a few discrete values. Some of these are obviously continuous variables in reality, but for the purposes of modeling you have to simplify, especially if you’re going to do something computationally intensive. They ran the “game” of interactions over several different variations of the parameters, and noted how males varied in their evolutionarily stable strategy. Below are three figures which illustrate the response topographies of males of high and low quality to females of high and low quality, with number of interactions on the y-axis (the axis projecting “away” from your viewpoint perspective), and “rejection index” on the z-axis (vertical). High quality males are in the top panels, low quality males in the bottom panels, high quality females in the left panels, and finally, low quality females in the right panels. Each figure has a different parameter varied on the x-axis, as per the labels.

The rejection index is such that below 0 denotes acceptance and above rejection. In the first figure the variable is the time invested in each reproductive event, ranging from 1% to 50% of the male’s lifetime. In this situation high quality males accept high quality females, and reject low quality females, invariably. But low quality males are more accepting of low quality females as the time invested increases, and tend to reject high quality females. Why? High quality females would likely attract attention from high quality males, against whom the low quality males could not compete successfully. In the mating game pairing up with a high quality female would be a low payoff action, as the probability of keeping such a female and reproducing is low. The logic is inverted for low quality females, who would attract less attention from other males. Granted, these females are less fecund, but low fecundity is better than no fecundity from the perspective of the low quality male.
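The logic can be put as back-of-the-envelope arithmetic. The sketch below is my own simplification, not the paper's formulation: it treats the payoff of guarding as the female's fecundity multiplied by the probability of keeping her through the guarding period, and every number in it is made up for illustration.

```python
def expected_payoff(fecundity, p_keep):
    """Expected reproductive payoff of guarding a female: her fecundity
    times the chance the male still holds her when guarding ends."""
    return fecundity * p_keep


# Hypothetical numbers for a LOW quality male (not taken from the paper):
# a high quality female is very fecund but he will probably lose her,
# a low quality female is less fecund but he will probably keep her.
with_high_quality_female = expected_payoff(fecundity=1.0, p_keep=0.1)
with_low_quality_female = expected_payoff(fecundity=0.4, p_keep=0.8)

better_choice = (
    "low quality female"
    if with_low_quality_female > with_high_quality_female
    else "high quality female"
)
print(better_choice)
```

With these invented values the low quality female is the better bet for the low quality male, which is the “low fecundity is better than no fecundity” point in numbers. Shift the assumed probabilities enough and the preference flips, which is what the parameter sweeps in the figures are exploring.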

The second figure varies the fecundity ratio between the high and low quality females, from 5% to 100%. In the latter case there’s no difference in fecundity between the two classes, and that explains panel B, where the high quality males drop sharply into acceptance territory for low quality females as the x-axis verges to 100%. For low quality males the picture is different, as they begin to reject much more quickly once the ratio difference starts to converge. Observe however the effect of the y-axis, the number of female interactions assuming one is not guarding a mate. As the number of these interactions increases the rejection threshold keeps dropping, as low quality males become less and less inclined to guard high quality females. This has to be because the greater the number of interactions which freelance males have, presumably the greater the number of competitive interactions whereby these males may “steal” a female from a male who is guarding one.

Finally, the last set of figures focuses on “operational sex ratio,” OSR. The OSR ranges from 0.2, female-biased, to 2.4, male-biased. When there is a deficit of females high quality males will begin to accept pairings with low quality females, as is clear in panel B of the third figure. This makes rational sense in an environment of “scarcity.” The behavior of low quality males is more peculiar. In a situation of extreme female surplus their behavior converges upon that of high quality males: they reject low quality females, and accept high quality ones. As the sex ratio verges toward 1 the low quality males begin to reject high quality females and accept low quality ones. It seems that balanced mating ratios result in optimal trait matching, at least in terms of genetic quality, in the context of male competition for females (i.e., low quality males may prefer high quality females, but that is not an optimal decision because the likelihood of a payoff is low). But as the sex ratio verges toward a male surplus there are no good options for low quality males; the high quality females will reject them, because there are high quality males galore for them to select from, and the low quality females are now acceptable to high quality males, who will win them in the competition with low quality males.

Much of this is common sense. The mapping between formal quantitative model and verbal description is rather good. We know intuitively that in a context of male surplus it is the low quality males who will be shafted, and that low quality females will become valuable. You can offer up anecdote from engineering universities, or the army, or cite historical examples such as frontier societies with male-biased sex ratios. In modern day Punjab men import wives from poorer regions of eastern South Asia because of a sex-ratio imbalance. But here is where numbers are of the essence, as quantitative models show you how shifting the variates shifts the response. There has been some concern in relation to “bare branches”, men who can not marry in Asia, and its possible impact on societal stability. But one must keep in mind the exact proportion of bare branches within a society when predicting instability due to manic competition for women. Formal models can give us a better guide as to thresholds which should concern us.

Ultimately papers like this need to be validated by experiment and observation. But they’re useful toolkits, sharpeners of thought and conceptualization. It’s hard to test, verify, and refute, if you don’t pose the question and make a prediction in a clear and distinct manner.

Citation: Venner S, Bernstein C, Dray S, & Bel-Venner MC (2010). Make love not war: when should less competitive males choose low-quality but defendable females? The American Naturalist, 175 (6), 650-61 PMID: 20415532

Image Credit: BS Thurner Hof, Kristin Dos Santos
