March 6, 2020

A North Indian in Uzbekistan at 1550 B.C.

I was rereading the supplements for Narasimhan et al. to work out the best model for calculating “steppe” proportions in Iranians (someone asked me to do this). In the process, I noticed this passage again:

Third, we find that one of the outliers, Bustan_BA_o2, is consistent with being admixed between an individual related to people on the Indus Periphery Cline and Middle to Late Bronze Age Steppe pastoralists, a type of admixture event we also observe in the Late Bronze-Iron Age Swat Valley that we will examine later, suggesting that the admixture events that led to the formation of the SPGT in Pakistan also occurred between outlier individuals at the BMAC and Steppe pastoralists who arrived at the end of the 2nd millennium.

Here is some detail on the site of the sample: UZ-BST-015, Site 4, Grave 4, 57-27 (I11520): Date of 1613-1509 calBCE (3280±20 BP, PSUAMS-4605). The earliest date possible on the Swat samples is 1200 BC (though 1100 BC is more likely). That means that this outlier individual is the earliest example of the genetic mix that would come to characterize much of northern India: a mix of steppe, Iranian-farmer-related, and Ancient Ancestral South Indian (AASI) ancestry.

The text of the supplements seems to imply that this individual is sui generis, a mix of Indus Periphery and steppe, which prefigures what was to come later in South Asia. But I will offer another hypothesis: this individual is a migrant, or the child of migrants, from the earliest phase of the ethnogenesis of the Indo-Aryan matrix of Northwest India.

March 5, 2020

The Facts About Elizabeth Warren’s DNA test

With Warren dropping out of the race for the Democratic nomination, a lot of people on podcasts I listen to are making fun of her DNA test. Unfortunately, there are some falsehoods being promoted. It’s kind of scary for me because this is a field I know well, and it’s disturbing to watch falsehoods becoming accepted truths because people repeat them over and over again.

– First, the DNA test was not done through 23andMe, etc., or any standard commercial service. Rather, it was done by the Bustamante group at Stanford. This group has a lot of experience with the genetics of indigenous peoples of the Americas, so that is presumably why they were approached.

– Second, Warren surely has more than the expected amount of ancestry derived from people who were resident in the New World prior to 1492. The Bustamante group used relatively stringent criteria that are not comparable in an apples-to-apples manner with the inferences of 23andMe.

I am not here addressing the issue of whether she is or isn’t a Cherokee, or descended from Cherokees. The tests can’t answer those questions for both scientific and socio-political reasons. I’m also not addressing whether she used her identification with that tribe in furthering her career.

My only point in putting this post up is that it gets really disturbing to see “pundits” repeating, without any malice, “facts” you know are totally wrong, because the information ecosystem is such that false facts rapidly transmit into conventional wisdom.

Here is an old post, Elizabeth Warren Carries Native American DNA – She’s Running!.

Note: I have a piece about personal genomics that should be in the print edition of National Review in early April.

Dime-Store Genomics 

Genetic testing will soon be cheap, routine, and ubiquitous.

February 15, 2020

Why physical appearance is an imperfect individual proxy for ancestry

Kalash children

Pictured above are some Kalash children. You notice in the foreground and center a child who could easily pass as European and draw no notice on the streets of Gdansk, Poland. But look at the child right behind her, I would guess she’d draw no notice on the streets of New Delhi!

Though the Kalash are noted for their fair features, most of them look more West Asian than anything else, and from what I can tell as many have a “northwest Indian” phenotype as a “European” one. Genetically we know that they are good proxies for “Ancestral North Indians” (ANI). About 30% of their ancestry can be modeled as derived from steppe peoples such as the Sintashta: the Indo-Aryans. The other 70% or so of their ancestry is similar to that of the Indus Valley Civilization (IVC) people, which itself can be decomposed as mostly ancient Southwest Eurasian-adjacent (i.e., derived after the Last Glacial Maximum from the ancestors of Zagros farmers) and a minority of ancestry that is more like that of Andaman Islanders and pre-Neolithic Southeast Asians (“Ancient Ancestral South Indians,” or AASI).

Another thing to note about the Kalash is that they are genetically very homogeneous. This is due to the fact that they live in an isolated region, and their non-Muslim religion means that they have not intermarried with nearby Muslim people. What does this imply? It means that the Indian-looking girl is exactly the same ancestrally as the European-looking girl. Both have the same proportion of AASI and Indo-Aryan ancestry. That being said, the Indian-looking girl exhibits features more like those of the AASI than the European-looking girl does. Why?

The simple reason is that the genes which vary and encode salient physical features are a much smaller subset than the total genome. Therefore, they are subject to much higher variance from individual to individual (lower N in the denominator).
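To put rough numbers on the “lower N in the denominator” point, here is a toy sketch. It assumes each locus is an independent coin flip that comes up “steppe-like” with probability equal to the true ancestry fraction, and that every locus counts equally. Real genomes are not like this (loci are linked, and effects on appearance are unequal), so treat it as an illustration of the scaling only. The function name is made up for this example, and the 0.30 is just the Kalash steppe fraction mentioned above.

```python
import math


def ancestry_proxy_sd(p: float, n_loci: int) -> float:
    """Standard deviation of the average of n_loci independent 0/1 draws,
    each equal to 1 with probability p (a toy model, not real genetics)."""
    return math.sqrt(p * (1.0 - p) / n_loci)


p = 0.30  # illustrative true ancestry fraction

# A handful of appearance-related loci versus an array-sized marker set.
for n_loci in (10, 100_000):
    sd = ancestry_proxy_sd(p, n_loci)
    print(f"{n_loci:>7} loci: person-to-person spread (SD) of about {sd:.4f}")
```

Under these assumptions the spread shrinks with the square root of the number of loci, so ten loci give about a hundred times more individual-to-individual scatter than 100,000 loci do. Two siblings with identical ancestry can land far apart on the ten-locus measure while being indistinguishable on the genome-wide one.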

Here’s a concrete example. Compare eye color to total ancestry. Modern SNP-array ancestry inference relies on 100,000 to 1 million genomic positions. It is pretty good as a proxy for the 10 to 100 million SNPs out of your 3 billion base pairs that define your variable ancestry. For eye color, there are a few dozen genes at most, and more honestly a handful that really impact variation. For Europeans, 75% of the variation of blue vs. non-blue eye color is due to variation around one genetic region, the HERC2-OCA2 locus. This means that just because someone has blue eyes, one can’t be sure that they have much European ancestry at all!

In the 1000 Genomes South Asian populations the SNPs for “blue eyes” are at 2 to 10% frequency, depending on the population. Since the expression is recessive (you need both copies of the “blue eye” variant), assuming just this SNP you’d expect roughly 0.04% to 1% manifestation of the characteristic in Indian-origin populations. The people with blue eyes have no more or less European ancestry than anyone else in their family.
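The arithmetic behind that range is simply the allele frequency squared. The sketch below assumes a single fully recessive variant and random mating (Hardy-Weinberg proportions), which is a simplification. The function name is hypothetical, and the two frequencies are the endpoints quoted above.

```python
def recessive_trait_frequency(allele_freq: float) -> float:
    """Expected share of people carrying two copies of a recessive variant,
    assuming random mating (Hardy-Weinberg proportions)."""
    return allele_freq ** 2


# Endpoints of the "blue eye" variant frequency range quoted above.
for q in (0.02, 0.10):
    share = recessive_trait_frequency(q)
    print(f"allele frequency {q:.0%}: expected trait frequency {share:.2%}")
```

At 2% allele frequency that works out to about 0.04% of people, and at 10% to about 1%.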

Where does this leave us? You should understand from this that within a given family or ethnic group there is going to be a range of appearances, and a range is normal within many groups without exotic ancestry. Most Bengalis have 5-20% East Asian ancestry (closer to 5 in West Bengal, closer to 20 in Comilla and Chittagong). This means most of their ancestry is South Asian, and most Bengalis look just like other Indian-origin people. But a substantial minority look somewhat East Asian, to varying degrees. This is exactly what you expect when you have a minority quantum of ancestry.

Finally, many of the commenters here made a lot of assumptions about vloggers talking about their ancestry and were quite rude. I wish you wouldn’t do that. As a matter of fact, many of the inferences may actually be correct, but you don’t know for sure, and you don’t know the whole story. I’m pretty liberal on the comments of this weblog, but if you exhibit a serial pattern of rudeness I’m going to start randomly deleting your comments (if you complain about this I will immediately ban your IP).

February 14, 2020

Most Bangladeshis are 10% to 20% East Asian

I wish consumer genetic tests did a better job of communicating the madness to the methods. The vlogger above is a bit confused because one of her grandmothers looks rather East Asian, but her DNA results clearly indicate her Bengali ancestry. What the Ancestry DNA test does not make clear is that Bengali ancestry includes within it 10-20% East Asian ancestry.

