In the comments below I noted that the Parsi people of India, who reputedly arrived in India ~1000 years ago from Iran, are about 25 percent South Asian. By this I mean that their ancestry is about 75 percent Iranian (presumably Persian), with 25 percent admixture from the South Asian populations amongst whom they lived. But my feeling about this was vague, so I decided to check the scientific literature. Unfortunately there hasn’t been a lot of work done in this area with cutting-edge genomics. But a cursory examination shows that there’s been substantial gene flow from Indian women into the Parsi population, visible in the mtDNA. In the figure to the right you see that “PA”, the Parsis, have a lot of “South Asian” mtDNA lineages compared to the Iranian groups. These mostly consist of South Asian branches of haplogroup M, and they jump out at you immediately when you look at the haplotypes the Parsis carry. I found less on the Y chromosomes, which are less informative for differentiating South Asians from Iranians in any case (the mtDNA difference between these two regions is much greater), but what I did find is that Parsis can be modeled as 100% Iranian on their paternal lineages. This is probably an exaggeration, but as a stylized fact I think it gets to the heart of the matter.
But what would really be useful are autosomal results. Those were hard to find. Noah Rosenberg’s 2006 paper on Indian genetic differentiation using microsatellites did have a Parsi sample. If you look at the results, the Parsis do seem South Asian, roughly equivalent to Pathans, an Iranian-speaking group in Pakistan which has strong South Asian affinities. But the sample set does not include any Iranian groups from Iran proper, only Middle Eastern groups from the Arab world or the Caucasus. Without such a reference population it is hard to gauge how closely the Parsis are related to Iranians.
There was one last hope. Harappa DNA has been collecting results for many years now, and I was hoping that there was a Parsi in the sample. There was, just one. I took the Parsi and compared this individual to various Iranian and a few select Indian groups. Here are the admixture results (edited to show only the relevant ancestral clusters):
| Kurd Zaza Turkey | 2 | 23 | 43 | 6 | 6 | 13 |
| Kurd Kurmanji Iraq | 4 | 21 | 41 | 4 | 7 | 15 |
| Kurd from Turkey | 4 | 24 | 41 | 4 | 8 | 12 |
| Kurd Yezidi Iraq | 4 | 26 | 39 | 4 | 7 | 13 |
| Kurd Kurmanji Iraq | 5 | 24 | 39 | 4 | 8 | 13 |
| Gujarati Patel Muslim | 34 | 32 | 13 | 3 | 3 | 6 |
| Gujarati Sunni Vohra Surti | 35 | 34 | 13 | 5 | 2 | 4 |
| Gujarati Vaishnav Vania | 45 | 36 | 4 | 4 | 1 | 3 |
The key is to focus on the “South Indian” ancestry. Though this is found in some Iranian groups, it drops off very rapidly once you move past groups like the Pathans. The Parsi individual has a South Indian ancestral component of 16 percent. Looking at the Iranian individuals, you would probably expect about 5 percent from that population. The question is: what is the Indian source population? There’s a lot of variation among the candidates. But if you take 50 percent South Indian for the South Asian source population, and weight it by the 25 percent South Asian and 75 percent Iranian ancestry proposed above, then you get:
(50 percent)*(0.25) + (5 percent)*(0.75) = 16.25%
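The mixing arithmetic above can also be run in reverse: rather than plugging in 25 percent and checking that it gives roughly 16, you can solve for the South Asian fraction directly. What follows is only a minimal sketch of a two-source linear mixture, not a substitute for a proper admixture analysis. The function names are hypothetical, the 16, 5 and 50 percent figures come from the text, and the alternative source values in the loop (35 to 45 percent) are purely illustrative assumptions to show how sensitive the estimate is to the choice of Indian source population.

```python
def expected_component(fraction_a, source_a, source_b):
    """Component level expected in a population that draws `fraction_a`
    of its ancestry from source A and the rest from source B."""
    return fraction_a * source_a + (1.0 - fraction_a) * source_b


def admixture_fraction(observed, source_a, source_b):
    """Solve observed = f * source_a + (1 - f) * source_b for f."""
    if source_a == source_b:
        raise ValueError("the two sources must differ in the marker component")
    return (observed - source_b) / (source_a - source_b)


# Figures taken from the text (percent "South Indian" component).
parsi_observed = 16.0   # the single Parsi individual
iranian_source = 5.0    # rough expectation for an Iranian source
indian_source = 50.0    # assumed value for the South Asian source

# Forward check: 25 percent South Asian, 75 percent Iranian.
forward = expected_component(0.25, indian_source, iranian_source)
print(forward)  # 16.25

# Reverse: the South Asian fraction implied by the observed 16 percent.
implied = admixture_fraction(parsi_observed, indian_source, iranian_source)
print(round(implied, 3))  # 0.244

# Sensitivity to the assumed Indian source (illustrative values only).
for assumed_source in (35.0, 40.0, 45.0, 50.0):
    f = admixture_fraction(parsi_observed, assumed_source, iranian_source)
    print(f"source at {assumed_source:.0f} percent -> South Asian fraction {f:.2f}")
```

With a 50 percent source the implied fraction comes out just under a quarter. A source population with less of the South Indian component pushes the estimate higher, which is why the choice of source matters.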
So, at least going by this one individual, something like ~25 percent is probably correct for how much “native” South Asian ancestry the Parsis have picked up. Since they are genetically quite homogeneous at this point, an N = 1 might be sufficient to reach a conclusion. I’d be curious if anyone finds anything different.