Razib Khan One-stop-shopping for all of my content

September 12, 2018

The genetics of Afrikaners (again)

Filed under: Afrikaner genetics,Historical Population Genetics — Razib Khan @ 10:18 pm


I get asked about the genetics of Afrikaners because I’ve written about and analyzed the issue before. The main outlines seem to be established, but I thought I would revisit it. The main reason is that we now have ancient South African DNA, and I’ve been adding it to my personal analyses for a while. It seemed worthwhile to reanalyze the South African samples I do have with some of these ancient samples added in.

The plot at the top shows the core populations I started with. I did some outlier pruning, and I only kept the South African samples that were overwhelmingly white. I picked Malays and a South Indian population because of the Cape Coloureds, a mixed-race Afrikaans-speaking group with Asian ancestry that can be attributed to both South and Southeast Asian populations (the Dutch imported many slaves from India and had outposts in Java). I also used Bantu samples from South Africa and Kenya, as well as a Nigerian population. Finally, I had some Hadza as a hunter-gatherer population distinct from the San Bushmen. For Europeans, I used white Dutch.

The final marker density was 200,000 SNPs, so not too bad.

As you can see if you click on the image, all of the South African whites were shifted away from the Dutch. There were two outlier individuals, one closer to the Dutch cluster and one further away. All the other individuals form a neat cluster. None of these individuals were close relatives.
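For readers who want to see what a PCA on genotype data involves, here is a minimal sketch. It is not the pipeline used for the plot in this post; the function name `genotype_pca` and the input layout (individuals in rows, SNPs in columns, genotypes coded 0/1/2) are assumptions made for illustration.

```python
import numpy as np

def genotype_pca(genotypes, n_components=2):
    """Project individuals onto the top principal components.

    genotypes: (individuals x SNPs) array of 0/1/2 allele counts
    (a hypothetical input layout, not the format of any specific tool).
    """
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0                # per-SNP allele frequency
    keep = (p > 0) & (p < 1)                # drop monomorphic SNPs
    g, p = g[:, keep], p[keep]
    # Center each SNP and scale by its expected binomial standard deviation
    z = (g - 2 * p) / np.sqrt(2 * p * (1 - p))
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_components] * s[:n_components]
```

Each row of the result gives one individual's coordinates, so individuals from a differentiated population land in their own cluster along the leading component.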


I ran Treemix on the data with multiple migrations until the migrations stopped making sense to me. The African populations exhibit migration flows to each other. Much of it is entirely comprehensible. The Esan receive no migration, highlighting that this population did not receive gene flow from any groups in these data. The Kenya Bantus receive gene flow from the direction of Eurasians. This is almost certainly Nilotic-mediated. The gene flow they receive from the base of the ancient San is more enigmatic, but probably reflects uptake of local ancestry as the Bantus expanded. The southern Bantus receive gene flow from modern San.

The South African whites receive gene flow from a position on the graph between the modern San and other non-San African groups.


Next, I ran Admixture in the unsupervised mode with K = 6. The two populations that are mostly light blue are the South African whites and the Dutch, from top to bottom. You can see, though, that the South African whites clearly have other ancestral components. Most of these individuals have the components modal in the San, Esan Nigerians, Indians, and Malays. The two outlier individuals are also clear. The individual who is very close to the Dutch in the PCA, but shifted toward the Asians, does not have any African admixture. The individual shifted more toward the non-Europeans in the PCA also has larger fractions of the components modal in non-European populations.

Next, I decided to confirm things by running a three-population test. If you read this blog you’ve seen this before. Basically, this measures shared ancestry by looking at deviations from a particular phylogenetic model: (test population; (pop1, pop2)). A negative f3 statistic indicates that the test population is a mix of groups related to pop1 and pop2, and I focused on z-scores greater than two in magnitude.

Here they are:



No surprises so far. One thing that did surprise me, though, was the extent of the admixture even after PCA outlier removal. So I took the output you saw above and removed individuals that were very mixed, except in the case of the white South Africans. Then I ran Admixture in supervised mode, where the “pure” populations were fixed as references (I merged the modern San without much admixture with the ancient San). You can see the results below:


Re-running the three population test with these “pure” populations I only got significant results for the below cases:


No big surprise.
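For anyone curious about the mechanics of the three-population test, here is a minimal sketch of the statistic, assuming you already have per-SNP allele frequencies for the test population and the two references. It uses the naive estimator, the average of (c - a)(c - b), with a delete-one-block jackknife for the z-score. It omits the finite-sample bias correction that production tools apply, and the function name `f3_statistic` and the `n_blocks` parameter are made up for illustration.

```python
import numpy as np

def f3_statistic(c, a, b, n_blocks=20):
    """Return (f3, z) for f3(C; A, B) from per-SNP allele frequencies.

    c, a, b: equal-length sequences of allele frequencies for the test
    population C and the reference populations A and B.
    Naive estimator: no correction for sample size or heterozygosity.
    """
    c, a, b = (np.asarray(x, dtype=float) for x in (c, a, b))
    per_snp = (c - a) * (c - b)
    f3 = per_snp.mean()

    # Delete-one-block jackknife over contiguous blocks of SNPs
    blocks = np.array_split(per_snp, n_blocks)
    total, n = per_snp.sum(), per_snp.size
    loo = np.array([(total - blk.sum()) / (n - blk.size) for blk in blocks])
    se = np.sqrt((n_blocks - 1) / n_blocks * np.sum((loo - loo.mean()) ** 2))
    return f3, f3 / se
```

If C is an even mix of A and B, every per-SNP term equals -(a - b)²/4, so the statistic comes out negative. That is the signal of admixture the test looks for.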

The average European ancestry I got in my South African white samples, N = 12, is 93.5%. Treating the samples as a single composite individual, note that someone with one great-great-grandparent who was not European would be expected to have 6.25% non-European ancestry. That’s 4 generations back, so about 100 years. These individuals are presumably adults. Let’s say they are 25 years old. That goes back 125 years. In a single-person admixture scenario, it’s probably reasonable to suggest the admixture happened sometime in the mid to late 19th century.

This seems unlikely. The evenness of admixture and the balance between different groups indicate that it is older than that, and that these individuals are obtaining it from different lineages. Traditional genealogical estimates suggested a range of 5-7.5% non-European ancestry in Afrikaners, and one study of 185 individuals showed 18% non-European mtDNA.
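The back-of-the-envelope arithmetic for the single-ancestor scenario can be written out as a short sketch. The 25-year generation time and the current age of 25 are the round numbers assumed in the text, and the function names are made up for illustration.

```python
def expected_fraction(generations_back):
    """Expected share of the genome inherited from one ancestor
    who lived this many generations back (halves each generation)."""
    return 0.5 ** generations_back

def years_before_present(generations_back, generation_years=25, current_age=25):
    """Rough time depth of that ancestor, under assumed round numbers."""
    return generations_back * generation_years + current_age

print(expected_fraction(4))      # 0.0625, i.e. 6.25%
print(years_before_present(4))   # 125
```

The observed non-European share, 100% - 93.5% = 6.5%, is close to the 6.25% expected from a single non-European great-great-grandparent, which is why the composite-individual reading points to roughly 125 years before the samples were taken.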

I will probably do some ancestry deconvolution and see if I can get a figure for the time of admixture (though the fractions here are very small, as is the sample size of the admixed population). But the non-European ancestry of Afrikaners is uncannily similar to the non-European ancestry of the Cape Coloureds. That leads me to conclude that a fair number of mixed-race women married into the early European settler community. Those mixed-race women who married mixed-race men helped found the Cape Coloureds.
