June 18, 2017

The Finnic peoples emerged in Baltic after the Bronze Age

Filed under: Finland,History

A reader in the comments reminds me of a preprint relevant to the population structure of Baltic Europe which came out a few months ago, Extensive farming in Estonia started through a sex-biased migration from the Steppe:

…Here we present the analyses of low coverage whole genome sequence data from five hunter-gatherers and five farmers of Estonia dated to 4,500 to 6,300 years before present. We find evidence of significant differences between the two groups in the composition of autosomal as well as mtDNA, X and Y chromosome ancestries. We find that Estonian hunter-gatherers of Comb Ceramic Culture are closest to Eastern hunter-gatherers. The Estonian first farmers of Corded Ware Culture show high similarity in their autosomes with Steppe Belt Late Neolithic/Bronze Age individuals, Caucasus hunter-gatherers and Iranian farmers while their X chromosomes are most closely related with the European Early Farmers of Anatolian descent…

As you can see in the PCA plot above the Comb Ceramic Culture and the Corded Ware culture in Estonia are modeled well by the three ancestral populations hypothesis for Europe. The problem with this is that Finns and Russians with Finnic background do not fit with this model. There has been clear later gene flow.

From the text:

Interestingly, modern Estonians showed a bigger proportion of the blue component [associated with European hunter-gatherers] than CWC individuals. Comparing to CCC individuals, modern Estonians lack the red component [Eastern Siberian]. This, together with the absence of Y chromosome hg N in CCC and CWC, points to further influx and change of genetic material after the arrival of CWC.

The sample sizes are small. Additionally these are from Estonia, not Finland. But the Comb Ceramic Culture was widespread throughout the region.

Also, from a 2015 paper (supplements):

Among the northern Europeans, the Finnish (finni3) show evidence of an admixture event involving a minority source most similar to contemporary North Siberians (469CE (213BCE-1011CE)). Finns are thought to have originated from the northward migration, and subsequent contact, between Central Europeans and indigenous Scandinavian hunter-gatherers closely related to the Saami [S33]. The Saami are closely related to the individuals that make up the North Siberian world region, and whilst our confidence in this admixture date is low because of the small size of the cluster, the event we see is likely to represent this key period in Finnish history.

The “North Siberia” cluster consists of: Selkup, Chukchi, Dolgan, Ket, Koryak, Nganassan, Yakut and Yukagir. The estimated admixture date is very recent. I suspect too recent. But it gets us to the qualitative point that the Siberian admixture into Finns is probably not that old.

June 17, 2017

The origin of the Finnic peoples

Filed under: Finland,History


One of the very first things I wrote about in relation to historical population genetics was the origins of the Finnic peoples. The reasons are twofold:

– first, the Finns and Estonians speak languages that are rather peculiar in a Europe dominated by Indo-European tongues (I suspect one reason that Tolkien based Quenya, the high elvish language, on Finnish is that it is so otherworldly to the Germanic ear. The Sindarin language, which was the common tongue of elves in Middle Earth, was based on Welsh). Rather, the distribution of the Uralic languages extends to the east, as far as Siberia. Even the closest affinities to Finnish and Estonian extend eastward, as there are Karelians who live deep in northwest Russia.

– second, there are peculiarities in the genetics of the Finns, recognized as far back as the 20th century, that have always been notable.

Some of the distinctiveness of the Finns clearly has to do with the demographic isolation of the recent past, and the range expansion into the north and east. I will ignore this aspect of recent drift, and focus on their deep history and phylogenetic relationships.

New molecular genetic techniques in the 1980s and 1990s which enabled the genotyping of Y and mtDNA lineages immediately showed that the paternal heritage of the Finns is highly distinctive in comparison to their neighbors, and erstwhile hegemons, the Scandinavians. While Swedes tend to be haplogroup I (indigenous to Western Europe dating to the late Pleistocene) or one of the two R1 lineages (intrusive from the Eurasian steppe during the Bronze Age), Finns tend to be haplogroup N3, with a substantial minority of I. While 63 percent of Finns are N3, only 3 percent of Swedes are. Given the migration of Finns to Sweden, as well as the prevalence of Saami all across Northern Sweden until the early modern period, Swedish N3 may be due to gene flow in the last thousand years. The two R1 lineages are ~10% of the Finnish paternal gene pool and are strongly skewed toward R1a, while the ~40% of Swedish R1 lineages are balanced.

In contrast, the mtDNA profiles of Finns are very similar to those of their neighbors. As in Sweden, the dominant haplogroup is a branch of H, with its reduced fraction accounted for by the fact that Finns have a higher percentage of U5, which has been associated with European hunter-gatherers. The various haplogroups (e.g., T) associated with Early European Farmers are at somewhat lower frequency in Finland than Sweden.

A simple explanation then presents itself to us: the Finns have been subjected to male mediated admixture into a “conventional” European substrate. But there has long been controversy as to whether the Finnish N3 haplogroup was indigenous to Europe, or its presence in Northeast Europe was due to migration. If it was indigenous then the admixture model does not make as much sense. But as with many things we’ve moved very far in comparison to where we were when I first began to look at this issue in 2002.

If you read Human Y Chromosome Haplogroup N: A Non-trivial Time-Resolved Phylogeography that Cuts across Language Families the likelihood that the Y chromosomal structure of Finland is old seems low. First, Finnish N3 lineages are very young and underwent rapid expansion beginning 4 to 6 thousand years ago (this is evident in their whole genome variation pattern). Second, the most diversity of N seems to be in Western Siberia. Third, N exists in higher frequencies in parts of Siberia than even in Finland. Fourth, the range of N pushes it all the way to the Pacific Ocean. It is not implausible that it expanded from one rim of Eurasia to the other, but the most likely scenario is that it came from somewhere in the middle.

Also, it is likely that there has been admixture into Finns from an East Eurasian population. To give an example, a derived SNP at EDAR is at very high frequency in Northeast Asians. The ancestral variant is dominant outside of East Asia and the New World. Among modern Europeans the derived variant of EDAR is not present in indigenous populations. A quick check in the 1000 Genomes data shows that it’s at ~6% in Finns (in contrast, the ancestral variant of SLC24A5 is present at frequencies of ~1; this could be random, but I suspect in situ selection….). You can see that the derived variant is absent in a rather large sampling of other Europeans.
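For readers who want to see what a frequency like that means in practice, here is a minimal sketch of how a derived allele frequency falls out of diploid genotype counts. The function name and the counts are my own illustrative assumptions; they are not real 1000 Genomes numbers, and this is not how the project's browser computes its tables.

```python
def derived_allele_frequency(n_hom_ancestral, n_het, n_hom_derived):
    """Frequency of the derived allele from diploid genotype counts."""
    n_individuals = n_hom_ancestral + n_het + n_hom_derived
    if n_individuals == 0:
        raise ValueError("no genotypes supplied")
    # Each individual carries two alleles; a heterozygote carries one
    # derived copy and a derived homozygote carries two.
    derived_copies = n_het + 2 * n_hom_derived
    return derived_copies / (2 * n_individuals)


# Hypothetical counts, for illustration only (NOT real data):
# 88 ancestral homozygotes, 11 heterozygotes, 1 derived homozygote.
freq = derived_allele_frequency(88, 11, 1)
print(f"derived allele frequency: {freq:.3f}")
```

The point of the toy numbers is that a single-digit percentage can be carried almost entirely by heterozygotes, so most carriers in such a population would have only one copy.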

Running ADMIXTURE unsupervised it’s immediately obvious that Finnic peoples have a minority component of East Eurasian admixture. This dark blue element is absent in most of the Swedes. Not surprisingly the Russians exhibit structure depending on where you sample. Some Russian populations were clearly Slavicized relatively recently, and exhibit a genetic profile rather like Finnic peoples (these northern Russian regions also have high frequencies of haplogroup N, which is much rarer in the south or among Ukrainians).

There’s a cline that runs east to west in relation to this component. The Finns’ neighbors immediately to the east, Karelians and Veps, have a higher fraction than the Finns proper. Additionally, some Finns in the data seem to lack it totally. One might speculate that these are people of Swedish origin who eventually assimilated to the Finnish identity. This is not impossible. In the 19th century Finnish nationalism was sparked in large part by middle class activists, many of whom were Swedish ethno-linguistically due to the connections between class and language at that time. But these individuals may be evidence of older structure in Finland. More on that later.

I also ran some Treemix on a subset of the data. You see there is gene flow coming into the Finns from a Siberian group. I used Nenets (a group of Samoyeds) and Yakut because the former have more linguistically in common with the Finns, while the latter are used by companies like 23andMe (Yakuts are the most northeasterly Turkic people). Strangely the Karelians and Veps get gene flow from Nenets, while the Finns get it from Yakuts (I pruned with PCA and ADMIXTURE to remove individuals with recent European ancestry).

But the model of a single pulse admixture is probably wrong anyhow. Rather, the spread of Finnic hunters and gatherers may have been gradual, and/or occurred in several pulses. On the fringe of Northern Eurasia local extinctions were probably common. The landscape of Northern Eurasia, from the Baltic to Siberia, may long have been rather dynamic, with interactions between Uralic, Indo-European and Altaic peoples.

At this point I am at a loss. The archaeology of Finland is not something I know well, and the academic literature is hard for me to track down. Some scholars believe that the Comb Ceramic Culture plays a major role in the ethnogenesis of the people we call Finns. During the Bronze Age the Corded Ware zone spread into southern Finland, bringing agriculture. The fusion between the Comb Ceramic and Corded Ware led up to the societies which are first mentioned by Classical authors.

Finland was always liminal to early agriculture, and the Corded Ware Indo-Europeans may eventually have given way to the forest Finns as the climate turned more difficult. The predominance of N3 haplogroups may be a function of the nature of patriarchal societies, where certain lineages maintain powerful long term advantages.

December 8, 2010

The men of the north: the Sami

Ole Magga, Norwegian politician

On this blog I regularly get questions about the Sami (Lapp*). That’s because I often talk about Finnish genetics, have readers such as Clark who are of part-Sami origin, and because the provenance and character of the Sami speak to broader questions about the emergence of the modern European gene pool. More precisely, questions about the Sami are relevant to the broader nature of the Finnic presence in Europe, and their relationship to other Baltic and northern populations. Are these people “indigenous” to Europe, or relative newcomers (prehistoric Magyars or Turks)? These questions are prompted by the peculiarity of their languages (as well as the physical appearance of some of the Sami). With Basque they are the only living non-Indo-European European languages whose origins are prehistoric (Magyar and Turkish were arrivals within the last 1,000 years).**

Because of affinities to other Uralic languages which are found in Central Siberia it has often been conjectured that the Finns, Sami, and Estonians are relative newcomers to Norden from that region. This has some equivocal support from Y chromosomal lineages. On the other hand, there are those who argue that the Finnic peoples were present in the north of Europe before the arrival of Indo-European speakers (often these are Finnish nationalists). This has some support from maternal lineages. Naturally, some have been tempted to synthesize these two genetic lines of evidence, and the linguistic affinities, to argue that Finns are a hybrid population of Asiatic men and Paleolithic European women! But we need to go further than uniparental markers, the direct male and female ancestral lines. We need to look across the broader swath of the genome. It just happens that a new paper was published in The European Journal of Human Genetics on autosomal Sami affinities to other populations, A genome-wide analysis of population structure in the Finnish Saami with implications for genetic association studies:

The understanding of patterns of genetic variation within and among human populations is a prerequisite for successful genetic association mapping studies of complex diseases and traits. Some populations are more favorable for association mapping studies than others. The Saami from northern Scandinavia and the Kola Peninsula represent a population isolate that, among European populations, has been less extensively sampled, despite some early interest for association mapping studies. In this paper, we report the results of a first genome-wide SNP-based study of genetic population structure in the Finnish Saami. Using data from the HapMap and the human genome diversity project (HGDP-CEPH) and recently developed statistical methods, we studied individual genetic ancestry. We quantified genetic differentiation between the Saami population and the HGDP-CEPH populations by calculating pair-wise FST statistics and by characterizing identity-by-state sharing for pair-wise population comparisons. This study affirms an east Asian contribution to the predominantly European-derived Saami gene pool. Using model-based individual ancestry analysis, the median estimated percentage of the genome with east Asian ancestry was 6% (first and third quartiles: 5 and 8%, respectively). We found that genetic similarity between population pairs roughly correlated with geographic distance. Among the European HGDP-CEPH populations, FST was smallest for the comparison with the Russians (FST=0.0098), and estimates for the other population comparisons ranged from 0.0129 to 0.0263. Our analysis also revealed fine-scale substructure within the Finnish Saami and warns against the confounding effects of both hidden population structure and undocumented relatedness in genetic association studies of isolated populations.

They had 352 Sami samples, and looked at ~38,000 SNPs. For the questions they’re focusing on, 38K SNPs seems fine. That’s enough to smoke out inter-population variation. In their paper they compared the Sami to the HGDP populations using standard techniques. Assuming 7 ancestral populations in the data set, this is what ADMIXTURE popped out:


There is a definite “eastern” affinity among the Sami. Interestingly, it is broken down into a major and minor component. The major one is what is found among the Han, while the minor one resembles Native Americans. The natural interpretation for this is that what one is seeing is the shadow of the circumpolar northern Eurasian populations which spanned eastern Europe to Siberia. In comparison with other European populations the Sami affinity with Russians is clear, though interestingly they lack the “blue” component which peaks in northwest South Asian populations, which the Russians have, and Sardinians and French Basque lack.

To the left you see a PCA which breaks out the top two components of genetic variation for the data set. The two axes seem to be roughly west-east, north-south. Whatever ancient affinities the Sami may have with Southern Europeans via mtDNA haplogroup U5, it is not evident in the total genome content. The position of the Sami between Russians and Orcadians (from north of Scotland) is probably attributable to the fact that the Sami share much genetically with other Scandinavians, who are closer to British populations than the Russians are.
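Since the abstract quoted above leans on pair-wise FST, here is a rough sketch of what such a number summarizes. I am assuming a simple Wright-style estimate computed from per-SNP allele frequencies in two equally weighted populations, pooled across SNPs; the paper may well have used a different estimator with sample-size corrections, and the function name is my own.

```python
def pairwise_fst(freqs_a, freqs_b):
    """Wright-style F_ST between two populations from per-SNP allele frequencies.

    Uses (H_T - H_S) / H_T with both populations weighted equally and
    heterozygosities summed across SNPs before taking the ratio.
    """
    if len(freqs_a) != len(freqs_b):
        raise ValueError("both populations need the same SNPs")
    total_ht = 0.0  # expected heterozygosity of the pooled population
    total_hs = 0.0  # mean expected heterozygosity within populations
    for p_a, p_b in zip(freqs_a, freqs_b):
        p_mean = (p_a + p_b) / 2
        total_ht += 2 * p_mean * (1 - p_mean)
        total_hs += (2 * p_a * (1 - p_a) + 2 * p_b * (1 - p_b)) / 2
    if total_ht == 0:
        return 0.0  # no variation at any SNP, so nothing to partition
    return (total_ht - total_hs) / total_ht


# Hypothetical frequencies at three SNPs, for illustration only:
pop_a = [0.10, 0.50, 0.80]
pop_b = [0.15, 0.45, 0.70]
print(f"F_ST = {pairwise_fst(pop_a, pop_b):.4f}")
```

Two populations with identical frequencies give 0 and a fixed difference gives 1, which is why values around 0.01, like the Sami-Russian comparison, indicate very modest differentiation.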

I’m not sure these analyses really shed any light on the questions I mentioned earlier. The authors themselves note that the “eastern” component of the ancestry in the Sami is probably very old, so they may be an ancient stabilized hybrid population, mostly indigenous with a non-trivial exogenous element. That does not tell us whether Finnic languages are indigenous to Europe, or whether they are indigenous to Central Siberia (indigenous here is in reference to the Indo-European languages). Additionally, there is the matter that for such fine-grained questions the HGDP sample is suboptimal as a set of reference populations. Dienekes Pontikos points this out:

It is unfortunate that they included Native American HGDP populations, but did not include the most relevant published data on Siberians that I first used to study population structure across north Eurasia here and here and here.

Hence, they discover a “Native American”-like component in Saami, which in all likelihood can be further resolved into Siberian-specific components utilizing the Rasmussen et al. dataset.

The “closest approximation” to the East Eurasian component in Saami in the HGDP panel are the Yakuts, but finer-scale analysis (see my previous posts) reveals that the Yakuts are made up almost entirely of an Altaic-specific component tying them to Turkic, Mongol, and Tungusic populations, while the eastern component in European Finns, Vologda Russians and Chuvashs has relationships with Central Siberians such as Kets, Selkups, and Nganasans, all of which are missing in this paper.

Below is a re-edited ADMIXTURE plot from Dienekes:


Note: There are many ways to spell Sami. They used two a’s, but I find that confusing, so I just used one in my text.

Citation: Maki-Torkko, Elina, Aikio, Pekka, Sorri, Martti, Huentelman, Matthew J, & Camp, Guy Van (2010). A genome-wide analysis of population structure in the Finnish Saami with implications for genetic association studies European Journal of Human Genetics : 10.1038/ejhg.2010.179

* Apparently “Lapp” is considered derogatory among Norwegians, though Finnish Sami refer to themselves as lappalainen. I will use Sami to avoid irritating Norwegian terminology police.

** I am implicitly excluding much of European Russia west of the Urals, but so be it.

November 3, 2010

The genetic heritage of Europe’s north

Filed under: Culture,Finland,Genetics,Historical Genetics,History,Sweden

If you haven’t, you should keep an eye on Dienekes’ Dodecad Ancestry Project (RSS). The pilot phase of data collection is over, and the first population level statistics are now coming out. Of particular interest to me is a new analysis of various northern European ethnicities just published.

The samples used in this analysis are:

- 25 HapMap-3 White Americans. These are the Mormons of predominantly Northwest European heritage

- 5 Dodecad Project Finns

- 25 HGDP-CEPH Russians from Vologda, in north-central European Russia

- 12 Dodecad Project continental Germanics (Scandinavians and Germans)

- 10 Behar et al. (2010) Lithuanians

- 9 Behar et al. (2010) Belorussians

- 3 Dodecad Project Northern Slavs

Below are two visualizations of the genetic structure. First, an MDS. And second, a bar plot of ancestral quanta derived from ADMIXTURE. I’ve added some clarifying labels.



Remember that the data you input into these analyses shape the nature of the outcomes to some extent. All these populations are very genetically close when scaled to average worldwide inter-population genetic variation. So what Dienekes is smoking out here are subtle differences between relatively close groups.
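To make the dependence on inputs concrete, here is a minimal sketch of classical (metric) MDS, which turns a matrix of pairwise distances into coordinates you can plot. This is my own illustration using NumPy and a made-up distance matrix; I don't know which MDS implementation or distance measure Dienekes actually used.

```python
import numpy as np


def classical_mds(distances, n_components=2):
    """Embed points in n_components dimensions from a pairwise distance matrix."""
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    # Double-center the squared distances to obtain an inner-product matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip tiny negative eigenvalues that arise from floating point error.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Hypothetical distances among four groups, for illustration only (NOT real data).
# They correspond to points sitting on a line at positions 0, 1, 4 and 5.
toy = np.array([
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 4.0],
    [4.0, 3.0, 0.0, 1.0],
    [5.0, 4.0, 1.0, 0.0],
])
coords = classical_mds(toy, n_components=2)
print(np.round(coords, 3))
```

Because the coordinates come from the leading eigenvectors of a matrix built from every pairwise distance, adding or dropping a population changes the axes themselves, not just where one dot lands.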

The first clear result supports previous research using uniparental markers: the ethnogenesis of the “Great Russians” involved both demographic expansion and cultural assimilation. The process on the southern and eastern frontiers is well documented, because it continued into the early modern period via a series of private wars of expansion. Turkic and Ugric groups were defeated by “Cossacks”, and often themselves integrated into the Cossack population as it expanded further into Siberia and the Steppe. Lenin’s paternal grandmother for example is often claimed to have been a Kalmyk, a branch of the Dzungar Mongol Confederacy which had settled in the lower Volga region. Whatever the truth, Lenin’s father clearly had an Asiatic cast to his features. The ancestral quanta estimates always seem to show that Russians, though not other Slavs further to the west, average around ~5% or so “eastern” ancestry (by analogy, this is about the amount of African ancestry in the typical Levantine Arab).

But the expansion into the Finnic north is less well documented. To some extent the process of Russification began far earlier, as even Kievan Rus at the turn of the first millennium has been claimed to have had Finnic elements (the Rus were Swedes, but they probably picked up Finns in their warbands as they swept south, in addition to the numerous indigenous Finnic groups in northeast Europe). Additionally, unlike the Muslim Turks, these Finnic groups were often small-scale societies without international connections or affiliation with any “higher civilization” which could serve as an oppositional ideology to Orthodox Russian culture. The wide geographic expanse of the Russian ethnos means that one must be exceedingly sensitive to sample representativeness. Readers of Russian or Finnish origin are often aware of which localities in northern Russia were only recently Slavicized, and so express caution in comments as to utilization of those samples as representatives of Slavs more generally.

The second peculiarity is the “Germans” who affiliate with the Finns in the MDS, and contribute to the Finnish element among the Germans. Dienekes says: “without revealing any information, I’ll just say that this is contributed primarily by 3 Dodecad Project members who deviate towards Finns and whose ADMIXTURE analysis shows a higher than expected Northeast Asian component. Their outlier status is also visible in the MDS plot.” By “Northeast Asian” he presumably means one of the 10 ancestral components he’d found in earlier analyses. Without any more information I assume there’s a high probability that these are simply Germanized part-Sami. Much of northern Scandinavia was inhabited by Sami down to the early modern period. For example, the Sami were ethnically cleansed and assimilated across the northern half of what is today Sweden as late as the 1600s and 1700s. Though I haven’t done the requisite reading, I wouldn’t be surprised if this was just a function of more advanced farming techniques as well as hardy New World crops such as potatoes which pushed the possible limits of Swedish settlement north.

Finally, there’s a clear Finnic component in the results. As Dienekes noted this Finnic component itself may be a composite of East and West Eurasian elements, just as the South Asian component in Eurasia may be a composite of “Ancient North Indians” and “Ancient South Indians.” One thing to remember about the Finnic component is there’s evidence for a fair amount of genetic variation within Finland. Representativeness is probably key here, just as it is for Russians. Ethnic Finnish individuals with ancestry along the southern and western coasts probably have more affinity with Germanic populations than Karelians.

For many decades there have been arguments as to the provenance of the Finns. Specifically, are they outsiders to Norden who arrived from the east, bringing with them their language? Or are they indigenous vis-a-vis Germanic speakers? The past is complex, so a simple model is going to shave off a lot of the detail, but I suspect that the truth is closer to the second. It seems that the Finnic groups, or at least their languages, have an ultimate origin in Central Eurasia after the last Ice Age. But they are possibly a circumpolar population which expanded north and practiced hunter-gatherer lifestyles following the ice sheets. Over time agriculturalists expanded north and squeezed them on the margin, but I believe there were natural ecological limits to the practice of techniques derived from Middle Eastern crops. Though northern Finns adopted some agricultural techniques, there was enough of a slowdown of the spread of agriculture by Indo-European speakers and their precursors that they managed to hold their own in the north. In much of European Russia, and later in pre-19th century Finland, we see plenty of evidence of language-switching from Finnic to Indo-European (in Finland nationalism resulted in a back-switch over the past 150 years). If the Malthusian pre-modern age had persisted for another two or three centuries I would not be surprised if Finnic languages were totally absorbed by Russian and Scandinavian Indo-European dialects. As it is, 19th century language based nationalism stopped the process of elite culture assimilation, and in some cases reversed it (many elite Finland Swedes abandoned Swedish language and identity in the 19th and early 20th centuries).

Addendum: The picture I present above is simple, and I don’t believe it captures a lot of what happened. For example, from my reading there was a pause of about 1,000 years in the expansion of agriculture once it reached the Kattegat between Denmark and Sweden. I suspect that these long pauses were a function of ecology and geography, as they’re often just too long to be determined by social-political inertia. Additionally, it seems unlikely to me that the first agriculturalists in Europe were Indo-European speakers. Rather, that is possibly a subsequent linguistic overlay, especially in the western regions of Europe.

July 29, 2010

Finland, still going its own way

Filed under: Finland,Genetics

Dienekes points to a new paper which highlights genetic variation in Fenno-Scandinavia (or in this case, Finland, Sweden and Denmark). A two-dimensional plot with the variation is pretty illustrative of what you’d expect:


Finns are genetic outliers in Europe, to some extent even in comparison to Estonians, who speak a very similar language. But, I wonder if the situation will change a bit when we have more samples from Finnic populations of northern Russia. Remember that the nature of these representations is sensitive to the variation which we throw into the equation in the first place.

