Razib Khan One-stop-shopping for all of my content

May 27, 2010

The genes in Spain fall rather evenly

Filed under: Genetics,Genomics,Spain — Razib Khan @ 3:23 am

A new paper is out which drills down a bit on the genetic substructure in Spain. Genetic Structure of the Spanish Population:

Genetic admixture is a common caveat for genetic association analysis. Therefore, it is important to characterize the genetic structure of the population under study to control for this kind of potential bias.

In this study we have sampled over 800 unrelated individuals from the population of Spain, and have genotyped them with a genome-wide coverage. We have carried out linkage disequilibrium, haplotype, population structure and copy-number variation (CNV) analyses, and have compared these estimates of the Spanish population with existing data from similar efforts.

In general, the Spanish population is similar to the Western and Northern Europeans, but has a more diverse haplotypic structure. Moreover, the Spanish population is also largely homogeneous within itself, although patterns of micro-structure may be able to predict locations of origin from distant regions. Finally, we also present the first characterization of a CNV map of the Spanish population. These results and original data are made available to the scientific community.

They used a 160 K SNP-chip for this, though for the PC charts below they were constrained to ~100,000 SNPs. Nothing too revolutionary in the paper. The fact that Spaniards have more haplotype diversity vis-a-vis the “CEU” sample in the HapMap, which consists of Utah Mormons, isn’t too surprising, since those individuals are Northern European and Northern Europeans tend to be a touch less diverse than Southern Europeans (heterozygosity is higher in Southern Europe than in the North). A common explanation for this is that Northern European populations emerged as subsets of Southern populations which expanded out of Ice Age “refugia” within the last ~10,000 years or so, and this migratory process would have induced some bottlenecks and so reduced their diversity. The findings in this paper are broadly consistent with the idea that Spain was a refugium, and so one of the sources of the population of Northern Europe. But note that there are lots of controversies about recent European demographic history right now, so I wouldn’t take the aforementioned model as a given. Also, one major issue which sticks out is the lack of Basque populations in the sample, since that’s a group which has long been of interest, and some aspects of many demographic scenarios hinge on their nature. No surprise that Visigoths, Berbers and Arabs didn’t perturb these results too much. I believe that these groups did arrive in Iberia in large numbers, but on a relative scale their proportions were small and they probably didn’t alter Spain’s basic genetic character.
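To make the heterozygosity comparison concrete, here is a minimal sketch of how expected heterozygosity is commonly computed for biallelic SNPs, as the average of 2p(1-p) across markers. This is not the paper's pipeline. The function names and the tiny `south` and `north` genotype tables are made-up toy values, chosen only to show that allele frequencies pushed toward 0 or 1 (as a bottleneck would tend to do) lower the statistic.

```python
def allele_frequency(genotypes):
    """Frequency of the counted allele at one SNP.

    `genotypes` holds one 0/1/2 code per individual (copies of the allele).
    """
    if not genotypes:
        raise ValueError("need at least one individual")
    return sum(genotypes) / (2 * len(genotypes))


def expected_heterozygosity(snp_table):
    """Mean of 2p(1-p) over SNPs; rows are SNPs, columns are individuals."""
    if not snp_table:
        raise ValueError("need at least one SNP")
    freqs = [allele_frequency(row) for row in snp_table]
    return sum(2 * p * (1 - p) for p in freqs) / len(freqs)


# Toy data (hypothetical, not from the paper): 3 SNPs x 4 individuals each.
south = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 1, 1, 0]]  # intermediate frequencies
north = [[0, 0, 1, 0], [2, 2, 1, 2], [1, 0, 0, 0]]  # frequencies near 0 or 1

h_south = expected_heterozygosity(south)
h_north = expected_heterozygosity(north)
print(h_south, h_north)
```

In the toy tables the "north" markers sit close to fixation, so their average 2p(1-p) comes out lower. That is the pattern the refugium-and-bottleneck story would predict, though real data need far more markers and samples before anything can be read into the difference.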

Below are some charts of note.

First, dimensions of genetic variation in the Spanish population using 100,000 SNPs by locality where the sample was collected.


If some of the localities are as obscure to you as they were to me, here is the PC chart with a subset of them rotated and superimposed upon a map of Spain.


Finally, here are the Spanish samples plotted in relation to two HapMap populations, the CEU (American whites of Northern European ancestry) and TSI (Tuscans from Italy).


There are a few outliers here in relation to their putative population cluster, but in general the three groups are nicely separated as expected. Spain is bounded by water and a rather imposing mountain range. These serve as natural barriers to gene flow. But the interior of the peninsula is dominated by a high plateau. I really don’t have an intuitive understanding of whether the spatial distribution of Spain’s people (which for ecological reasons probably can be extrapolated back to antiquity) should homogenize it through a circular pattern of gene flow, but perhaps that’s what these data are showing. I am a bit wary of saying that Spain is internally homogeneous without referencing other European populations of the same scale in detail. Perhaps what this group found is what you’d find on this scale with this chip: not much.
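For readers who haven't seen how PC charts like the ones discussed here come about, the sketch below shows the basic idea on toy data: center each SNP column, then find the leading axis of variation among individuals. It uses plain power iteration on the individual-by-individual Gram matrix, which is my own simplification and not necessarily what the authors ran. The genotype rows for `group_a` and `group_b` are invented for illustration.

```python
import math


def center_columns(matrix):
    """Subtract each SNP's mean genotype so columns have mean zero."""
    n, m = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in matrix]


def first_pc_scores(genotypes, iterations=200):
    """Unit-length PC1 scores for individuals (rows), up to sign and scale."""
    x = center_columns(genotypes)
    n = len(x)
    # Gram matrix: similarity between every pair of individuals.
    gram = [[sum(a * b for a, b in zip(x[i], x[k])) for k in range(n)]
            for i in range(n)]
    v = [float(i + 1) for i in range(n)]  # deterministic starting vector
    for _ in range(iterations):
        w = [sum(gram[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0:
            raise ValueError("no variation in the genotype matrix")
        v = [c / norm for c in w]
    return v


# Toy genotypes (hypothetical): rows are individuals, columns are SNPs (0/1/2).
group_a = [[0, 0, 1, 2, 2], [0, 1, 0, 2, 2], [0, 0, 0, 2, 1]]
group_b = [[2, 2, 1, 0, 0], [2, 1, 2, 0, 0], [2, 2, 2, 0, 1]]

scores = first_pc_scores(group_a + group_b)
for label, s in zip("AAABBB", scores):
    print(label, round(s, 3))
```

With groups this exaggerated, the first component splits A from B by sign. Real populations as close as Spaniards, Tuscans and the CEU need many thousands of SNPs for their much subtler frequency differences to add up to visible clusters.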

Cite: BMC Genomics 2010, 11:326. doi:10.1186/1471-2164-11-326

H/T Dienekes
