Razib Khan One-stop-shopping for all of my content

August 11, 2010

The dog’s world of large effect QTLs

A major issue in human genomics over the past few years has been the case of the “missing heritability”. Roughly, we know that for many traits, such as height, most of the variation in the trait within the population is controlled by variation in the genes of the population. The height of your parents is an extremely good predictor of your height in a developed nation. If you’re adopted, it is the height of your biological parents, not that of your adoptive parents, which is an extremely good predictor of your height. Though a new paper claims to have resolved some of the difficulty, one of the major issues in human height genetics has been the lack of large effect quantitative trait loci (QTLs). In plain English, genes which can each explain a lot of the variation in the trait. Rather, many have posited that continuous quantitative traits like height are controlled by variation in innumerable common genes of small effect size, or by innumerable rare genes of large effect size. The same may be an issue with personality genetics, or so is claimed by a recent paper unable to find common variants (though an eminent geneticist pointed out in the comments some problems with the paper itself).

One would assume that the same problem would crop up across the tree of life. But a geneticist once told me that he considered biology the science where all rules have exceptions. Many exceptions. A new paper in PLoS Biology paints a fundamentally different picture of the genetic architecture of many morphological traits in the domestic dog, A Simple Genetic Architecture Underlies Morphological Variation in Dogs:

Dogs offer a unique system for the study of genes controlling morphology. DNA from 915 dogs from 80 domestic breeds, as well as a set of feral dogs, was tested at over 60,000 points of variation and the dataset analyzed using novel methods to find loci regulating body size, head shape, leg length, ear position, and a host of other traits. Because each dog breed has undergone strong selection by breeders to have a particular appearance, there is a strong footprint of selection in regions of the genome that are important for controlling traits that define each breed. These analyses identified new regions of the genome, or loci, that are important in controlling body size and shape. Our results, which feature the largest number of domestic dogs studied at such a high level of genetic detail, demonstrate the power of the dog as a model for finding genes that control the body plan of mammals. Further, we show that the remarkable diversity of form in the dog, in contrast to some other species studied to date, appears to have a simple genetic basis dominated by genes of major effect.

The paper uses powerful statistical and computational techniques, but the main results are relatively straightforward (assuming you don’t get stressed out by terms such as “random effect in the linear mixed model”). First, they delved a little into the evolutionary history and the general topography of the genomics of various dog breeds, wolves, as well as stray “village dogs” (I assume these are like the “pariah dogs” of India). Though village dogs had domestic ancestors, they’ve gone feral, so they’re an interesting contrast with the new breeds created since the 19th century, as well as the wild ancestors of all dogs, wolves.

Three statistics were used to explore demographic history: linkage disequilibrium (LD), runs of homozygosity (ROH), and haplotype diversity. Inbred individuals have many ROH. The same individual may show up over and over in their relatively recent ancestry, so it makes sense that they’d have many loci where both copies of the gene are identical by descent and state. Obviously purebred dogs have high ROH. They also have low haplotype diversity. Even the average person on the street is familiar with the freakish inbreeding which goes into the production of many purebred canine lineages, and their lower life expectancy vis-à-vis the maligned “mutt.” LD decayed much more quickly in wolves than in the dogs, village and purebred. Remember that LD indicates correlations of alleles across loci. It can be caused by selection at a SNP, which rises in frequency so quickly that huge swaths of the genome adjacent to that particular SNP “hitchhike” along before recombination can break up the association to too great an extent. Admixture between very distinctive populations can also produce LD, which again will decay with time due to recombination. Another way LD can occur is through bottlenecks, which like positive selection can increase particular gene frequencies and their associated genomic regions rather rapidly, though through stochastic processes. It is the last dynamic which probably applies to all dogs: they went through a major population bottleneck during the domestication process, so the genomic pattern spans village and purebred lineages since it is an echo of their common history. Finally, haplotype diversity is simply a measure of the diversity of haplotypes across particular genomic windows. An interesting find in these results is that village dogs actually have lower ROH and higher haplotype diversity than wolves. That suggests that the wolves in this sample went through a major population bottleneck, while village dogs have maintained a larger effective population.
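To make two of these statistics concrete, here is a minimal Python sketch on toy data. It is not the paper's pipeline: the function names, the 0/1/2 genotype coding, and the `min_snps` threshold are my own assumptions for illustration, and real ROH callers work with physical distance and tolerate genotyping error.

```python
def runs_of_homozygosity(genotypes, min_snps=3):
    """Find runs of consecutive homozygous calls along one chromosome.

    genotypes: list of 0/1/2 alternate-allele counts (assumed coding);
               0 and 2 are homozygous, 1 is heterozygous.
    min_snps:  hypothetical minimum run length, counted in SNPs.
    Returns (start, end) index pairs, end exclusive.
    """
    runs = []
    start = None
    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i          # a new homozygous stretch begins
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i))
            start = None           # a heterozygous call ends the stretch
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes)))
    return runs


def ld_r2(haplotypes):
    """r^2 between two biallelic loci from phased haplotypes.

    haplotypes: list of (a, b) pairs, each allele coded 0 or 1.
    D = p_AB - p_A * p_B, and r^2 = D^2 / (p_A(1-p_A) p_B(1-p_B)).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(a * b for a, b in haplotypes) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    d = p_ab - p_a * p_b
    return d * d / denom


# Toy individual: two homozygous stretches broken up by heterozygous calls.
toy_runs = runs_of_homozygosity([0, 0, 2, 1, 0, 2, 2, 2, 2, 1, 0])
# Alleles that always travel together versus alleles that are independent.
linked = ld_r2([(1, 1), (1, 1), (0, 0), (0, 0)])
unlinked = ld_r2([(1, 1), (1, 0), (0, 1), (0, 0)])
```

In the toy individual the function reports runs at indices 0 to 3 and 4 to 9, while the perfectly co-inherited alleles give r² = 1 and the independent ones give r² = 0. Tracking how quickly r² falls toward zero as the two loci get further apart is what "LD decay" refers to.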

A general finding from the aforementioned examination is that different breeds tended to be genetically rather distinct. This follows naturally from the origin of modern purebreds as tight and distinct inbred lineages. This genome-wide distinctiveness though is a perfect background condition to test for similarities within the genome which correlate with specific morphological similarities across the breeds. And they did find quite a bit:

We searched for the strongest signals of allelic sharing by scanning for extreme values of Wright’s population differentiation statistic FST…across the breeds. The 11 most extreme FST regions of the dog genome contained SNPs with FST≥0.57 and minor allele frequency (MAF)≥0.15 (Table 1). Six of these regions are strongly linked to genetic variants known to affect canine morphology: the 167 bp insertion in RSPO2 associated with the fur growth and texture…an IGF1 haplotype associated with reduced body size…an inserted retrogene (fgf4) associated with short-leggedness…and three genes known to affect coat color in dogs (ASIP, MC1R, and MITF)…Two other high FST regions correspond to CFA10.11465975 and CFA1.97045173, which were associated with body weight and snout proportions, respectively, in previous association studies….Two known coat phenotypes (fur length and fur curl…) also exhibited extreme FST values. Only a limited number of high FST regions were not associated with a known morphological trait (Figure 2, black labels). Here, we focus on illuminating the potential targets of selection for these regions as well as identifying genomic regions that associate with skeletal and skull morphology differences among breeds.
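For readers who want a feel for what an FST scan is doing, here is a rough Python sketch for a single biallelic SNP. It uses the textbook heterozygosity form, FST = (H_T - H_S) / H_T, with every breed weighted equally; that weighting, the function names, and the toy frequencies are my assumptions, and the paper's actual estimator may differ. Only the two thresholds (FST≥0.57, MAF≥0.15) come from the quoted passage.

```python
def wright_fst(allele_freqs):
    """F_ST for one biallelic SNP from per-population allele frequencies.

    allele_freqs: frequency of the same allele in each population.
    Populations are weighted equally (a simplifying assumption).
    """
    n = len(allele_freqs)
    p_bar = sum(allele_freqs) / n
    # Expected heterozygosity if all populations were pooled into one.
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        raise ValueError("SNP is monomorphic across all populations")
    # Average expected heterozygosity within each population.
    h_within = sum(2 * p * (1 - p) for p in allele_freqs) / n
    return (h_total - h_within) / h_total


def is_extreme(allele_freqs, fst_min=0.57, maf_min=0.15):
    """Apply the two thresholds quoted above to a single SNP."""
    p_bar = sum(allele_freqs) / len(allele_freqs)
    maf = min(p_bar, 1 - p_bar)  # minor allele frequency in the pooled sample
    return maf >= maf_min and wright_fst(allele_freqs) >= fst_min


# Toy SNP: nearly fixed for opposite alleles in two hypothetical breeds.
fst_divergent = wright_fst([0.1, 0.9])   # about 0.64
# Toy SNP: identical frequencies in both breeds, so no differentiation.
fst_shared = wright_fst([0.5, 0.5])
```

The MAF filter matters because a SNP that is rare everywhere carries little information about differentiation: `is_extreme([0.0, 0.2])` is rejected on the frequency threshold regardless of its FST.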

Many of these genes are probably familiar to you because they have the same functional significance in humans. The key difference is effect size. Since the paper is open access I’ll spare you the alphabet soup of genes and their association with canine morphological traits. Many of them pop up when you compare differences (and similarities) in morphology between breeds with their allele frequencies. The top-line result is that a trait can be predicted from just a few genes. They constructed a regression model where a set of independent variables, the genes, predicts the value of a given dependent variable, the trait:

Using forward stepwise regression, we combined potential signals into a multi-SNP predictive model for each trait. In the models of body weight, ear type, and the majority of measured traits, most of the variance across breeds could typically be accounted for with three or fewer loci…Correlated traits (e.g., femur length and humerus length) yielded similar SNP associations. For the 55 traits, the mean proportion of variance explained by the top 1-, 2-, and 3-SNP models was R2 = 0.52, 0.63, and 0.67, respectively….After controlling for body size, mean proportion of variance explained by these models was still appreciable—R2 = 0.21, 0.32, and 0.4, respectively.

R2 here is R², the proportion of variance in the dependent variable explained by variance in the independent variables. The values for these models are very high. By contrast, a gene for height in humans is a find if it can explain 2% of the variance in the trait.
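If forward stepwise regression is unfamiliar, the idea can be sketched in a few lines of plain Python: start with no SNPs, and at each step add whichever SNP raises R² the most. Everything below is illustrative rather than the authors' method. The SNP names, genotypes, and weights are invented toy data, the weights are deliberately constructed as 10 + 5·snp_a + 1·snp_b so that one locus dominates, and the stopping rule is a simple cap on the number of SNPs.

```python
def ols_r2(columns, y):
    """R^2 of an ordinary least squares fit of y on the columns plus an intercept.

    Solves the normal equations (X'X)b = X'y by Gauss-Jordan elimination.
    Assumes y is not constant and the predictors are not collinear.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        if abs(A[pivot][c]) < 1e-12:
            raise ValueError("collinear predictors")
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((y[r] - sum(b * x for b, x in zip(beta, X[r]))) ** 2
                 for r in range(n))
    return 1 - ss_res / ss_tot


def forward_stepwise(snps, y, max_snps=3):
    """Greedily add the SNP that most improves R^2; return [(name, R^2), ...]."""
    chosen, path = [], []
    remaining = sorted(snps)
    while remaining and len(chosen) < max_snps:
        best = max(remaining,
                   key=lambda name: ols_r2([snps[s] for s in chosen + [name]], y))
        chosen.append(best)
        remaining.remove(best)
        path.append((best, ols_r2([snps[s] for s in chosen], y)))
    return path


# Invented toy data: genotypes (0/1/2) for eight dogs at three SNPs.
toy_snps = {
    "snp_a": [0, 0, 1, 1, 2, 2, 0, 2],
    "snp_b": [0, 1, 0, 1, 0, 1, 1, 0],
    "snp_c": [1, 0, 0, 2, 1, 0, 2, 1],
}
# Weight built as 10 + 5*snp_a + 1*snp_b, so snp_a is the "large effect" locus.
toy_weight = [10, 11, 15, 16, 20, 21, 11, 20]
path = forward_stepwise(toy_snps, toy_weight, max_snps=2)
```

On this toy data the first SNP chosen is `snp_a`, which alone gives R² of roughly 0.99, and adding `snp_b` takes the fit to essentially 1. That is the dog-like situation, with one locus doing nearly all of the work. In the human height case each added SNP would nudge R² up by a percent or two at best.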

The analysis above found SNPs which could explain variation across breeds, which are inbred and highly distinctive in both genes and traits. Could the same SNPs explain variance among dogs outside of the breeds, such as village dogs? Yes:

Most of the variance in body size was explained by the IGF1 locus where we observe a single marker with R2 = 50% and R2 = 17% of variance in breed and village dogs, respectively. The top 3-SNPs explain R2 = 38% of the variance in body weight in village dogs, although the 6-SNP model explains less. The lower R2 in non-breed dogs than breed dogs may be a consequence of lower LD observed in village dogs reducing the strength of association between these markers and the causal body size variants. Alternatively, the lower R2 may also be a consequence of non-genetic factors such as diet or measurement error affecting the observed village dog weights, the smaller range of body sizes observed in the non-breed dog sample, or perhaps to overfitting of the model based on the particular breeds included in the dataset. Nevertheless, R2 = 38% is significantly better than association scans for morphometric traits in humans utilizing denser marker arrays….

Dogs and humans have a long history together. But some of these dogs have a very short history. As noted in the discussion, many purebred canine lineages are products of Victorian-era breeding crazes, and were selected for strange characteristics which were transmitted in a discrete fashion. The recency of the lineages, combined with the peculiarities of the breeding programs of this era and of dog fanciers generally, may explain some of the genetic architecture of canines. The authors note that domestic animals subject to more gradual selection may not, and do not, exhibit the same tendency. Perhaps humans are more like goats or wheat, and less like dogs? The authors also note the contrast in the loci which exhibit population-wide variation:

In humans, high-FST regions are associated with hair and pigmentation phenotypes, disease resistance, and metabolic adaptations…In contrast, the strongest signals of diversifying selection in dogs are all associated with either body size/shape or hair/pigmentation traits, and therefore are unlikely to have been under selection for disease resistance, metabolic adaptations, or behavior. In total, the 11 highest FST regions identified across purebred dogs are all associated with body size/shape or hair phenotypes, including three genomic regions that had not been detected in previous association studies.

The rationale for this study is the utility of dogs as model organisms for humans. They’re taxonomically rather close to us, so their genetics may give us insight into human conditions. My main worry, though, is that the best models here are inbred dogs, where the markers adduced are most valid, yet they may also be the least promising models because they have all sorts of genetic peculiarities. But all practicality aside, a fascinating paper.

Image Credit: Jon Radoff and Angela Bull in 2002

Citation: Boyko AR, Quignon P, Li L, Schoenebeck JJ, Degenhardt JD, et al. (2010). A Simple Genetic Architecture Underlies Morphological Variation in Dogs. PLoS Biology. doi:10.1371/journal.pbio.1000451
