Razib Khan One-stop-shopping for all of my content

November 24, 2017

Soft selection for gentleness in Puerto Rican African Honeybees

Filed under: Population genetics,Population genomics,Soft Selection,Soft Sweep — Razib Khan @ 3:07 pm


When I was a kid “killer bees” were a major pop culture thing. There were movies about the bees, and we would get updates about their march northward in the news. They were a cautionary tale of our species’ hubris.

Today we have a little bit more perspective. These bees were actually just African honeybees, the ancestral population of the European honeybees that were introduced to the New World with Europeans centuries before the African honeybees arrived. African honeybees were not that different from European honeybees, but they were more aggressive and tended to outcompete European honeybee colonies. They are a major problem for the beekeeping industry, but not a major threat to human life.

Today the African and European populations in the United States seem to have stabilized in their ranges, with a hybrid zone between them. African bees’ migratory behavior makes them less competitive with European bees in colder climates.

A friend of mine once mentioned to me that if he had to do it all over again he would do research on the evolutionary genomics of Hymenoptera, and in particular bees. People care about bees. So it’s no surprise that I noticed this paper out in Nature Communications, A soft selective sweep during rapid evolution of gentle behavior in an Africanized honeybee:

Highly aggressive Africanized honeybees (AHB) invaded Puerto Rico (PR) in 1994, displacing gentle European honeybees (EHB) in many locations. Gentle AHB (gAHB), unknown anywhere else in the world, subsequently evolved on the island within a few generations. Here we sequence whole genomes from gAHB and EHB populations, as well as a North American AHB population, a likely source of the founder AHB on PR. We show that gAHB retains high levels of genetic diversity after evolution of gentle behaviour, despite selection on standing variation. We observe multiple genomic loci with significant signatures of selection. Rapid evolution during colonization of novel habitats can generate major changes to characteristics such as morphological or colouration traits, usually controlled by one or more major genetic loci. Here we describe a soft selective sweep, acting at multiple loci across the genome, that occurred during, and may have mediated, the rapid evolution of a behavioural trait.

Come for the bees, but stay for the soft selection! If you talk to anyone in evolutionary and population genomics you know that the future is in understanding patterns of soft selection and polygenic selection from standing variation. Though these are related phenomena which are associated with each other, they are all distinct.

Standing variation just refers to the diversity which is segregating in the population at any given time. At any given moment many loci exhibit polymorphism. This polymorphism can be a target of natural selection if it is correlated with heritable variation and differentials in fitness. Though soft selection can be quite woolly, its inverse, hard selection, is clear: in genetic terms hard selection can be seen in allele frequency changes at a single variant in a locus, going from the point where it is a novel mutation to nearly fixed in the population. In Haldane’s original conception hard selection involved excess deaths, and imposed a limit on the rate of evolution as well as the amount of variation you could expect within a given population. This model was convenient in the pre-genomic and early genomic era because empirical selection tests had to focus on large allele frequency changes around singular loci. Researchers didn’t have large numbers of whole-genome samples available (nor the computational ability to analyze them).

Today this is not a limitation. In the analysis above the authors had 30 individuals of the 3 populations sequenced at high quality (20x). They ended up with millions of genetic variants they could analyze.

The plot to the left shows that “gentle African honeybees” (gAHB) tend to be closer to the African honeybee populations (AHB) overall (though with some hybridization with European honeybees, EHB). This is not surprising.

But the key observation was that over 12 generations the African honeybees of Puerto Rico became progressively less aggressive, despite maintaining overall morphological similarities to the mainland Mexican African bees from which they likely derive. Though buried in the discussion, there is a rationale for why this behavioral change may have occurred: the Puerto Rican bees are subject to a lot of negative selection against aggression because of the density of the island, as well as the reality that aside from humans there aren’t many other species where their aggressive tendencies are beneficial. Basically, if you are an aggressive colony, it’s harder to make a go in densely settled areas (the implication here then is that there are probably “gentle” African honeybee populations across Latin America, they just are never disaggregated from the broader meta-population).

Credit: Phillip Messer and Nandita Garud

It’s the genomics where the real evolutionary insight comes in: they found that there were multiple soft sweep events around genetic regions implicated in behavior. In their overall genome the gAHB of Puerto Rico resembled mainland AHB, but in this subset of genetic loci they resembled EHB. Many of these loci had also been known to be targets of selection when the original European bee population diverged from the ancestral African population. Basically this is a genomic illustration of convergent evolution.

Regular readers of this blog will recognize the ways they detected selection. They used a modified form of EHH, which is reasonable since the selection event was recent enough to have been associated with distinct haplotype blocks. Also, standard Fst analysis showed that these were outliers in relation to the broader genetic pattern of relatedness (these loci were more like EHB than AHB, while most loci were more like AHB than EHB).
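
To make the Fst outlier logic concrete, here is a minimal sketch of Wright's Fst at a single biallelic site for two equally weighted populations. The allele frequencies are hypothetical numbers I made up for illustration, not values from the paper, and real analyses use estimators that correct for sample size.

```python
def heterozygosity(p):
    """Expected heterozygosity at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(p1, p2):
    """Wright's Fst for two equally weighted populations at one biallelic site."""
    p_bar = (p1 + p2) / 2.0
    h_total = heterozygosity(p_bar)  # heterozygosity of the pooled population
    h_within = (heterozygosity(p1) + heterozygosity(p2)) / 2.0
    if h_total == 0.0:
        return 0.0  # the site is monomorphic overall
    return (h_total - h_within) / h_total


# Hypothetical frequencies in two populations (assumptions, not data)
background = fst(0.50, 0.55)  # a typical site: populations look alike
outlier = fst(0.10, 0.90)     # a candidate selected site: strongly differentiated
```

A site like `outlier` stands far above the genome-wide background, which is the pattern an Fst scan is looking for.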

So this is a form of polygenic selection. Remember, natural selection only knows genes through the phenotype (with intra-genomic selection being an exception). A behavior like aggression is probably subject to the fourth law of behavior genetics. That is, variation won’t be defined around a single genetic locus. Rather, variation across the genome will be correlated with variation in the phenotype. As selection favors a particular value of the phenotype across the distribution the allele frequencies across many genetic loci will shift, but they will not necessarily fix. Polygenic selection operates on the dispersed standing genetic variation which explains much of the variation of the phenotype in question. Instead of total sweeps to fixation due to large fitness differences between a given allele and its alternative form, the selection impact is distributed and diffused across the genome.
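
A toy additive model shows how modest frequency shifts at many loci can add up to a large phenotypic change without any locus fixing. This is only a sketch: the number of loci, the equal effect sizes, and the starting and ending frequencies are all assumptions chosen for illustration, and the loci are treated as independent.

```python
import math


def mean_and_sd(freqs, effect):
    """Mean and standard deviation of an additive trait.

    Each locus adds `effect` per copy of the favored allele, and loci are
    assumed independent (no linkage disequilibrium, no dominance).
    """
    mean = sum(2.0 * p * effect for p in freqs)
    variance = sum(2.0 * p * (1.0 - p) * effect ** 2 for p in freqs)
    return mean, math.sqrt(variance)


n_loci = 100   # hypothetical number of loci affecting the trait
effect = 1.0   # hypothetical per-allele effect, in arbitrary units

before_mean, before_sd = mean_and_sd([0.5] * n_loci, effect)
after_mean, _ = mean_and_sd([0.6] * n_loci, effect)

# Shift of the trait mean, measured in units of the starting standard deviation
shift_in_sd = (after_mean - before_mean) / before_sd
```

Under these assumptions every locus moves only from 50% to 60%, yet the trait mean moves by nearly three standard deviations.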

Though most of the genetic variants seem to recapitulate the evolution of the less aggressive phenotype that occurred with the original migration north of African honeybees, some of the selection signatures were novel. This points to the reality that when you have soft selection on standing variation you may have similar phenotypes which evolve via different means. Additionally, the authors noted that these results were in contrast to controlled breeding experiments in mammals where selection for gentleness (“domestication”) often targeted a few loci and exhibited strong pleiotropic effects (due to the genetic correlation). These results point to the limitations of inferences made from human-directed selection.

Soft selection is probably ubiquitous. Consider the evolution of skin color in humans. There are lots of variants and lots of variation, and most of the variation seems to be ancestral. Only at the locus SLC24A5 do you have a perfect illustration of a hard selective sweep, probably from a de novo mutation that emerged around the Last Glacial Maximum.

From a geneticist’s perspective evolution is basically conceived of as changes in allele frequencies over time. Much of this is due to natural selection. Now that the world of soft selection is opening up, I suspect that we’ll understand a lot more of what we see around us, at least in the generality.

Citation: A soft selective sweep during rapid evolution of gentle behaviour in an Africanized honeybee.

September 16, 2010

A fly’s life: adventures in experimental evolution

Natural selection happens. It was hypothesized in copious detail by Charles Darwin, and has been confirmed in the laboratory, through observation, and also by inference via the methods of modern genomics. But science is more than broad brushes. We need to drill down to a more fine-grained level to understand the dynamics with precision and detail, and so generate novel inferences which may then be tested. For example, there are various flavors of natural selection: stabilizing selection, negative selection, and positive directional selection. In the first case natural selection buffets the phenotype about an ideal mean, in the second case deleterious phenotypes and their associated alleles are purged from the genome, and finally, natural selection can also drive a novel trait toward greater prominence, and concomitantly the allelic variants which are associated with the fitter phenotype.

The last case is of particular interest to many because it is often through positive natural selection that evolution as descent with modification occurs. Over time trait values and the nature of traits themselves shift such that a lineage changes its character beyond recognition. This phyletic gradualism and the scale independence of evolutionary process have been challenged, in particular from the domain of developmental biology (albeit not by all, or even most, developmental biologists). But ultimately no one doubts that a classical understanding of evolution as change in allele frequency, often driven by natural selection, is part of the larger puzzle of how the tree of life came to be.

One of the phenomena associated with positive directional evolution is the selective sweep. How a selective sweep occurs, and its consequences, are rather straightforward. A genome consists of a sequence of base pairs (e.g., we have 3 billion base pairs). If a new mutation emerges at a particular base pair, a novel single nucleotide polymorphism (SNP), and that allelic variant is ~10% fitter than the ancestral variant, natural selection could drive up its frequency (the conditionality is due to the fact that in all likelihood it would still go extinct because of the power of stochastic forces when a mutant is at low frequency). So the variant could in theory shift from ~0% (1 out of N, N being the number of individuals in a population, 2N if diploid, and so forth) to ~100%. This would be the fixation of the novel variant, driven by selective dynamics. So what’s the sweep aspect? The sweep in this case refers to the effect of the very rapid rise in frequency of the SNP in question on the adjacent genomic region. What is termed a genetic hitchhiking dynamic results if the sweep occurs rapidly, so that nearby regions of the genome also move to fixation along with the favored SNP. But in a diploid organism with sexual reproduction genetic recombination persistently breaks apart associations across the physical genome. Therefore the span of the sequence of genetic markers near a favored SNP which form a haplotype is dependent on the rate of recombination as well as the rate of the rise in frequency of the allele, which is contingent on the strength of selection. A powerful selective sweep has the effect of homogenizing wide regions of the genome flanking the favored mutant; in other words the sweep “cleans” the gene pool of variation as one very long haplotype replaces many shorter haplotypes.
As an example, in the genomes of Northern Europeans the locus LCT is characterized by a very long haplotype, which itself seems to correlate well with the trait of lactase persistence. The implication here is that the lactase persistence conferring variant arose relatively recently, and was swept up to near fixation by positive directional natural selection.
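
How likely is a new, strongly favored mutant to escape stochastic loss? A rough sketch using Kimura's diffusion approximation for the fixation probability is below. It assumes an idealized Wright-Fisher population whose effective size equals its census size, additive fitness effects, and a mutant starting as a single copy; `s` and `n` are illustrative values, not estimates from any study.

```python
import math


def fixation_probability(s, n):
    """Kimura's diffusion approximation for a single new mutation.

    s: selective advantage of the heterozygote (additive, so 2s in homozygotes)
    n: number of diploid individuals; the mutant starts at frequency 1/(2n)
    Assumes effective population size equals n.
    """
    if s == 0:
        return 1.0 / (2 * n)  # neutral case: the starting frequency
    return (1.0 - math.exp(-2.0 * s)) / (1.0 - math.exp(-4.0 * n * s))


# A mutant with a 10% advantage in a hypothetical population of 10,000 diploids
p_fix = fixation_probability(0.10, 10_000)
p_lost = 1.0 - p_fix
```

Under these assumptions the mutant fixes a bit under one time in five, so even a 10% advantage is lost more often than not, which is the conditionality mentioned above.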

That’s the broad theory. But as you know, evolution and its subcomponents are more than “just a theory,” they’re a set of models which are amenable to testing, whether through observation, or via controlled laboratory experiments. A new letter to Nature elaborates how exactly selective sweeps play out in Drosophila melanogaster, a classic “model organism.” Interestingly, this is a case of experimental evolution, something we are more familiar with from Richard Lenski’s E. coli work. Genome-wide analysis of a long-term evolution experiment with Drosophila:

Experimental evolution systems allow the genomic study of adaptation, and so far this has been done primarily in asexual systems with small genomes, such as bacteria and yeast…Here we present whole-genome resequencing data from Drosophila melanogaster populations that have experienced over 600 generations of laboratory selection for accelerated development. Flies in these selected populations develop from egg to adult ~20% faster than flies of ancestral control populations, and have evolved a number of other correlated phenotypes. On the basis of 688,520 intermediate-frequency, high-quality single nucleotide polymorphisms, we identify several dozen genomic regions that show strong allele frequency differentiation between a pooled sample of five replicate populations selected for accelerated development and pooled controls. On the basis of resequencing data from a single replicate population with accelerated development, as well as single nucleotide polymorphism data from individual flies from each replicate population, we infer little allele frequency differentiation between replicate populations within a selection treatment. Signatures of selection are qualitatively different than what has been observed in asexual species; in our sexual populations, adaptation is not associated with ‘classic’ sweeps whereby newly arising, unconditionally advantageous mutations become fixed. More parsimonious explanations include ‘incomplete’ sweep models, in which mutations have not had enough time to fix, and ‘soft’ sweep models, in which selection acts on pre-existing, common genetic variants. We conclude that, at least for life history characters such as development time, unconditionally advantageous alleles rarely arise, are associated with small net fitness gains or cannot fix because selection coefficients change over time

Critical to understanding what’s going on here is the distinction they make between ‘classic’ ‘hard sweeps’ and ‘soft sweeps.’ Hard sweeps follow the spare description I outlined above:

1) A new mutant arises in the genetic background

2) Selection favors the mutant

3) The mutant rises in frequency and sweeps to fixation, 0% → 100%, replacing the ancestral variants

In contrast, for a soft sweep:

1) Selection favors a set of minor polymorphisms already segregating in the gene pool

2) These polymorphisms rise in frequency

3) But they may not sweep to fixation
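
The two scenarios above can be contrasted with a small deterministic sketch. It ignores drift, recombination, and dominance, and uses the simplest haploid-style model in which the odds of the favored allele are multiplied by (1 + s) each generation. The selection coefficient and starting frequencies are hypothetical.

```python
def frequency_after(p0, s, generations):
    """Deterministic frequency of a favored allele after some generations.

    Haploid-style selection with constant advantage s and no drift:
    the odds p / (1 - p) are multiplied by (1 + s) every generation.
    """
    odds = p0 / (1.0 - p0) * (1.0 + s) ** generations
    return odds / (1.0 + odds)


s = 0.01            # hypothetical weak selection coefficient
generations = 600

# Hard-sweep start: one new copy among 2,000 chromosomes
new_mutant = frequency_after(1 / 2000, s, generations)

# Soft-sweep start: a variant already segregating at 10%
standing = frequency_after(0.10, s, generations)
```

With these made-up numbers the new mutant is still well short of a majority after 600 generations, while the standing variant is close to, but not at, fixation. Weak selection and a head start in frequency matter a great deal.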

In the first case the signature of natural selection will be clear, distinct, and indubitable. A novel haplotype which has replaced the ancestral variants and produced a wide region of genetic homogeneity as all other allele states are expunged by the sweep will have resulted. That isn’t what they saw at the genomic level.

But first, what did they do? The flies used in this experiment derive from a 30-year-old lineage, and they selected them for 600 generations in the case of the treatments which were being driven to new phenotype values. 600 generations for humans would be about 15,000 years assuming 25 years per generation. If a trait is heritable, and you select offspring deviated away from the mean, over time you will see a shift in the trait value. This is classic quantitative genetics, and that’s what they saw. They had five lineages which exhibited accelerated development (ACO), and five which were controls which exhibited the ancestral phenotypes (CO). “Eclosion” refers to the fly’s emergence from the pupa. The lineages which were subject to selection had very different life histories from the control groups. The cluster of traits here shouldn’t be too surprising, we know from other taxa that short-lived fast-developing species tend to be smaller and metabolically more under-the-gun than the inverse.
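
The classic quantitative genetics referred to here is usually summarized by the breeder's equation, R = h²S. A minimal sketch follows; the heritability and selection differential are invented for illustration and are not taken from the paper.

```python
def response_to_selection(heritability, selection_differential):
    """Breeder's equation: per-generation change in the trait mean, R = h^2 * S."""
    return heritability * selection_differential


# Hypothetical values (assumptions, not from the study)
h2 = 0.3    # narrow-sense heritability of development time
S = -2.0    # selected parents develop 2 hours faster than the population mean

per_generation = response_to_selection(h2, S)

# The generation-time conversion quoted in the text
human_years = 600 * 25
```

In practice the response does not stay constant over hundreds of generations, since heritability and the selection differential both change as the population evolves.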

But the real interesting aspects of this study are not the phenotypes. Who hasn’t seen weird things among the Drosophila? That’s one of the reasons they were chosen as model organisms in the first place! Rather, they explored the patterns of genomic variation within and across the lineages, and integrated the results into a broader theoretical framework of how evolutionary processes occur, and their implications for the genome-wide structure one should see. Below I’ve stitched together figure 2 & 3, which illustrate particular patterns of genomic variation.

[Figure: figures 2 and 3 from the paper, stitched together]

The left figure shows differences in allele frequencies between the ACO and CO pooled lineages. The spikes indicate large differences, with the dotted line representing the threshold where there’s a 0.1% random chance of such a between-population frequency difference. The vertical axis is log-scaled. The grey line at the bottom indicates the differences between one particular ACO lineage and the pooled ACO sample. In the right panel you see heterozygosities, with blue denoting the CO lineages, and red the selected ACO lineages which have shortened life histories. The grey again is a particular ACO lineage. Each vertical panel corresponds to a chromosomal arm of the Drosophila melanogaster genome.

First, note the widespread distribution of allele frequency differences between ACO and CO. Additionally, there’s little difference between the specific ACO lineage and the pooled sample. Despite their independent histories they seem to exhibit the same allelic configuration. Second, note that the heterozygosities in the case of the ACO pooled sample are lower than in the CO ancestral phenotype lineages. Why? Remember that selective sweeps should expunge genomic variation. But the sweeps do not seem to have gone to fixation, otherwise we’d see many more inverted peaks converging to heterozygosity of ~0, as the selected variant replaces all others in the population.
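
The heterozygosity argument is easy to check at a single biallelic site, where expected heterozygosity is 2p(1 − p). The frequencies below are hypothetical and only illustrate the contrast between a partial and a complete sweep.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


control = expected_heterozygosity(0.5)   # variant at intermediate frequency
partial = expected_heterozygosity(0.8)   # partial sweep: reduced, not zero
complete = expected_heterozygosity(1.0)  # complete sweep: no variation left
```

A partial sweep lowers heterozygosity (here from 0.5 to about 0.32) without driving it to zero, which matches the shallow dips described above rather than the inverted peaks a completed sweep would leave.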

What’s going on in the regions which exhibit differences between the controls and selected lineages? They looked at the ~650 non-synonymous SNPs on ~500 genes which were most differentiated between ACO and CO (L10FET score > 4) and found the following categories of genes enriched: imaginal disc development, smoothened signalling pathway, larval development, wing disc development, larval development (sensu Amphibia), metamorphosis, organ morphogenesis, imaginal disc morphogenesis, organ development and regionalization. Life history is complex. Combine the wide class of genes with the dispersed genomic impact of selection as evident in figures 2 and 3, and you get a good sense of the sort of consequences on the substrate level which quantitative genetic evolutionary dynamics have. Also of interest, they found that the X chromosome seemed enriched for signatures of selection and evolution. Why? They note that this chromosome would be more subject to selection for recessive or partially recessive expressing SNPs.
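
For readers wondering what a score like this looks like in practice, here is a sketch that assumes L10FET means −log10 of the p-value from a two-sided Fisher's exact test on allele counts in the two pools. That reading is my assumption, the counts are invented, and the pure-Python test below is a simple stand-in for whatever implementation the authors used.

```python
import math


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = math.comb(row1 + row2, col1)

    def table_prob(x):
        # Hypergeometric probability that the top-left cell equals x
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    p_obs = table_prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum over every table that is no more probable than the observed one
    p = sum(table_prob(x) for x in range(lo, hi + 1)
            if table_prob(x) <= p_obs * (1.0 + 1e-9))
    return min(p, 1.0)


def l10fet(a, b, c, d):
    """-log10 of the Fisher's exact test p-value (assumed meaning of L10FET)."""
    return -math.log10(fisher_exact_two_sided(a, b, c, d))


# Hypothetical counts: allele A vs allele B in the ACO pool, then in the CO pool
score = l10fet(10, 0, 0, 10)   # completely differentiated site
no_signal = l10fet(5, 5, 5, 5) # identical frequencies in both pools
```

On this scale a threshold of 4 corresponds to a p-value of 1 in 10,000 at a single site.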

Clearly this study did not find the clean hard sweeps which theory may have predicted. Rather, the researchers found a lot of partially completed sweeps distributed all across the genome. Sound familiar? Before we move on to broader considerations, here are their explanations:

- The sweeps are hard, but haven’t reached fixation. So the selection coefficients have to be rather small for them to still be in transit

- Selection is operating on “standing variation.” That is, the genetic variation extant naturally within a given population, and which may be operated upon by natural selection to change the population trait value mean through classical breeding techniques

- And finally, selection coefficients (the greater fitness of positively selected variants against the population mean) may not be static parameters, but change over time as a function of allele frequency. This shouldn’t be that surprising. Frequency dependence and epistasis can impact on linear assumptions within a statistical genetic model. The authors refer to deleterious alleles or antagonistic pleiotropy as possible genetic level forces which also prevent fixation
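
The third explanation can be illustrated with a toy model in which the advantage of the favored allele shrinks as it becomes common. The linear rule below is purely hypothetical, chosen only because it is simple; the paper does not specify a functional form, and drift is ignored.

```python
def next_frequency(p, s):
    """One generation of deterministic haploid-style selection with advantage s."""
    return p * (1.0 + s) / (1.0 + p * s)


def declining_advantage(p, s0=0.1, p_eq=0.6):
    """Hypothetical rule: the advantage falls linearly and vanishes at p_eq."""
    return s0 * (1.0 - p / p_eq)


p = 0.05  # hypothetical starting frequency of a standing variant
for _ in range(600):
    p = next_frequency(p, declining_advantage(p))
```

After 600 generations the allele sits at the intermediate frequency where its advantage disappears, so it never fixes no matter how long you wait.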

I personally lean against the first option, because it seems like we see a similar pattern in human evolutionary genomics, lots of partial sweeps and incomplete fixation. How much time does a brother need? In the long run we’re dead, and heat death swallows the universe. In the short run evolutionary pressures are always shifting. Fix now, or forget it say I! The wide distribution of allelic differences as well as moderate heterozygosities seems to be an indication that a quantitative trait, life history, is being modified through mass action on genetic variation. Interestingly, there’s also the parallel to humans insofar as the X chromosome seems to have more signatures of selection and variation in this evolutionary experiment. Next question: who’s working on experimental evolution of 600 generations in mice?

Citation: Burke, Molly K., Dunham, Joseph P., Shahrestani, Parvin, Thornton, Kevin R., Rose, Michael R., & Long, Anthony D. (2010). Genome-wide analysis of a long-term evolution experiment with Drosophila. Nature. DOI: 10.1038/nature09352

Image Credit: Karl Magnacca
