The Future of Evo-Devo 2012

The University of Oregon’s NSF IGERT program in evolution, development, and genomics invites you to attend The Future of Evo-Devo 2012, a symposium about genomes in the context of systems, populations, and the environment.

The symposium will be held in Portland, OR, on February 10-12, 2012.  Limited scholarships are available for students!  Speakers, details, and more information are here:

The Individual in the Genomic Era

The grad students in this department (CEEB) are organizing a public seminar series this spring about how genomic technology is developed and applied to issues in human health.
The series is aimed at undergraduates and the rest of the Eugene and surrounding community.  Bill Cresko will open the series on April 5 with a brief tutorial on how we acquire genomic information, and will then go on to explore how this information has changed the way we view our world, the history of hominid evolution, and how we think about medical care.

Lee Silver, from Princeton University, will talk on May 3 specifically about personal genomics, personalized medicine, and genetic testing.

Carlos Bustamante, from Stanford University, will close the series on May 23 by speaking about how genome variation data has been used to identify the evolution of different traits in domesticated animals (specifically dogs), how we can use the same type of technology to map the migration of humans over the past several millennia, and how it lets us explore the genetic basis of physical differences among humans.

We realize most of our readers won’t be able to attend, but we are planning on podcasting and videotaping the lectures to post on the seminar website.  We’ll note on here when we update, so keep your eyes peeled!

Genomic patterns of pleiotropy and the evolution of complexity (Wang et al. 2010)

Posted by Victor Hanson-Smith, Conor O’Brien, and Bryn Gaertner.

One of the grand challenges of evo-devo is to understand how mutations of genetic sequences affect concomitant phenotypic traits.  Eighty-one years ago, Fisher (1930) proposed that every mutation may affect every trait, and the effect size of a gene on a trait is uniformly distributed: thus we should observe equal proportions of mutations causing large and small per-trait effects.  As a logical consequence of Fisher’s hypothesis, more complex organisms (that is, with more traits) should evolutionarily adapt to their environment at a slower rate than less complex organisms because the presence of more traits implies a higher density of gene-trait relationships and thus incurs a “cost of complexity” (Orr 2000).  However, it is widely accepted that organisms *do* evolve to be more complex, and populations of complex organisms successfully evolve towards fitness optima.  This implies the “cost of complexity” hypothesis is incorrect, or the cost is counteracted by some unknown force.
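To make the “cost of complexity” concrete, here is a minimal Python sketch of Fisher’s geometric model, one common way to formalize the argument above.  Everything in it is an illustrative assumption rather than something taken from Fisher (1930) or Orr (2000): the function name `fraction_beneficial`, the mutation size `step`, the starting `distance` from the optimum, and the choice to treat each trait as one axis of a phenotype space with the optimum at the origin.

```python
import math
import random


def fraction_beneficial(n_traits, step=0.1, distance=1.0, trials=5000, seed=0):
    """Estimate the fraction of random mutations that move a phenotype
    closer to the optimum (the origin) in an n_traits-dimensional space.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # Start the phenotype at a fixed distance from the optimum.
    start = [distance] + [0.0] * (n_traits - 1)
    beneficial = 0
    for _ in range(trials):
        # Draw a random direction by normalizing a vector of Gaussian draws.
        direction = [rng.gauss(0.0, 1.0) for _ in range(n_traits)]
        norm = math.sqrt(sum(x * x for x in direction))
        # Apply a mutation of fixed total size in that direction.
        mutant = [s + step * x / norm for s, x in zip(start, direction)]
        # The mutation is beneficial if it ends up closer to the optimum.
        if math.sqrt(sum(x * x for x in mutant)) < distance:
            beneficial += 1
    return beneficial / trials


for n in (2, 10, 50, 200):
    print(n, fraction_beneficial(n))
```

Under these assumptions, a mutation of a given size is less likely to be beneficial when it is spread across more traits, which is the intuition behind the expected slowdown in adaptation for complex organisms.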

In contrast to a Fisherian view, contemporary evo-devo research widely accepts the general principle that genes interact in hierarchical modules to produce morphological and physiological traits.  A network-centric perspective of gene-trait interactions suggests that the effect of a particular mutation on downstream traits depends on the network location of the mutated gene: mutations in genes with high network centrality tend to be more pleiotropic because those genes affect many downstream traits, whereas mutations to peripheral genes are less pleiotropic.  However, the extent of modularity and pleiotropy across genomes is unknown.
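The network-centric view can be sketched in a few lines of Python.  The network below is entirely hypothetical (the names `hub_gene`, `gene_a`, `gene_b`, `peripheral_gene`, and the `trait_` labels are made up), and counting reachable traits is only one simple proxy for pleiotropy, assumed here for illustration.

```python
from collections import deque

# Hypothetical network: each edge points from a gene to the genes or
# traits it influences. Nodes whose names start with "trait_" are traits.
NETWORK = {
    "hub_gene": ["gene_a", "gene_b"],
    "gene_a": ["trait_1", "trait_2"],
    "gene_b": ["trait_3", "peripheral_gene"],
    "peripheral_gene": ["trait_4"],
}


def downstream_traits(network, gene):
    """Return the set of traits reachable from `gene` by following edges."""
    seen = {gene}
    queue = deque([gene])
    traits = set()
    while queue:
        node = queue.popleft()
        for target in network.get(node, []):
            if target in seen:
                continue
            seen.add(target)
            if target.startswith("trait_"):
                traits.add(target)
            else:
                queue.append(target)
    return traits


for gene in NETWORK:
    print(gene, len(downstream_traits(NETWORK, gene)))
```

In this toy network a mutation in the central gene touches every trait, while a mutation in the peripheral gene touches only one, which is the pattern the paragraph above describes.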

A recently published PNAS paper (Wang et al., 2010) repudiates the Fisher-Orr “cost of complexity” hypothesis and confirms contemporary intuition regarding genetic modularity using empirical data and an extension of an existing model of adaptation.

Wang et al. analyzed genome-wide patterns of pleiotropy in three eukaryotes—yeast, mice, and nematodes—and observed significant modularity in the gene-trait relationship graph and generally low levels of pleiotropy for most genes.  This highly modular structure and generally low pleiotropy means that a mutation is more likely to be beneficial, as it is more likely to affect a small, related set of phenotypes in the same direction, as opposed to many phenotypes in random directions.

Moreover, the authors observed that pleiotropic mutations tend to have a larger per-trait effect than non-pleiotropic mutations.  By extending Orr’s “complexity cost” equation to allow for variable levels of pleiotropy, Wang et al. found that a small non-zero degree of pleiotropy actually increases, rather than impairs, the rate of adaptation.  This is because the positive correlation between pleiotropy and effect size increases the probability of fixation and the fitness gain in more complex organisms.  This result is important because it may explain the repeated evolution of complexity in many taxa.

Wang et al.’s analysis is based entirely on data mined from knock-out and RNAi experiments; their conclusions are consequently limited to the sequence space of null mutations that silence the function(s) of genes.  In contrast, a less-explored region of sequence space contains mutations that merely affect the relative activity of a gene’s protein product without entirely silencing the gene.  In non-null sequence space, the magnitude of a mutation’s effect is determined not only by the pleiotropy (a.k.a. the network centrality) of the mutated gene, but also by the number of redundant pathways leading from that gene to a downstream phenotype.  It is widely accepted that pathway redundancy buffers traits from upstream changes in enzyme activity or dosage [see Kacser and Burns, “The Molecular Basis of Dominance”, Genetics 1981].  Whereas the effects of null mutations are strongly predicted by the extent of pleiotropy (as presently shown by Wang et al.), we hypothesize that the effect of a non-null mutation is largely predicted by the number of interaction pathways between the mutated gene and a downstream phenotype.  This counterhypothesis, however, has yet to be tested.

Read the paper by Wang et al. here:

Wang Z, Liao BY, & Zhang J (2010). Genomic patterns of pleiotropy and the evolution of complexity. Proceedings of the National Academy of Sciences of the United States of America, 107 (42), 18034-9 PMID: 20876104

New papers about developmental stochasticity, species delimitation, and functional genomic neighborhoods.

Posted by Victor Hanson-Smith.
Here are three articles—published this week!—that might be relevant to your interests.

1. Stochasticity versus determinism in development: a false dichotomy? Magdalena Zernicka-Goetz and Sui Huang, Nature Reviews Genetics, November 2010

The developmental trajectory (from embryo to death) of all multicellular organisms involves cell fate decisions, in which pluripotent, multipotent, and bipotent cells differentiate into specific cell types.  Cell fate decisions are usually controlled by spatiotemporal variation in protein expression levels; for example, the expression levels of the transcription factors CDX2 and OCT4 in inchoate mouse embryos determine whether a cell becomes an inner-cell mass (ICM) cell or a trophectoderm (TE) cell.  The cells fated to be ICM and TE create a seemingly random heterogeneous (“salt-and-pepper”) spatial pattern, leading researchers to conclude that the earliest fate decisions occur stochastically and without regulatory bias.

In this short NRG opinion piece, the authors assert that many seemingly stochastic developmental processes could actually be completely deterministic.  In their words:

. . . a multi-step process with deterministic causation can be so complicated as to be practically unpredictable. . . The outcome then seems lawless, but may not be.

The authors ask, “to what extent is the non-deterministic ‘noisy’ component of developmental control due to true, inevitable ontological randomness and to what extent is it due to epistemological unpredictability because of missing information on the complicated history of cells?”  Although the answer remains elusive for now, this general philosophical framework should guide future investigations about molecular developmental determinants.  I would like to curate a small reading list on this topic, and I encourage you to suggest research papers (in the comments field below) that demonstrate regulatory determinism within seemingly random developmental trajectories.

Zernicka-Goetz M, & Huang S (2010). Stochasticity versus determinism in development: a false dichotomy? Nature reviews. Genetics, 11 (11), 743-4 PMID: 20877326

2. Species delimitation using dominant and codominant multilocus markers, Bernhard Hausdorf and Christian Hennig, Systematic Biology, October 2010

How do we determine the genetic boundaries between species?  The problem of species delimitation can be challenging due to admixture between incipient sister species and due to discordance between gene trees and species trees.  For example, species in the earliest stages of speciation will differ by only a few genes (presumably genes responsible for reproductive isolation or differential adaptation), and it can be difficult to detect speciation in these situations.

In this paper, the authors propose a method for species delimitation based on Gaussian clustering (also known as mixture modeling).  They compare their method to results from the programs STRUCTURE (Pritchard et al. 2000) and STRUCTURAMA (Huelsenbeck and Andolfatto 2007), which group individuals such that Hardy-Weinberg equilibrium is maximized within each cluster.
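As a rough illustration of what Gaussian clustering (mixture modeling) does, the Python sketch below fits a two-component Gaussian mixture to one-dimensional data with the EM algorithm.  This is not the authors’ implementation: the function names, the made-up `coordinates`, the fixed choice of two clusters, and the one-dimensional setting are all assumptions chosen to keep the example short.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_gaussians(values, iterations=50):
    """Fit a two-component 1-D Gaussian mixture by EM; return means and labels."""
    # Initialize the two means at the extremes and share the overall variance.
    overall_mean = sum(values) / len(values)
    overall_var = sum((v - overall_mean) ** 2 for v in values) / len(values)
    means = [min(values), max(values)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: probability that each value belongs to each component.
        resp = []
        for v in values:
            dens = [weights[k] * normal_pdf(v, means[k], variances[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and variances from those probabilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var_k = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var_k, 1e-6)  # floor to avoid a zero variance
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return means, labels


# Hypothetical positions of ten individuals along one axis of genetic variation.
coordinates = [0.9, 1.0, 1.1, 1.2, 0.8, 4.9, 5.0, 5.1, 5.2, 4.8]
means, labels = fit_two_gaussians(coordinates)
print(means, labels)
```

Each individual is assigned to the cluster whose Gaussian most plausibly generated it.  A real analysis would also have to choose the number of clusters and work in more than one dimension, which this sketch deliberately leaves out.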

The authors observed that the accuracy of species delimitation depends on the dominance of genetic markers.  The authors analyzed four AFLP datasets, two of which contain dominant genetic markers and two of which contain codominant markers.  When using dominant markers, Gaussian clustering was the most accurate method; when using codominant markers, STRUCTURAMA was the most accurate.  Based on these preliminary results, it seems that Gaussian clustering is a useful method, but there is no single panacea for the problem of species delimitation.

Hausdorf B, & Hennig C (2010). Species delimitation using dominant and codominant multilocus markers. Systematic biology, 59 (5), 491-503 PMID: 20693311

3. Selection upon Genome Architecture: Conservation of Functional Neighborhoods with Changing Genes, Fatima Al-Shahrour et al., PLoS Computational Biology, October 2010

The authors compared the genomes of eight eukaryotes and observed significant evidence supporting the “functional neighborhood” hypothesis, in which genes of related function tend to cluster together in tight genomic regions.  I think the most exciting result is this:

. . .there is a significantly higher degree of coexpression in genes belonging to a given functional class [as determined by GO terms] when they are packed within a functional neighborhood than when they are elsewhere in the genome. This result, along with the lack of a significant relative enrichment of tandem duplications. . . points to coexpression as the most plausible driving force for the existence of functional neighborhoods.

The paper includes several other interesting points of discussion, including a syntenic analysis between human and chimp genomes.  I think this paper could have been stronger if the authors had created syntenic maps for all pairwise combinations of the eight species, but I realize this is an ambitious (and perhaps currently impossible?) task.

The Achilles’ heel of their analysis is the reliance on GO terms.  Based on these terms, the authors find functional neighborhoods that have been phylogenetically conserved.  For example, the functional neighborhood for coagulation genes has been conserved across all fish (including birds and mammals).  This result is very appealing, but I wonder if anyone has investigated the accuracy and/or general usefulness of the GO terms?  If you can suggest good papers, please leave a comment down below.

Al-Shahrour F, Minguez P, Marqués-Bonet T, Gazave E, Navarro A, & Dopazo J (2010). Selection upon genome architecture: conservation of functional neighborhoods with changing genes. PLoS computational biology, 6 (10) PMID: 20949098

Oregon State University Genome Research and Biocomputing 2010 Conference Recap

This will be a quick post of some impressions from yesterday’s and today’s Center for Genome Research and Biocomputing (CGRB) Fall Conference at Oregon State University.  The speakers were uniformly high quality, with Peter and Rosemary Grant’s talk the consensus highlight of the program.

Jay Dunlap, “Genetic and Molecular Dissection of a Simple Circadian System”

Dunlap works on the circadian clock in Neurospora, a filamentous fungus.  Like the circadian clock in mammals and insects, the basis of the Neurospora clock is a heterodimer that autoregulates the transcription of its own genes.  This autoregulation gives rise to the daily rhythmic expression characteristic of a circadian clock.

Dunlap’s lab has done some hard-core molecular work to dissect the basis of this autoregulation.  This includes chromatin immunoprecipitation assays to demonstrate that methylation at the locus of a key circadian clock regulator is necessary for its proper function, a mutant screen that eventually demonstrated that phosphorylation of a protein heterodimer was necessary for proper autoregulation, and finally expression microarrays to look for peripheral elements of the circadian clock.  This is one of the better understood molecular genetic pathways, but Dunlap reminded us how much work remains.

My only criticism, which is really more of a difference in philosophies, is Dunlap’s reliance on mutagenesis.  There is natural variation in at least one Neurospora circadian phenotype; such genetic variation has an advantage over lab-induced mutations in that it is maintained by natural selection, whereas mutagenesis studies mostly produce phenotypes of large effect that would never survive in nature.  Natural variation, in a genetically tractable context, can thus be used to understand the molecular genetic basis of a phenotype and the environmental context which maintains it.

Richard Spinrad, “OSU Research Now, Next, and After Next.”

Spinrad discussed the present and future of academic research, with a focus on the need to better communicate science to the general public and to secure future sources of funding.  Most funding (about two thirds of OSU’s research budget by the look of his pie chart) comes from the federal government, while industry and non-profits make up another two percent each.  He discussed the need to make up for the expected federal research budget reductions.  One opportunity Spinrad mentioned is better partnerships with industry.  As corporations reduce in-house R&D, they may outsource it to universities.  There are significant issues with academic-corporate relationships that I won’t get into, but he’s basically right; basic research will need to find other sources of support and corporations will be one of those sources.

My take on his talk:  We’re all aware of the need to communicate our work more broadly and the likelihood of shrinking federal research budgets, but Spinrad didn’t have suggestions for how to address these problems besides attempting to give talks geared to a general audience when we’re traveling and finding ways to increase funding from industry and non-profit sources.  These ideas are both obvious enough to be useless without more detail.  I’m sure he’s been working on such ideas; I would have liked his talk to include some.

Daniel Schafer, “Some Lessons from the Biometry of Evolution and the Evolution of Biometry.”

This was a, dare I say, entertaining talk about the history of statistics within evolutionary biology.  I only mention it for two points.  The first is that the speaker briefly mentioned the negative binomial-P distribution as a method for RNA-seq data analysis.  Who wants to host him for a potentially dry but very valuable seminar, such as this one?  The second is that he and the moderator kept referencing this video.

Peter and Rosemary Grant, “Evolution of Darwin’s Finches: The Roles of Genetics, Ecology and Behavior.”
The Grants’ long-term (and ongoing!) data set of Darwin’s finch evolution on Daphne Major is one of the most valuable scientific studies ever conducted.  They have documented evolution of beak size in response to environmental changes for several decades now (more detail available via Google, but a good source is here) and have recently described some of the genes responsible for this variation.  Data sets linking genetic changes to specific ecological changes are nearly impossible to produce.  This is one of the best.

The second part of the talk described incipient speciation driven by a stochastic event (the original PNAS paper is here).  The descendants of a hybrid male immigrant and a hybrid female bred exclusively with each other, producing a lineage with unique beak and body morphology.  The mechanism of this reproductive isolation was a new song, which appears to be the result of imperfect copying of the local species’ song by the initial hybrid male.  The reproductive isolation is maintained by his descendants learning this unique song from their fathers.  This accident of imperfect imitation leading to reproductive isolation illustrates how important such stochastic events can be to speciation.

As with the last time I saw them, the Grants ended with their thesis:  “To understand the diversity of species we see around us we need to understand the dynamic interactions between genetics, ecology and behavior.”  Their life’s work is the best evidence for this.

Embroidered data: needle & thread not required

We’re all familiar with examples of research misconduct (Marc Hauser being a prominent recent example), but there are plenty of other less deliberate and more insidious ways science can lie to itself.  These include publication bias, choosing a method of statistical analysis that gives the desired answer, etc.  Those are worth discussing, but this post will focus on an informative and (to me, anyway) humorous example of embroidered data, which is when a series of misrepresentations of a data set build upon one another, with the end result and the inferences drawn from it having little connection to reality.  I feel such embroidery happens easily as we filter the literature through our biases and limited abilities of retention.  Most examples may not be as egregious as the following, but they are still failures of science to regulate itself.

THE KAIBAB DEER (this figure is taken from Colinvaux’s 1973 textbook, “Introduction to Ecology.”)
The Kaibab plateau is an area bordering the Grand Canyon that had undergone a series of disturbances from fires, sheep and cattle grazing and finally, predator removal (which occurred after it was designated a park by Teddy Roosevelt).

A subsequent increase in the deer population (Figure 1a) was documented by Rasmussen’s 1941 monograph, “Biotic Communities of the Kaibab Plateau, Arizona.”  The apparent increase was attributed to the removal of top predators.  The solid circles represent the park supervisors’ estimates; the open circles represent those of visitors to the park.  A contemporary wildlife biologist would probably roll his or her eyes at either method, but common sense would suggest that supervisors, who spend far more time in the park, would give more accurate estimates.  At the least, both estimates are represented and their sources noted in Rasmussen’s original paper.

"The Kaibab deer herd fiction; a history of embroidered data. (A) Population estimate of the Kaibab deer herd, copied from Rasmussen (1941). Linked solid circles are the forest supervisor's estimates; circles give estimates of other persons, and the dashed line is Rasmussen's own estimate of the trend. (B) A copy of Leopold's (1943) interpretation of the trend. (C) A copy of trend given by David Davis and Golley (1963), after Allee et al. (1949), after Leopold (1943) from Rasmussen (1941). (After Caughley, 1970.)"

Aldo Leopold (yes, that Aldo Leopold) started the real trouble by basing a publication figure on the curve drawn to fit the visitors’ estimates (Figure 1b).  Two problems should be apparent: 1) the second, and presumably more accurate, estimate is ignored, and 2) he reproduced only the fitted curve drawn by Rasmussen, which is obviously not an actual best-fit curve, as it is drawn to intersect the maximum.  Furthermore, the shape of the curve is altered:  the left-hand side of Leopold’s curve has a sigmoid shape suggestive of a population undergoing logistic growth.  This is what we would expect of a population released from a key restraint.  The right-hand side shows a sharp decrease, characteristic of a population that has exceeded its environment’s carrying capacity.  These alterations suggest that the ecological ideas Leopold wished to illustrate biased his interpretation and reproduction of the data.
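For readers who haven’t seen it in a while, here is what textbook logistic growth looks like, as a small Python sketch.  The starting size `n0`, growth rate `r`, and carrying capacity `k` are made-up values for illustration only; they are not estimates for the Kaibab herd, and nothing here is fitted to Rasmussen’s data.

```python
import math


def logistic_size(t, n0, r, k):
    """Population size at time t under logistic growth,
    N(t) = K / (1 + ((K - N0) / N0) * exp(-r * t))."""
    return k / (1.0 + ((k - n0) / n0) * math.exp(-r * t))


# Hypothetical parameters, chosen only to show the shape of the curve.
sizes = [logistic_size(t, n0=10.0, r=0.5, k=1000.0) for t in range(0, 31)]

# Growth per time step: it rises at first, peaks, and then shrinks,
# which is what gives the curve its sigmoid shape.
increments = [later - earlier for earlier, later in zip(sizes, sizes[1:])]

for t, size in enumerate(sizes):
    print(t, round(size, 1))
```

The curve starts slowly, accelerates, and then levels off near the carrying capacity.  Note that logistic growth on its own never produces a crash; the sharp decline on the right-hand side of Leopold’s curve is a separate claim about overshoot.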

Finally, Leopold’s modifications were codified in Allee’s 1949 ecology textbook, “Principles of Animal Ecology” (click here for the original figure) (Figure 1c).  His comments on the figure thus became accepted fact, while the data they were originally based on is completely obscured.

This data set is still considered a classic example of the control exerted by predators on prey abundance, as Wikipedia demonstrates.

So what can we take from this example of embroidery?  In a narrow sense, predators do not control prey abundance as closely as is commonly thought, as habitat recovery and mitigation of other anthropogenic disturbances probably had a larger effect in the case of the Kaibab deer.  (Here’s a badly scanned PDF of the chapter I took the figure from if you want more information.)

The larger point is obvious: “Look at the data,” to quote my adviser, who first showed me this figure.  Science works best when methodology is transparent and a cautious, sound interpretation of the data is suggested.  It also means that we must read the original papers that are the basis of the theory or phenomenon that we’re investigating.  Just about every new grad student has had that point made to them, followed by enough demands to make such historical literature (as arcane and opaque as it often is) the first thing triaged, but it is necessary if science is to successfully regulate itself.

Evolution 2010, day 4 roundup audiocast

Several authors from this blog are attending Evolution 2010. The conference is huge; with twelve concurrent sessions, it is impossible to see everything. In this audiocast, we discuss several noteworthy lectures from Day 4.

Download the audiocast here:

Evolution 2010, day 4 roundup audiocast (MP3, 22:17, 50.8 Mb)
The discussion panel includes: Victor Hanson-Smith, Paul Cziko, Julian Catchen, Conor O’Brien, Jeremy Yoder, Chris Smith, and Ingo Braasch.

Comments are welcome.