Sequence space and the ongoing expansion of the protein universe

Posted by Victor Hanson-Smith

Check out this paper by Inna S. Povolotskaya and Fyodor A. Kondrashov. (It’s a closed-access Nature article; I’m sorry if you do not have a subscription!)

The premise of this paper begins with two claims.  First, protein-sequence space is finite.  Second, proteins have been evolving away from one another (“expanding in sequence space”) over the last 3.5 billion years.  Given these claims, the authors ask: have structurally and functionally conserved orthologous proteins, descended from the last universal common ancestor (LUCA), been evolving long enough to reach the limit of their possible sequence divergence?  The authors say apparently not.  For details on how they reach this conclusion, read the paper.

Their result is interesting because it sheds light on the relationship between protein sequence conservation and protein function conservation.  This paper suggests that, given enough time, two orthologous proteins can evolve apart such that their sequences will contain almost no signal of shared ancestry, but their function will be essentially conserved.  However, this theoretical upper bound on sequence divergence has not (yet) been reached because proteins evolve slowly across the fitness landscape.
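To make "almost no signal of shared ancestry" a little more concrete, here is a minimal Python sketch of one common way to quantify how far two aligned sequences have drifted apart: the fraction of identical positions.  This is my own illustration, not the method used in the paper.  The function name `fraction_identical` is hypothetical, the sequences are assumed to be pre-aligned, gap-free, and of equal length, and the 1-in-20 baseline assumes all 20 amino acids are equally frequent (which real proteins do not satisfy exactly).

```python
# Rough illustration only: not the analysis from Povolotskaya & Kondrashov.
# Assumes two pre-aligned, gap-free protein sequences of equal length.

# Expected identity between two unrelated sequences IF all 20 amino acids
# were equally frequent (a simplifying assumption).
RANDOM_BASELINE = 1 / 20


def fraction_identical(a: str, b: str) -> float:
    """Return the fraction of aligned positions with the same residue."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


if __name__ == "__main__":
    # Toy sequences, made up for the example.
    identity = fraction_identical("MKTAYIAK", "MKSAYLAR")
    print(f"identity = {identity:.3f}, random baseline = {RANDOM_BASELINE:.3f}")
```

Under these assumptions, two orthologs that had reached the limit of divergence would show an identity close to the random baseline, while orthologs still carrying ancestral signal would sit well above it.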

The authors capture this idea in one very compelling paragraph:

The following picture of the protein sequence space emerges from our analysis. Ridges of high fitness corresponding to specific ancient proteins occupy a tiny fraction of the entire volume of the sequence space. However, these ridges are long and thin and can be more accurately visualized as a wide-mesh net spanning a large part of sequence space, rather than as a small volume within the space. Such fitness ridges imply that [epistasis] and compensatory evolution in ancient proteins must be common. Our data show that >90% of the sites in any protein can eventually accept a substitution given the right combination of amino acids at other sites, although it is not clear whether such substitutions are predominantly neutral or beneficial. Regardless of the importance of positive selection in protein divergence, it seems that many sites are conserved because there has not been enough time to create the right combination of amino acids at other sites to allow them to evolve, which may take billions of years.

On a final note, I am not 100% comfortable with the idea that sequence space is finite.  If we momentarily assume that sequence length is bounded, then, yes, I agree that sequence space must also be finite.  However, is there an upper bound on sequence length?  Comments and discussion are welcome.
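The arithmetic behind that last point can be sketched in a few lines of Python.  This is my own back-of-the-envelope illustration, assuming the standard 20-letter amino-acid alphabet; the function names are hypothetical.  For any fixed length n there are 20^n possible sequences, and the total over all lengths up to some maximum is a finite geometric sum.  Without a maximum length, that sum has no upper limit.

```python
# Back-of-the-envelope count of protein-sequence space.
# Assumes the standard alphabet of 20 amino acids.
AMINO_ACIDS = 20


def sequences_of_length(n: int) -> int:
    """Number of distinct sequences of exactly n residues: 20**n."""
    if n < 0:
        raise ValueError("length must be non-negative")
    return AMINO_ACIDS ** n


def sequences_up_to_length(max_len: int) -> int:
    """Number of distinct sequences with 1..max_len residues.

    This is the geometric sum 20 + 20**2 + ... + 20**max_len, which is
    finite for any finite max_len but grows without bound as max_len does.
    """
    if max_len < 0:
        raise ValueError("max_len must be non-negative")
    return sum(AMINO_ACIDS ** n for n in range(1, max_len + 1))


if __name__ == "__main__":
    for max_len in (1, 2, 3, 100):
        total = sequences_up_to_length(max_len)
        print(f"max length {max_len}: {len(str(total))}-digit count")
```

Python integers are arbitrary precision, so the counts stay exact even for lengths where the number of sequences is astronomically large.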

Povolotskaya, I., & Kondrashov, F. (2010). Sequence space and the ongoing expansion of the protein universe. Nature, 465(7300), 922-926. DOI: 10.1038/nature09105

