Quantifying the evolutionary divergence of protein structures: The role of function change and function conservation

Alberto Pascual-García

CSIC and Universidad Autónoma de Madrid, Centro de Biología Molecular, Madrid, Spain

Talk Abstract: The molecular clock hypothesis, which states that protein sequences accumulate amino acid substitutions at an almost constant rate during evolution, played a major role in the development of molecular evolution, boosting quantitative theories of evolutionary change such as Kimura's neutral theory and Ohta's nearly neutral theory. In the context of protein structures, the 1980 paper by Chothia and Lesk was a milestone in the study of the relationship between protein structure and protein sequence divergence, using globins as a test case. Here we analyze the relationship between sequence, structure, function and length divergence for four large superfamilies of evolutionarily related proteins: Globins, Aldolases, P-loop containing nucleoside triphosphate hydrolases and NADP-binding Rossmann fold. We introduce a novel measure of protein structure divergence, the contact divergence, which is motivated by the analogy with sequence divergence in evolution. This measure is more consistent with sequence divergence than previously used measures. For all four superfamilies we find proportionality between structure and sequence divergence, consistent with the molecular clock, up to a cross-over sequence identity below which an explosion of structural diversity is observed. Our results suggest that this explosion is related to functional diversification. In fact, proteins sharing the same function diverge only up to a limited value of sequence and, in particular, structure divergence, suggesting that functional constraints act on the global protein structure, whereas proteins with different functions diverge in structure at a significantly faster rate. Moreover, large insertions and deletions are almost always associated with function changes.
These functional constraints on protein sequence and structure make it possible to predict protein function within a given superfamily with surprisingly high accuracy. The clock-like evolution of protein structures was also tested by assessing the consistency between structure similarity networks and phylogenetic trees through the clustering coefficient. We found a clustering coefficient close to one at high similarity, for which intrinsic clusters can be reconstructed using either structure information or structure-based sequence alignments. These clusters are almost completely homogeneous in function. The clustering coefficient drops at lower similarity, suggesting that the evolutionary rate accelerates on some branches of the tree. We conjecture that this acceleration is related to positive selection for new functions.
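The clustering-coefficient test can be illustrated with a minimal sketch. The abstract does not say which definition of the clustering coefficient was used, so the code below assumes the global clustering coefficient (transitivity) of a network obtained by linking two proteins whenever their similarity exceeds a threshold. The function names `threshold_graph` and `global_clustering` and the toy similarity matrix `S` are hypothetical and are not taken from the talk. The rationale is that similarities derived from a perfectly clock-like tree are transitive, so each thresholded network splits into fully connected clusters and the coefficient equals one.

```python
from itertools import combinations


def threshold_graph(similarity, threshold):
    """Link proteins i and j whenever similarity[i][j] >= threshold."""
    n = len(similarity)
    neighbors = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if similarity[i][j] >= threshold:
            neighbors[i].add(j)
            neighbors[j].add(i)
    return neighbors


def global_clustering(neighbors):
    """Fraction of connected triples (a - node - b) in which a and b are also linked."""
    closed = 0
    triples = 0
    for nbrs in neighbors.values():
        for a, b in combinations(sorted(nbrs), 2):
            triples += 1
            if b in neighbors[a]:
                closed += 1
    # Convention assumed here: a network without any connected triple scores 0.
    return closed / triples if triples else 0.0


# Hypothetical toy similarities (not real data): proteins 0-2 and 3-4 form
# two tight groups, with weaker, non-transitive links between the groups.
S = [
    [1.0, 0.9, 0.9, 0.5, 0.2],
    [0.9, 1.0, 0.9, 0.2, 0.2],
    [0.9, 0.9, 1.0, 0.2, 0.5],
    [0.5, 0.2, 0.2, 1.0, 0.9],
    [0.2, 0.2, 0.5, 0.9, 1.0],
]

high = global_clustering(threshold_graph(S, 0.8))
low = global_clustering(threshold_graph(S, 0.4))
print(high, low)
```

In this toy case the coefficient is 1 at the stricter threshold, where the network consists of two separate cliques, and falls to 1/3 at the looser threshold, where the inter-group links create open triples. This mirrors, in a purely schematic way, the drop described in the abstract.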
