High quality protein sequence alignment by combining structural profile prediction
and profile alignment using SABERTOOTH

Jonas Minning

Technical University Darmstadt, Institut für Festkörperphysik, Darmstadt, Germany

Protein alignments are an essential ingredient of many bioinformatics analyses, but their quality is limited when structural data are not available and alignments have to be computed from the protein sequence alone. Sequence alignments that are based on optimizing a measure of sequence similarity are inherently restricted to relatively closely related sequences. With increasing distance in sequence space, sequence similarity scores become indistinguishable from those expected by chance, yielding misleading alignments of low quality, even though the respective protein structures might be very similar.

We perform sequence alignments by combining structural profile prediction from sequence with subsequent profile alignment using our recently developed alignment tool SABERTOOTH. In particular, we predict the residue contact vector of the protein structure, using the position-specific scoring matrices output by PSI-BLAST as input data to an artificial neural network. In a comprehensive comparison with the established sequence aligners ClustalW, MUSCLE, T-Coffee, and PSI-BLAST, we demonstrate that SABERTOOTH produces sequence alignments of superior quality, as assessed with objective measures based on structural superposition. Furthermore, we show that the significance score defined here performs best in the recognition of evolutionary and structural relationships.
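
To make the profile-alignment step more concrete, the following is a minimal sketch of a global alignment of two one-dimensional per-residue profiles, such as predicted contact vectors, using standard Needleman-Wunsch dynamic programming. It is not SABERTOOTH's actual algorithm or scoring scheme, neither of which is specified in this abstract. The function name `align_profiles`, the absolute-difference mismatch cost, the linear gap penalty, and the example values are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Pair = Tuple[Optional[int], Optional[int]]


def align_profiles(a: List[float], b: List[float],
                   gap: float = 1.0) -> Tuple[float, List[Pair]]:
    """Globally align two 1-D profiles by minimizing a total cost.

    Assumed (hypothetical) scoring: aligning positions i and j costs
    |a[i] - b[j]|; every gap position costs `gap`.
    Returns the minimal cost and the list of aligned index pairs,
    where None marks a gap in the corresponding profile.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cost of aligning a[:i] with b[:j]
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + gap
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = cost[i - 1][j - 1] + abs(a[i - 1] - b[j - 1])
            gap_in_b = cost[i - 1][j] + gap  # a[i-1] aligned to a gap
            gap_in_a = cost[i][j - 1] + gap  # b[j-1] aligned to a gap
            cost[i][j] = min(match, gap_in_b, gap_in_a)

    # Trace back from the bottom-right corner to recover one optimal path.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                cost[i][j] == cost[i - 1][j - 1] + abs(a[i - 1] - b[j - 1])):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or cost[i][j] == cost[i - 1][j] + gap):
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return cost[n][m], pairs


# Hypothetical predicted contact vectors for two short proteins.
profile_a = [1.0, 2.0, 3.0, 4.0]
profile_b = [1.0, 3.0, 4.0]
total_cost, alignment = align_profiles(profile_a, profile_b, gap=1.0)
```

In this toy case the second position of `profile_a` has no good counterpart in `profile_b`, so the lowest-cost alignment places a gap there. Because only the profile values enter the cost, two proteins with dissimilar sequences but similar predicted structural profiles can still be aligned.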
