TY - JOUR
T1 - Sequence comparison and protein structure prediction
AU - Dunbrack, Roland L.
PY - 2006/6
Y1 - 2006/6
AB - Sequence comparison is a major step in the prediction of protein structure from existing templates in the Protein Data Bank. The identification of potentially remote homologues to be used as templates for modeling target sequences of unknown structure, and their accurate alignment, remain challenges despite many years of study. The most recent advances have been in combining as many sources of information as possible, including amino acid variation in the form of profiles or hidden Markov models for both the target and template families; known and predicted secondary structures of the template and target, respectively; the combination of structure alignment for distant homologues and sequence alignment for close homologues to build better profiles; and the anchoring of certain regions of the alignment based on existing biological data. Newer technologies have been applied to the problem, including the use of support vector machines to tackle the fold classification problem for a target sequence and the alignment of hidden Markov models. Finally, using the consensus of many fold recognition methods, whether based on profile-profile alignments, threading or other approaches, continues to be one of the most successful strategies for both recognition and alignment of remote homologues. Although there is still room for improvement in identification and alignment methods, additional progress may come from model building and refinement methods that can compensate for large structural changes between remotely related targets and templates, as well as for regions of misalignment.
KW - Computational Biology/methods
KW - Markov Chains
KW - Models, Molecular
KW - Protein Conformation
KW - Protein Folding
KW - Sequence Alignment/methods
KW - Sequence Analysis, Protein/methods
KW - Sequence Homology, Amino Acid
KW - Software
KW - Structural Homology, Protein
UR - http://www.scopus.com/inward/record.url?scp=33744779891&partnerID=8YFLogxK
UR - https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=purepublist2023&SrcAuth=WosAPI&KeyUT=WOS:000239082100015&DestLinkType=FullRecord&DestApp=WOS
U2 - 10.1016/j.sbi.2006.05.006
DO - 10.1016/j.sbi.2006.05.006
M3 - Review article
C2 - 16713709
SN - 0959-440X
VL - 16
SP - 374
EP - 384
JO - Current Opinion in Structural Biology
JF - Current Opinion in Structural Biology
IS - 3
ER -