Approximate methods for estimating the pattern of nucleotide substitution and the variation of substitution rates among sites

Research output: Contribution to journal › Article › peer-review

134 Scopus citations

Abstract

We propose two approximate methods (one based on parsimony and one on pairwise sequence comparison) for estimating the pattern of nucleotide substitution and a parsimony-based method for estimating the gamma parameter for variable substitution rates among sites. The matrix of substitution rates that represents the substitution pattern can be recovered through its relationship with the observable matrix of site pattern frequencies in pairwise sequence comparisons. In the parsimony approach, the ancestral sequences reconstructed by the parsimony algorithm were used, and the two sequences compared are those at the ends of a branch in the phylogenetic tree. The method for estimating the gamma parameter was based on a reinterpretation of the numbers of changes at sites inferred by parsimony. Three data sets were analyzed to examine the utility of the approximate methods compared with the more reliable likelihood methods. The new methods for estimating the substitution pattern were found to produce estimates quite similar to those obtained from the likelihood analyses. The new method for estimating the gamma parameter was effective in reducing the bias in conventional parsimony estimates, although it also overestimated the parameter. The approximate methods are computationally very fast and appear useful for analyzing large data sets, for which use of the likelihood method requires excessive computation.
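The pairwise idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a stationary, reversible Markov model in which the site pattern frequency matrix satisfies F = diag(π) exp(Qt), so that Qt = log(diag(π)⁻¹ F); this relationship is an assumption made here, since the abstract does not spell it out. The function names (`pattern_frequencies`, `matrix_log`, `estimate_rate_matrix`) are hypothetical, and the series-based matrix logarithm is only adequate when the two sequences are not too divergent.

```python
BASES = "ACGT"


def pattern_frequencies(seq1, seq2):
    """Symmetrized 4x4 matrix F of site pattern frequencies for two aligned sequences."""
    idx = {b: i for i, b in enumerate(BASES)}
    counts = [[0.0] * 4 for _ in range(4)]
    n = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a in idx and b in idx:  # skip gaps and ambiguity codes
            counts[idx[a]][idx[b]] += 1
            n += 1
    if n == 0:
        raise ValueError("no comparable sites")
    # Averaging with the transpose reflects the assumed reversibility (F symmetric).
    return [[(counts[i][j] + counts[j][i]) / (2 * n) for j in range(4)]
            for i in range(4)]


def matmul(a, b):
    """Plain matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matrix_log(p, terms=200):
    """log(P) from the series sum_k (-1)^(k+1) (P - I)^k / k.

    Converges only when P is close enough to the identity (all eigenvalues
    of P positive), i.e. for sequences that are not too divergent.
    """
    n = len(p)
    d = [[p[i][j] - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    power = [row[:] for row in d]
    result = [[0.0] * n for _ in range(n)]
    for k in range(1, terms + 1):
        sign = 1.0 if k % 2 else -1.0
        for i in range(n):
            for j in range(n):
                result[i][j] += sign * power[i][j] / k
        power = matmul(power, d)
    return result


def estimate_rate_matrix(f):
    """Return (pi, Qt): base frequencies and the rate matrix scaled by time."""
    pi = [sum(row) for row in f]
    if any(x == 0 for x in pi):
        raise ValueError("a base is absent from the data; P is undefined")
    # Row-normalizing F gives the transition probability matrix P(t).
    p = [[f[i][j] / pi[i] for j in range(4)] for i in range(4)]
    return pi, matrix_log(p)
```

Calling `estimate_rate_matrix(pattern_frequencies(seq1, seq2))` returns the estimated base frequencies and Qt, from which relative rates can be read off after rescaling. In practice a library routine such as `scipy.linalg.logm` would be a more robust choice than the hand-written series used here to keep the sketch dependency-free.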

Original language: English
Pages (from-to): 650-659
Number of pages: 10
Journal: Molecular Biology and Evolution
Volume: 13
Issue number: 5
State: Published - May 1996

Keywords

  • Markov models
  • generalized sequence distance
  • likelihood
  • molecular evolution
  • pairwise comparison
  • parsimony
  • rate variation among sites
  • reversibility
  • substitution pattern
