Heterogeneity of nucleotide frequencies among evolutionary lineages and phylogenetic inference

MS Rosenberg, S Kumar

Research output: Contribution to journal › Article › peer-review

70 Scopus citations

Abstract

A major assumption of many molecular phylogenetic methods is the homogeneity of nucleotide frequencies among taxa, which refers to the equality of the nucleotide frequency bias among species. Changes in nucleotide frequency among different lineages in a data set are thought to lead to erroneous phylogenetic inference because unrelated clades may appear similar because of evolutionarily unrelated similarities in nucleotide frequencies. We tested the effects of the heterogeneity of nucleotide frequency bias on phylogenetic inference, along with the interaction between this heterogeneity and stratified taxon sampling, by means of computer simulations using evolutionary parameters derived from genomic databases. We found that the phylogenetic trees inferred from data sets simulated under realistic, observed levels of heterogeneity for mammalian genes were reconstructed with accuracy comparable to those simulated with homogeneous nucleotide frequencies; the results hold for Neighbor-Joining, minimum evolution, maximum parsimony, and maximum-likelihood methods. The LogDet distance method, specifically designed to deal with heterogeneous nucleotide frequencies, does not perform better than distance methods that assume substitution pattern homogeneity among sequences. In these specific simulation conditions, we did not find a significant interaction between phylogenetic accuracy and substitution pattern heterogeneity among lineages, even when the taxon sampling is increased.
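
For readers unfamiliar with the LogDet distance named in the abstract, the following is a minimal illustrative sketch, not the authors' simulation code. It assumes one commonly cited normalized form, d = -(1/4) [ln det F - (1/2) ln(prod of the row sums and column sums of F)], where F is the 4x4 matrix of joint base frequencies for an aligned sequence pair. The function names (`divergence_matrix`, `determinant`, `logdet_distance`) are hypothetical, and sites containing gaps or ambiguity codes are simply skipped, which is an assumption of this sketch.

```python
import math

BASES = "ACGT"


def divergence_matrix(seq_x, seq_y):
    """Return the 4x4 matrix F of joint base frequencies for two aligned sequences."""
    if len(seq_x) != len(seq_y):
        raise ValueError("sequences must be aligned to equal length")
    counts = [[0.0] * 4 for _ in range(4)]
    n = 0
    for a, b in zip(seq_x.upper(), seq_y.upper()):
        # Assumption of this sketch: skip sites with gaps or ambiguity codes.
        if a in BASES and b in BASES:
            counts[BASES.index(a)][BASES.index(b)] += 1
            n += 1
    if n == 0:
        raise ValueError("no comparable sites")
    return [[c / n for c in row] for row in counts]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    size = len(a)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det


def logdet_distance(seq_x, seq_y):
    """Normalized LogDet distance; zero for identical sequences."""
    f = divergence_matrix(seq_x, seq_y)
    det_f = determinant(f)
    row_sums = [sum(f[i]) for i in range(4)]
    col_sums = [sum(f[i][j] for i in range(4)) for j in range(4)]
    norm = math.prod(row_sums) * math.prod(col_sums)
    if det_f <= 0.0 or norm <= 0.0:
        # Happens for very divergent pairs or when a base is absent.
        raise ValueError("LogDet distance is undefined for this pair")
    return -0.25 * (math.log(det_f) - 0.5 * math.log(norm))
```

Because the distance depends on the joint frequency matrix rather than on a single set of base frequencies, it does not require the two sequences to share the same nucleotide composition. The sketch raises an error when the determinant is not positive, since the logarithm is then undefined.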

Original language: American English
Pages (from-to): 610-621
Number of pages: 12
Journal: Molecular Biology and Evolution
Volume: 20
Issue number: 4
State: Published - Apr 2003

Keywords

  • LogDet distances
  • Heterogeneous nucleotide composition
  • Nonstationarity
  • Phylogenetic inference
  • Taxon sampling
