Power and pitfalls of computational methods for inferring clone phylogenies and mutation orders from bulk sequencing data

Sayaka Miura, Tracy Vu, Jiamin Deng, Tiffany Buturla, Olumide Oladeinde, Jiyeong Choi, Sudhir Kumar

Research output: Contribution to journal › Article › peer-review

15 Scopus citations

Abstract

Tumors harbor extensive genetic heterogeneity in the form of distinct clone genotypes that arise over time and across different tissues and regions in cancer. Many computational methods produce clone phylogenies from population bulk sequencing data collected from multiple tumor samples from a patient. These clone phylogenies are used to infer mutation order and clone origins during tumor progression, rendering the selection of the appropriate clonal deconvolution method critical. Surprisingly, absolute and relative accuracies of these methods in correctly inferring clone phylogenies are yet to be consistently assessed. Therefore, we evaluated the performance of seven computational methods. The accuracy of the reconstructed mutation order and inferred clone groupings varied extensively among methods. All the tested methods showed limited ability to identify ancestral clone sequences present in tumor samples correctly. The presence of copy number alterations, the occurrence of multiple seeding events among tumor sites during metastatic tumor evolution, and extensive intermixture of cancer cells among tumors hindered the detection of clones and the inference of clone phylogenies for all methods tested. Overall, CloneFinder, MACHINA, and LICHeE showed the highest overall accuracy, but none of the methods performed well for all simulated datasets. So, we present guidelines for selecting methods for data analysis.

Original language: English
Article number: 3498
Journal: Scientific Reports
Volume: 10
Issue number: 1
State: Published - Dec 1 2020
Externally published: Yes
