Abstract
Computational prediction of the phenotypic propensities of noncoding single nucleotide variants typically combines annotation of genomic, functional and evolutionary attributes into a single score. Here, we evaluate whether the claimed excellent accuracies of these predictions translate into high rates of success in addressing questions important in biological research, such as fine mapping causal variants, distinguishing pathogenic allele(s) at a given position, and prioritizing variants for genetic risk assessment. We find a significant disconnect between the statistical modelling and the biological performance of predictive approaches. We discuss fundamental reasons underlying these deficiencies and suggest that future improvements of computational predictions need to address the confounding of allelic, positional and regional effects, as well as the imbalance in the proportion of true positive variants in candidate lists.
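The point about imbalance in candidate lists can be illustrated with a short calculation. The following is a minimal sketch, not the authors' analysis: the function name `positive_predictive_value` and the operating point (90% sensitivity, 90% specificity) are hypothetical values chosen for illustration. It applies Bayes' rule to show how the fraction of flagged variants that are truly causal falls as true positives become rarer in the candidate list.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Fraction of flagged variants that are true positives (Bayes' rule).

    prevalence is the proportion of true positive variants in the candidate list.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical classifier operating point; these numbers are NOT from the paper.
SENSITIVITY = 0.9
SPECIFICITY = 0.9

for prevalence in (0.5, 0.1, 0.01):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}")
```

Under these assumed values, the same classifier yields a positive predictive value of 0.9 on a balanced list, 0.5 when one candidate in ten is a true positive, and roughly 0.08 when one in a hundred is, which is one way that a strong benchmark accuracy can coexist with weak performance in prioritization.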
Original language | English |
---|---|
Article number | 330 |
Pages (from-to) | 330 |
Journal | Nature Communications |
Volume | 10 |
Issue number | 1 |
State | Published - Jan 19 2019 |
Keywords
- Algorithms
- Animals
- Computational Biology
- Disease/*genetics
- Evolution, Molecular
- Genome-Wide Association Study
- Humans
- Machine Learning
- Mammals/genetics
- *Models, Statistical
- Polymorphism, Single Nucleotide
- RNA, Untranslated/*genetics