TY - JOUR
T1 - Testing computational prediction of missense mutation phenotypes
T2 - Functional characterization of 204 mutations of human cystathionine beta synthase
AU - Wei, Qiong
AU - Wang, Liqun
AU - Wang, Qiang
AU - Kruger, Warren D.
AU - Dunbrack, Roland L.
PY - 2010/7
Y1 - 2010/7
N2 - Predicting the phenotypes of missense mutations uncovered by large-scale sequencing projects is an important goal in computational biology. High-confidence predictions can be an aid in focusing experimental and association studies on those mutations most likely to be associated with causative relationships between mutation and disease. As an aid in developing these methods further, we have derived a set of random mutations of the enzymatic domains of human cystathionine beta synthase. This enzyme is a dimeric protein that catalyzes the condensation of serine and homocysteine to produce cystathionine. Yeast missing this enzyme cannot grow on medium lacking a source of cysteine, while transfection of functional human CBS into yeast strains missing endogenous enzyme can successfully complement for the missing gene. We used PCR mutagenesis with error-prone Taq polymerase to produce 948 colonies and compared cell growth in the presence or absence of a cysteine source as a measure of CBS function. We were able to infer the phenotypes of 204 single-site mutants, 79 of them deleterious and 123 neutral. This set was used to test the accuracy of six publicly available prediction methods for phenotype prediction of missense mutations: SIFT, PolyPhen, PMut, SNPs3D, PhD-SNP, and nsSNPAnalyzer. The top methods are PolyPhen, SIFT, and nsSNPAnalyzer, which have similar performance. Using kernel discriminant functions, we found that the difference in position-specific scoring matrix values is more predictive than the wild-type PSSM score alone, and that the relative surface area in the biologically relevant complex is more predictive than that of the monomeric proteins.
AB - Predicting the phenotypes of missense mutations uncovered by large-scale sequencing projects is an important goal in computational biology. High-confidence predictions can be an aid in focusing experimental and association studies on those mutations most likely to be associated with causative relationships between mutation and disease. As an aid in developing these methods further, we have derived a set of random mutations of the enzymatic domains of human cystathionine beta synthase. This enzyme is a dimeric protein that catalyzes the condensation of serine and homocysteine to produce cystathionine. Yeast missing this enzyme cannot grow on medium lacking a source of cysteine, while transfection of functional human CBS into yeast strains missing endogenous enzyme can successfully complement for the missing gene. We used PCR mutagenesis with error-prone Taq polymerase to produce 948 colonies and compared cell growth in the presence or absence of a cysteine source as a measure of CBS function. We were able to infer the phenotypes of 204 single-site mutants, 79 of them deleterious and 123 neutral. This set was used to test the accuracy of six publicly available prediction methods for phenotype prediction of missense mutations: SIFT, PolyPhen, PMut, SNPs3D, PhD-SNP, and nsSNPAnalyzer. The top methods are PolyPhen, SIFT, and nsSNPAnalyzer, which have similar performance. Using kernel discriminant functions, we found that the difference in position-specific scoring matrix values is more predictive than the wild-type PSSM score alone, and that the relative surface area in the biologically relevant complex is more predictive than that of the monomeric proteins.
KW - Cystathionine beta synthase
KW - Mutations
KW - Phenotype prediction
UR - http://www.scopus.com/inward/record.url?scp=77953602809&partnerID=8YFLogxK
UR - https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=purepublist2023&SrcAuth=WosAPI&KeyUT=WOS:000278278600005&DestLinkType=FullRecord&DestApp=WOS
U2 - 10.1002/prot.22722
DO - 10.1002/prot.22722
M3 - Article
C2 - 20455263
SN - 0887-3585
VL - 78
SP - 2058
EP - 2074
JO - Proteins: Structure, Function and Bioinformatics
JF - Proteins: Structure, Function and Bioinformatics
IS - 9
ER -