A comparative study of multiclass feature selection on RNAseq and microarray data

Silu Zhang, Junqing Wang, Keli Xu, Megan M. York, Mo Yinyuan, Yixin Chen, Yunyun Zhou

Research output: Contribution to journal, Article, peer-review

1 Scopus citation

Abstract

Gene expression profiles are widely used to identify phenotype-specific biomarkers in clinical cancer research. Examining the important genes expressed in different phenotypes allows patients to be classified into different treatment groups. Microarray and RNAseq are the two leading technologies for measuring gene expression. However, because the two platforms are heterogeneous, the genes selected from each differ. In this project, we systematically compared the breast cancer subtype classification accuracies obtained from genes selected by four popular multiclass feature selection algorithms, and discussed the strengths and weaknesses of the selected genes across different platforms and cohorts. Our results showed that classification with the selected genes performs best within the same platform across different cohorts. This suggests that merging datasets from the same platform will increase statistical power and improve the prediction accuracy of the selected genes in multiclass classification analysis.

Original language: English
Pages (from-to): 128-142
Number of pages: 15
Journal: International Journal of Computational Biology and Drug Design
Volume: 12
Issue number: 2
State: Published - 2019
Externally published: Yes

Keywords

  • Breast cancer
  • Cancer subtypes
  • Feature selection
  • Functional analysis
  • Integration analysis
  • Machine learning
  • Systems biology
