Abstract
Gene expression profiles are widely used to identify phenotype-specific biomarkers in clinical cancer research. Examining important genes expressed in different phenotypes allows patients to be classified into different treatment groups. Microarray and RNAseq are the two leading technologies for measuring gene expression. However, because the two platforms are heterogeneous, the genes selected from each differ. In this project, we systematically compared the breast cancer subtype classification accuracies obtained with genes selected by four popular multiclass feature selection algorithms, and discussed the strengths and weaknesses of the selected genes across different platforms and cohorts. Our results showed that classification with the selected genes performs best within the same platform across different cohorts. This suggests that merging datasets from the same platform will increase statistical power and improve the prediction accuracy of the selected genes in multiclass classification analysis.
| Original language | English |
|---|---|
| Pages (from-to) | 128-142 |
| Number of pages | 15 |
| Journal | International Journal of Computational Biology and Drug Design |
| Volume | 12 |
| Issue number | 2 |
| State | Published - 2019 |
| Externally published | Yes |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Breast cancer
- Cancer subtypes
- Feature selection
- Functional analysis
- Integration analysis
- Machine learning
- Systems biology
Fingerprint
Dive into the research topics of 'A comparative study of multiclass feature selection on RNAseq and microarray data'. Together they form a unique fingerprint.