Feature selection in GSNFS-based marker identification

Conference proceedings article


Publication Details

Author list: Sivakorn Kozuevanich, Jonathan H Chan, Asawin Meechai

Publication year: 2019

Title of series: CSBio '19: Proceedings of the Tenth International Conference on Computational Systems-Biology and Bioinformatics

URL: https://dl.acm.org/doi/10.1145/3365953.3365964



Abstract

Gene Sub-Network-based Feature Selection (GSNFS) is a method for identifying gene sub-network biomarkers in case-control and multiclass studies through an integrated analysis of gene expression, gene-set, and network data. It has previously been shown to identify reasonable sub-network markers for lung cancer. However, previous studies have not assessed the importance of each sub-network identified by GSNFS. In this work, we applied correlation-based and information gain feature selection techniques to rank the identified sub-network biomarkers (gene-sets). First, the top five and bottom five ranked gene-sets were selected and their classification performance was investigated. As expected, the top-ranked gene-sets gave excellent performance, while the bottom-ranked gene-sets performed poorly. The top-ranked gene-sets, such as the MAPK signalling pathway, are known to be related to cancer. Furthermore, combining the top-ranked gene-sets, from the top 2 up to the top 30, further improved performance compared with using individual gene-sets. The results of this study are promising, as significantly fewer sub-networks were needed to build a classifier that performed comparably to a classifier built on the full data set.
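
The information gain ranking step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how gene-set scores are computed or discretized, so the median binarization, the function names (`entropy`, `information_gain`, `rank_gene_sets`), and the toy data below are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one feature after binarizing it at its median.

    Assumption: median binarization is used only to keep the sketch short;
    the source does not state a discretization scheme.
    """
    ordered = sorted(values)
    median = ordered[len(ordered) // 2]
    groups = {True: [], False: []}
    for v, y in zip(values, labels):
        groups[v >= median].append(y)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values() if g)
    return entropy(labels) - conditional


def rank_gene_sets(scores, labels):
    """Return gene-set names sorted from highest to lowest information gain."""
    gains = {name: information_gain(vals, labels) for name, vals in scores.items()}
    return sorted(gains, key=gains.get, reverse=True), gains


# Hypothetical toy data: per-sample activity scores for three gene-sets.
labels = ["case", "case", "case", "control", "control", "control"]
scores = {
    "set_A": [2.1, 1.9, 2.4, 0.3, 0.2, 0.5],  # separates the two classes
    "set_B": [1.0, 0.2, 0.9, 0.8, 0.1, 1.1],  # weakly informative
    "set_C": [0.5, 0.6, 0.4, 0.5, 0.6, 0.4],  # uninformative
}

ranking, gains = rank_gene_sets(scores, labels)
top_k = ranking[:2]  # keep only the top-ranked gene-sets for a classifier
```

In a workflow like the one described, the features named in `top_k` would then be passed to a classifier and compared against one trained on all gene-sets.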



Last updated on 2024-23-02 at 23:05