Feature selection in GSNFS-based marker identification
Conference proceedings article
Publication details
Author list: Sivakorn Kozuevanich, Jonathan H Chan, Asawin Meechai
Publication year: 2019
Series title: CSBio '19: Proceedings of the Tenth International Conference on Computational Systems-Biology and Bioinformatics
URL: https://dl.acm.org/doi/10.1145/3365953.3365964
Abstract
Gene Sub-Network-based Feature Selection (GSNFS) is a method for identifying gene sub-network biomarkers in case-control and multiclass studies through an integrated analysis of gene expression, gene-set and network data. It has previously been shown to identify sub-network markers for lung cancer reasonably well. However, previous studies have not assessed the importance of each sub-network identified by GSNFS. In this work, we applied correlation-based and information gain feature selection techniques to rank the identified sub-network biomarkers (gene-sets). First, the five top-ranked and five bottom-ranked gene-sets were selected and their classification performance was investigated. As expected, the top-ranked gene-sets performed excellently while the bottom-ranked gene-sets performed poorly. The top-ranked gene-sets, such as the MAPK signalling pathway, are known to be related to cancer. Furthermore, combining top-ranked gene-sets, from the top 2 up to the top 30, further improved performance compared with using individual gene-sets. The results are promising: significantly fewer sub-networks were needed to build a classifier, and its performance was comparable to that of a classifier built on the full data set.
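The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each gene-set has already been reduced to one activity score per sample, uses a median split to discretize scores before computing information gain, and substitutes a plain absolute Pearson correlation with the class label for the correlation-based technique, which the abstract does not specify. The gene-set names, toy scores and function names are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one feature after a median split (assumed discretization)."""
    median = sorted(values)[len(values) // 2]
    low = [y for v, y in zip(values, labels) if v < median]
    high = [y for v, y in zip(values, labels) if v >= median]
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in (low, high) if part
    )
    return entropy(labels) - remainder


def abs_correlation(values, labels):
    """Absolute Pearson correlation between a feature and 0/1 class labels."""
    n = len(values)
    mean_v, mean_y = sum(values) / n, sum(labels) / n
    cov = sum((v - mean_v) * (y - mean_y) for v, y in zip(values, labels))
    sd_v = math.sqrt(sum((v - mean_v) ** 2 for v in values))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in labels))
    return abs(cov / (sd_v * sd_y)) if sd_v and sd_y else 0.0


def rank_gene_sets(activity, labels, scorer):
    """Return gene-set names ordered from most to least informative."""
    return sorted(activity, key=lambda name: scorer(activity[name], labels), reverse=True)


# Hypothetical toy data: 4 control samples (0) followed by 4 case samples (1).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
activity = {
    "set_A": [0.1, 0.2, 0.15, 0.3, 0.9, 0.8, 0.85, 0.95],  # separates the classes
    "set_B": [0.2, 0.3, 0.4, 0.7, 0.5, 0.8, 0.6, 0.9],     # partially informative
    "set_C": [0.5, 0.1, 0.9, 0.4, 0.5, 0.1, 0.9, 0.4],     # uninformative
}

ig_ranking = rank_gene_sets(activity, labels, information_gain)
corr_ranking = rank_gene_sets(activity, labels, abs_correlation)
top_k = ig_ranking[:2]  # combine the k best gene-sets as classifier input
```

Slicing the ranking at increasing values of k (for instance 2 through 30) would give the nested feature subsets whose classification performance can then be compared against a classifier trained on all gene-sets.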