Feature selection in GSNFS-based marker identification

Conference proceedings article


ผู้เขียน/บรรณาธิการ


กลุ่มสาขาการวิจัยเชิงกลยุทธ์


รายละเอียดสำหรับงานพิมพ์

รายชื่อผู้แต่งSivakorn Kozuevanich, Jonathan H Chan, Asawin Meechai

ปีที่เผยแพร่ (ค.ศ.)2019

ชื่อชุดCSBio '19: Proceedings of the Tenth International Conference on Computational Systems-Biology and Bioinformatics

URLhttps://dl.acm.org/doi/10.1145/3365953.3365964


ดูบนเว็บไซต์ของสำนักพิมพ์


บทคัดย่อ

Gene Sub-Network-based Feature Selection (GSNFS) is a method capable of handling case-control and multiclass studies for gene sub-network biomarker identification by an integrated analysis of gene expression, gene-set and network data. It has previously been shown to reasonably identify sub-network markers for lung cancer. However, previous studies have not assessed the importance of each subnetwork identified by GSNFS. In this work, we applied correlation-based and information gain feature selection techniques to rank the identified sub-network biomarkers (gene-set). First, the top- and bottom- 5 ranked gene-sets were selected and investigated the classification performance. Expectedly, the top-ranked gene-sets provided an excellent performance while the bottom-ranked gene-sets showed a poor performance. The identified top-ranked gene-sets such as MAPK signalling pathway were known to relate to cancer. Furthermore, combined top-ranked gene-sets from top 2 up to top 30 showed a further improvement on the performance when compared to using individual gene-sets. The results in this study are promising as significantly fewer subnetworks were needed to build a classifier and gave a comparable performance to a full data-set classifier.


คำสำคัญ

ไม่พบข้อมูลที่เกี่ยวข้อง


อัพเดทล่าสุด 2024-23-02 ถึง 23:05