Biomarker Identification in Colorectal Cancer Using Subnetwork Analysis with Feature Selection

Conference proceedings article


Publication Details

Author list: Kozuevanich S., Meechai A., Chan J.H.

Publisher: Springer

Publication year: 2020

Volume number: 1149 AISC

Start page: 119

End page: 127

Number of pages: 9

ISBN: 9783030440435

ISSN: 2194-5357

eISSN: 2194-5357

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85083645970&doi=10.1007%2f978-3-030-44044-2_12&partnerID=40&md5=9943a5380c47d534e56e5c44114a9e24

Languages: English-Great Britain (EN-GB)




Abstract

Gene Sub-Network-based Feature Selection (GSNFS) is an efficient method for identifying gene sub-network biomarkers in case-control and multiclass studies through an integrated analysis of gene expression, gene-set and network data. However, GSNFS produces a considerably high number of sub-networks and does not assess the importance of each one. Recently, we incorporated two feature selection techniques, correlation-based and information gain, into the GSNFS workflow to reduce the number of sub-networks and to assess the importance of each individual sub-network. The extended GSNFS method was shown to identify good candidate gene sub-network markers in lung cancer. In this work, we applied a similar workflow to colorectal cancer. First, the top 5 and bottom 5 ranked gene-sets were selected and their classification performance was investigated. As in the lung cancer study, the top-ranked gene-sets performed better than the bottom-ranked gene-sets. The identified top-ranked gene-sets, such as the TNF-beta and MAPK signaling pathways, are known to be related to cancer. In addition, the characteristics of the top identified pathway network were further analyzed and visualized. SMAD3, a gene reported by many studies to be related to cancer, was mostly found to have the highest number of neighbors in 4 datasets. The results of this study confirm that GSNFS combined with feature selection is very promising, as significantly fewer sub-networks were needed to build a classifier whose performance was comparable to that of a classifier built on the full dataset. © 2020, The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG.
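
The information gain ranking mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' GSNFS implementation: the per-sample sub-network activity scores, the median split used to discretize them, the function names and the toy data are all assumptions made for illustration only.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one numeric feature with respect to the labels.

    Assumption: the feature is discretized by a single split at its median.
    """
    ordered = sorted(values)
    median = ordered[len(ordered) // 2]
    low = [y for v, y in zip(values, labels) if v < median]
    high = [y for v, y in zip(values, labels) if v >= median]
    # Weighted entropy of the labels after the split.
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (low, high) if part)
    return entropy(labels) - remainder


def rank_subnetworks(activity, labels):
    """Return sub-network names sorted by decreasing information gain."""
    scores = {name: information_gain(vals, labels)
              for name, vals in activity.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: one activity score per sample for each sub-network.
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
activity = {
    "subnet_A": [2.1, 1.9, 2.4, 0.3, 0.2, 0.5],  # separates the two classes
    "subnet_B": [1.0, 0.2, 1.1, 0.9, 0.3, 1.2],  # carries little class signal
}

ranking = rank_subnetworks(activity, labels)
print(ranking)
```

Under these assumptions, keeping only the first few entries of `ranking` corresponds to the idea of building a classifier from the top-ranked sub-networks instead of the full set.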


Keywords

colorectal cancer; Correlation-based feature selection; gene expression analysis; gene-set; Information Gain feature selection

