Sample filtering Relief algorithm: Robust algorithm for feature selection

Conference proceedings article

Publication Details

Author list: Saethang T., Prom-On S., Meechai A., Chan J.H.

Publisher: Springer

Publication year: 2009

Volume number: 5507 LNCS

Issue number: PART 2

Start page: 260

End page: 267

Number of pages: 8

ISBN: 3642030394; 9783642030390

ISSN: 0302-9743

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-70349154210&doi=10.1007%2f978-3-642-03040-6_32&partnerID=40&md5=c668af9d7a04e027f9f067d95de64cf6

Languages: English-Great Britain (EN-GB)

Abstract

Feature selection (FS) plays a crucial role in machine learning when building a robust model for either learning or classification from a large amount of data. Among feature selection techniques, the Relief algorithm is one of the most common due to its simplicity and effectiveness. The performance of the Relief algorithm, however, can be dramatically affected by the consistency of the data patterns. For instance, Relief-F can become less accurate in the presence of noise, and its accuracy decreases further if outlier samples are included in the dataset. It is therefore very important to select the samples to be included in the dataset carefully. This paper presents an effort to improve the effectiveness of the Relief algorithm by filtering samples before selecting features. The method is termed the Sample Filtering Relief Algorithm (SFRA). Its main idea is to separate outlier samples from the main pattern using a self-organizing map (SOM) and then proceed with feature selection using the Relief algorithm. We tested SFRA on a gene expression dataset of the interferon-α (IFN-α) response in hepatitis B patients that contains outlier data. SFRA successfully removed the outlier samples, as verified by expert visual inspection. It also separated relevant from irrelevant features more accurately than the other feature selection methods considered. © 2009 Springer Berlin Heidelberg.
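The two-stage idea described in the abstract (filter samples with a SOM, then run Relief on what remains) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not say how the SOM output is turned into an outlier decision, so the rule used here is an assumption: a sample is dropped if it sits alone on a SOM node or lies unusually far from its best-matching node. The sketch also uses basic two-class Relief rather than Relief-F, and all function names and parameters (train_som, som_filter, relief, sfra_sketch, min_hits, k) are hypothetical.

```python
import numpy as np


def train_som(X, n_nodes=4, n_iter=500, lr0=0.5, sigma0=1.0, seed=0):
    """Train a tiny 1-D self-organizing map; returns weights (n_nodes, n_features)."""
    rng = np.random.default_rng(seed)
    # Initialise nodes from randomly chosen samples (an illustrative choice).
    W = X[rng.choice(len(X), size=n_nodes, replace=False)].astype(float)
    grid = np.arange(n_nodes)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-2   # shrinking neighbourhood width
        x = X[rng.integers(len(X))]
        bmu = np.argmin(np.linalg.norm(W - x, axis=1))  # best-matching unit
        h = np.exp(-((grid - bmu) ** 2) / (2.0 * sigma ** 2))
        W += lr * h[:, None] * (x - W)         # pull BMU and neighbours toward x
    return W


def som_filter(X, min_hits=2, k=3.0, **som_kwargs):
    """Boolean keep-mask. ASSUMED rule, not taken from the paper: drop a sample
    if its node attracts fewer than min_hits samples, or if its distance to
    that node is more than k robust standard deviations above the median."""
    X = np.asarray(X, dtype=float)
    W = train_som(X, **som_kwargs)
    d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)
    bmu = d.argmin(axis=1)
    qe = d.min(axis=1)                         # quantisation error per sample
    hits = np.bincount(bmu, minlength=len(W))
    med = np.median(qe)
    mad = np.median(np.abs(qe - med))
    far = qe > med + k * 1.4826 * mad
    lonely = hits[bmu] < min_hits
    return ~(far | lonely)


def relief(X, y):
    """Basic two-class Relief weights (larger = more relevant feature).
    Expects at least two samples per class."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                      # avoid division by zero
    Xs = (X - X.min(axis=0)) / span            # scale each feature to [0, 1]
    w = np.zeros(X.shape[1])
    for i in range(len(Xs)):
        diff = np.abs(Xs - Xs[i])              # per-feature differences
        dist = diff.sum(axis=1)                # Manhattan distance
        dist[i] = np.inf                       # never pick the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class
        w += diff[miss] - diff[hit]
    return w / len(Xs)


def sfra_sketch(X, y, **filter_kwargs):
    """Filter samples with the SOM rule above, then score features with Relief."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    keep = som_filter(X, **filter_kwargs)
    return keep, relief(X[keep], y[keep])
```

Calling sfra_sketch(X, y) on a samples-by-features matrix X with binary labels y returns the keep-mask and one Relief weight per feature. Features should be on comparable scales before the SOM step, since it relies on Euclidean distance.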

Keywords