Study of discretization methods in classification

Conference proceedings article




Publication Details

Author list: Lavangnananda K., Chattanachot S.

Publisher: Hindawi

Publication year: 2017

Start page: 50

End page: 55

Number of pages: 6

ISBN: 9781467390774

ISSN: 0146-9428

eISSN: 1745-4557

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85017497703&doi=10.1109%2fKST.2017.7886082&partnerID=40&md5=5a71841963ef7c334de3bed5f2253f1c

Languages: English-Great Britain (EN-GB)




Abstract

Classification is one of the important tasks in Data Mining or Knowledge Discovery, with prolific applications. Satisfactory classification also depends on the characteristics of the dataset. Numerical and nominal attributes commonly occur in datasets, and classification performance may be aided by discretization of numerical attributes. At present, several discretization methods and numerous techniques for implementing classifiers exist. This study has three main objectives. The first is to study the effectiveness of discretization of attributes, and the second is to compare the efficiency of eight discretization methods: ChiMerge, Chi2, Modified Chi2, Extended Chi2, Class-Attribute Interdependence Maximization (CAIM), Class-Attribute Contingency Coefficient (CACC), Autonomous Discretization Algorithm (Ameva), and Minimum Description Length Principle (MDLP). Finally, the study investigates the suitability of the eight discretization methods when applied to five commonly known classifiers: Neural Network, K-Nearest Neighbour (K-NN), Naive Bayes, C4.5, and Support Vector Machine (SVM). © 2017 IEEE.
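To illustrate the family of chi-square-based methods the abstract lists, the sketch below implements the core of ChiMerge: start with one interval per distinct value and repeatedly merge the adjacent pair of intervals whose class distributions have the lowest chi-square statistic. This is a minimal illustration, not the paper's implementation; for simplicity it assumes a fixed interval budget (`max_intervals`) as the stopping rule rather than the significance threshold used in the original algorithm.

```python
from collections import Counter

def chi2_pair(c1, c2, classes):
    """Chi-square statistic for the 2 x k contingency table formed by
    the class counts of two adjacent intervals."""
    n1, n2 = sum(c1.values()), sum(c2.values())
    n = n1 + n2
    chi2 = 0.0
    for cls in classes:
        col = c1[cls] + c2[cls]          # column total for this class
        for cnt, row in ((c1[cls], n1), (c2[cls], n2)):
            e = row * col / n            # expected count under independence
            if e > 0:
                chi2 += (cnt - e) ** 2 / e
    return chi2

def chimerge(values, labels, max_intervals=6):
    """Bottom-up ChiMerge sketch: one interval per distinct value, then
    merge the adjacent pair with the lowest chi-square until the
    interval budget is met."""
    classes = sorted(set(labels))
    # Class counts per distinct value, in ascending value order.
    by_val = {}
    for v, y in zip(values, labels):
        by_val.setdefault(v, Counter())[y] += 1
    intervals = [[v, v, cnt] for v, cnt in sorted(by_val.items())]
    while len(intervals) > max_intervals:
        scores = [chi2_pair(intervals[i][2], intervals[i + 1][2], classes)
                  for i in range(len(intervals) - 1)]
        i = scores.index(min(scores))    # most similar adjacent pair
        lo, _, c1 = intervals[i]
        _, hi, c2 = intervals.pop(i + 1)
        intervals[i] = [lo, hi, c1 + c2]
    return [(lo, hi) for lo, hi, _ in intervals]

# Two well-separated classes collapse into two intervals.
print(chimerge([1, 2, 3, 10, 11, 12],
               ['a', 'a', 'a', 'b', 'b', 'b'], max_intervals=2))
# → [(1, 3), (10, 12)]
```

Same-class neighbours produce a chi-square of zero and are merged first, so interval boundaries tend to settle where the class distribution changes; Chi2, Modified Chi2, and Extended Chi2 refine this scheme mainly by automating the stopping criterion.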


Keywords

Ameva, C4.5, CACC, CAIM, Chi2, ChiMerge, Extended Chi2, K-Nearest Neighbour, MDLP, Modified Chi2, Naive Bayes

