A Cluster Based Classification of Imbalanced Data with Overlapping Regions Between Classes

Conference proceedings article

Authors/Editors

ภาสพิชญ์ ชูใจ มิเชล

Strategic research themes

No related information found

Publication details

Author list: Chujai P., Choomboon K., Chaiyakhan K., Kerdprasop K., Kerdprasop N.

Publication year (CE): 2017

Volume number: 2227

First page: 353

Last page: 358

Number of pages: 6

ISBN: 9789881404732

ISSN: 2078-0958

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85042177392&partnerID=40&md5=f7323d83bcd276e1d9b2542a3f1800b5

Language: English-Great Britain (EN-GB)

Abstract

Classifying imbalanced data is a significant challenge for machine learning algorithms. The difficulty arises because data in the minority class can easily be overshadowed by the much larger number of instances in the majority class. The overall classification accuracy may be high, but the recognition of data instances in the minority class is normally unacceptable when standard algorithms are applied. Therefore, this research proposes a technique for handling the imbalanced classification problem. We separate the imbalanced data into overlapped and non-overlapped regions between the majority and minority classes. After the separation, the data are clustered based on Euclidean distance. Each cluster then has its own classification model. To predict a future event, a closest-distance scheme over all models is applied. The experimental results show that the proposed technique, modelled with an SVM using a linear kernel function, yields the best performance in classifying minority data.
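The pipeline outlined in the abstract (region separation, clustering, one model per cluster, closest-distance prediction) might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how overlap is detected, so the k-nearest-neighbour rule here is hypothetical; plain k-means stands in for the unspecified Euclidean clustering; and a nearest-class-mean rule stands in for the per-cluster SVM with linear kernel so that the sketch needs only the standard library. All function names and the toy data are invented for the example.

```python
import math

def split_by_overlap(X, y, n_neighbours=2):
    """Assumed rule (not from the paper): a point lies in the overlapped
    region if any of its nearest neighbours has a different class label."""
    overlapped, clear = [], []
    for i, xi in enumerate(X):
        others = sorted((j for j in range(len(X)) if j != i),
                        key=lambda j: math.dist(xi, X[j]))
        mixed = any(y[j] != y[i] for j in others[:n_neighbours])
        (overlapped if mixed else clear).append(i)
    return overlapped, clear

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def kmeans(points, n_clusters, iters=10):
    """Lloyd's algorithm with deterministic farthest-point seeding."""
    centroids = [points[0]]
    while len(centroids) < min(n_clusters, len(points)):
        centroids.append(max(
            points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for idx, p in enumerate(points):
            nearest = min(range(len(centroids)),
                          key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(idx)
        centroids = [mean([points[i] for i in g]) if g else centroids[c]
                     for c, g in enumerate(groups)]
    return centroids, groups

def fit(X, y, n_clusters=2):
    """Build one local model per cluster in each region.
    Each model is (cluster centroid, {label: class mean})."""
    models = []
    for region in split_by_overlap(X, y):
        if not region:
            continue
        pts = [X[i] for i in region]
        centroids, groups = kmeans(pts, n_clusters)
        for centroid, members in zip(centroids, groups):
            if not members:
                continue
            by_class = {}
            for m in members:
                by_class.setdefault(y[region[m]], []).append(pts[m])
            models.append(
                (centroid, {lbl: mean(p) for lbl, p in by_class.items()}))
    return models

def predict(models, x):
    """Route x to the model with the closest centroid, then classify it
    with that model (nearest class mean, a stand-in for the linear SVM)."""
    _, class_means = min(models, key=lambda m: math.dist(x, m[0]))
    return min(class_means, key=lambda lbl: math.dist(x, class_means[lbl]))

# Toy data: majority class 0, minority class 1, with a mixed patch near (3, 3).
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (3, 3), (3.2, 2.8),
     (6, 6), (6, 7), (7, 6), (3.1, 3.1), (2.9, 3.0)]
y = [0] * 7 + [1] * 5
models = fit(X, y)
```

Calling predict(models, point) then returns the label chosen by the local model whose centroid is nearest to the point. Replacing the nearest-class-mean rule with a linear-kernel SVM trained on each cluster's members would bring the sketch closer to the configuration the abstract reports as best.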

Keywords

Imbalanced data classification, Overlapping region, SVM with linear kernel