Heterogeneous ensemble approach with discriminative features and modified-SMOTEbagging for pre-miRNA classification
บทความในวารสาร
ผู้เขียน/บรรณาธิการ
กลุ่มสาขาการวิจัยเชิงกลยุทธ์
ไม่พบข้อมูลที่เกี่ยวข้อง
รายละเอียดสำหรับงานพิมพ์
รายชื่อผู้แต่ง: Lertampaiporn S., Thammarongtham C., Nukoolkit C., Kaewkamnerdpong B., Ruengjitchatchawalya M.
ผู้เผยแพร่: Oxford University Press
ปีที่เผยแพร่ (ค.ศ.): 2013
วารสาร: Nucleic Acids Research (0305-1048)
Volume number: 41
Issue number: 1
นอก: 0305-1048
eISSN: 1362-4962
ภาษา: English-Great Britain (EN-GB)
ดูในเว็บของวิทยาศาสตร์ | ดูบนเว็บไซต์ของสำนักพิมพ์ | บทความในเว็บของวิทยาศาสตร์
บทคัดย่อ
An ensemble classifier approach for microRNA precursor (pre-miRNA) classification was proposed based upon combining a set of heterogeneous algorithms including support vector machine (SVM), k-nearest neighbors (kNN) and random forest (RF), then aggregating their prediction through a voting system. Additionally, the proposed algorithm, the classification performance was also improved using discriminative features, self-containment and its derivatives, which have shown unique structural robustness characteristics of pre-miRNAs. These are applicable across different species. By applying preprocessing methods - both a correlation-based feature selection (CFS) with genetic algorithm (GA) search method and a modified-Synthetic Minority Oversampling Technique (SMOTE) bagging rebalancing method - improvement in the performance of this ensemble was observed. The overall prediction accuracies obtained via 10 runs of 5-fold cross validation (CV) was 96.54%, with sensitivity of 94.8% and specificity of 98.3% - this is better in trade-off sensitivity and specificity values than those of other state-of-the-art methods. The ensemble model was applied to animal, plant and virus pre-miRNA and achieved high accuracy, >93%. Exploiting the discriminative set of selected features also suggests that pre-miRNAs possess high intrinsic structural robustness as compared with other stem loops. Our heterogeneous ensemble method gave a relatively more reliable prediction than those using single classifiers. Our program is available at http://ncrna-pred.com/premiRNA.html. ฉ 2012 The Author(s). Published by Oxford University Press.
คำสำคัญ
ไม่พบข้อมูลที่เกี่ยวข้อง