Modeling a generic web classification system using design patterns

Journal article


Publication Details

Author list: Sukakanya U., Porkaew K.

Publication year: 2011

Volume number: 6

Issue number: 10

Start page: 2212

End page: 2220

Number of pages: 9

ISSN: 1796-203X

eISSN: 1796-203X

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-80053328158&doi=10.4304%2fjcp.6.10.2212-2220&partnerID=40&md5=e6848928351836ba0d3793dbf4036d90

Languages: English-Great Britain (EN-GB)


Abstract

To save time when extracting specific information from the high volume of data in web documents, this paper proposes an architectural model of a generic web document classification system built with design patterns. Based on the proposed model, this work implements two classification techniques for Thai web documents, namely centroid classification and neural network classification, and compares their classification effectiveness empirically. The training data set in this experiment consists of 500 web documents in five categories (100 documents per category): mobile phone sales, book sales, travel sales, education information and company profile. Another 250 web documents were then used to test the two classifiers. The experimental results showed that the centroid classifier outperforms the neural network classifier in terms of both efficiency and effectiveness. © 2011 ACADEMY PUBLISHER.


Keywords

Centroid; Document analyzer; text classification; Web classification modeling


Last updated on 2023-23-09 at 07:36