Thai person name recognition (PNR) using likelihood probability of tokenized words

Conference proceedings article




Publication Details

Author list: Saetiew N., Achalakul T., Prom-On S.

Publisher: Hindawi

Publication year: 2017

ISBN: 9781509046669

ISSN: 0146-9428

eISSN: 1745-4557

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85039955757&doi=10.1109%2fIEECON.2017.8075816&partnerID=40&md5=9050bf6e0191d5df342070936a1204b2

Languages: English-Great Britain (EN-GB)




Abstract

Named Entity Recognition (NER) is important in many natural language processing tasks, especially information extraction. Named-entity extraction is considerably harder in Thai than in English because Thai lacks orthographic cues and explicit boundary indicators between words. In this paper, we present work on NER with an emphasis on person name recognition (PNR) in Thai text. The proposed method consists of four steps. First, the text is tokenized into a set of words. Second, a part-of-name probability is computed for each word using odds with Laplace smoothing and a logistic function. Third, name candidates are selected based on the likelihood probability. Finally, the end point of each name is identified using a set of rules and a drop-rate threshold. We evaluated our method on 1,700 online news articles from the InterBEST 2009 corpus. The results show that the proposed method yields average precision, recall, F-measure, and accuracy of 75.21%, 98.10%, 85.15%, and 81.05%, respectively. © 2017 IEEE.
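
The abstract does not give the exact formulas, so the following is only a rough sketch of how steps two to four might look, not the authors' implementation. It assumes that the odds are the ratio of a token's smoothed in-name count to its smoothed out-of-name count, that the logistic function is applied to the log-odds, and that a name ends when the probability falls by more than a relative drop rate. The function names, the counts, `alpha`, `start_threshold`, and `drop_rate` are all hypothetical, and the paper's additional end-point rules are not modelled.

```python
import math


def part_of_name_probability(in_name_count, outside_count, alpha=1.0):
    """Map Laplace-smoothed odds of a token being part of a name into (0, 1).

    Assumption: odds = (in-name count + alpha) / (out-of-name count + alpha).
    """
    # Add-alpha (Laplace) smoothing keeps the odds finite for unseen tokens.
    odds = (in_name_count + alpha) / (outside_count + alpha)
    # Logistic function of the log-odds; algebraically equal to odds / (1 + odds).
    return 1.0 / (1.0 + math.exp(-math.log(odds)))


def find_name_spans(probs, start_threshold=0.8, drop_rate=0.5):
    """Return (start, end) token index pairs, with end exclusive.

    Assumption: a candidate starts at a token whose probability reaches
    start_threshold and ends at the first token whose probability is lower
    than the previous token's by more than drop_rate (relative).
    """
    spans = []
    i = 0
    while i < len(probs):
        if probs[i] >= start_threshold:
            end = i + 1
            # Extend the span until the probability drops too sharply.
            while end < len(probs) and probs[end] >= probs[end - 1] * (1.0 - drop_rate):
                end += 1
            spans.append((i, end))
            i = end
        else:
            i += 1
    return spans
```

For example, a token seen 9 times inside names and never outside gets smoothed odds of 10 and a probability of 10/11, while a token never seen at all gets odds of 1 and a probability of 0.5.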


Keywords

Laplace smoothing; Odds; Person name recognition; Thai text analytics


Last updated on 2023-03-10 at 07:36