Thai Named Entity Recognition Using Bi-LSTM-CRF with Word and Character Representation

Conference proceedings article






Publication Details

Author list: Thattinaphanich S., Prom-On S.

Publisher: Hindawi

Publication year: 2019

Start page: 149

End page: 154

Number of pages: 6

ISBN: 9781728110196

ISSN: 0146-9428

eISSN: 1745-4557

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85076750254&doi=10.1109%2fINCIT.2019.8912091&partnerID=40&md5=3ac28221cbd7d6f87bad32146624f274

Languages: English-Great Britain (EN-GB)




Abstract

Named Entity Recognition (NER) is a useful tool in many natural language processing tasks for identifying and extracting named entities such as persons, locations, organizations and times. In English and Chinese, NER has been thoroughly researched and can be applied in practical settings. Its development in Thai is still limited because of scarce resources and language difficulties such as the lack of boundary indicators for words, phrases and sentences. In this paper, we present an application of Bi-LSTM-CRF with word- and character-level representations to address this problem. First, we prepared the texts by tokenizing each sentence into words. We then prepared the word representation and a Bi-LSTM character representation. Finally, we built a recurrent neural network combined with a CRF to learn the text sequence and recognize named entities. Our model was evaluated on the open-source NER corpus from the Facebook group ThaiNLP. The model achieved precision, recall, and F1 of 91.79%, 91.51% and 91.65%, respectively. © 2019 IEEE.
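
As a quick consistency check on the reported scores, the sketch below recomputes F1 from the precision and recall given in the abstract. It assumes F1 is the standard harmonic mean of precision and recall; the abstract does not state the exact scoring procedure (for example, token-level versus entity-level matching), and the function name f1_score is only illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (result is in the same units as the inputs)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, in percent.
reported_precision = 91.79
reported_recall = 91.51
reported_f1 = 91.65

# Assumption: the reported F1 is the plain harmonic mean of the two values above.
computed_f1 = f1_score(reported_precision, reported_recall)
print(f"computed F1 = {computed_f1:.2f}%")  # prints "computed F1 = 91.65%"
```

Under that assumption, the recomputed value agrees with the reported F1 of 91.65% to two decimal places.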


Keywords

Bi-LSTM, Conditional Random Field


Last updated on 2023-25-09 at 07:36