Semi-automatic construction of thyroid cancer intervention corpus from biomedical abstracts
Conference proceedings article
Publication Details
Author list: Kongburan W., Padungweang P., Krathu W., Chan J.H.
Publisher: Hindawi
Publication year: 2016
Start page: 150
End page: 157
Number of pages: 8
ISBN: 9781467377829
ISSN: 0146-9428
eISSN: 1745-4557
Languages: English-Great Britain (EN-GB)
Abstract
Thyroid cancer is a common endocrine tumor whose incidence is steadily increasing worldwide. The latest discoveries on the disease and its treatment are mostly disseminated in biomedical publications such as those in PubMed. Unfortunately, this information is distributed as unstructured text, with over two thousand articles added annually. Text mining plays an important role in information extraction, since it can uncover hidden value from this vast amount of text in reasonable time. In general, a preliminary task of text mining is Named Entity Recognition (NER). NER requires a gold standard corpus, since its capability depends on a trustworthy corpus. However, the construction of a gold standard corpus is a laborious and time-consuming process. In order to obtain a reasonably practical corpus in a limited time, this paper proposes a semi-automatic approach to constructing a thyroid cancer intervention corpus. The experimental results demonstrate that the proposed method can construct a thyroid cancer intervention corpus that is reasonable in terms of both performance and overfitting avoidance. © 2016 IEEE.
Keywords
Corpus, Intervention, Thyroid cancer