Satja: Thai elderly speech corpus for speech recognition
Conference proceedings article
Authors/Editors
Strategic Research Themes
No matching items found.
Publication Details
Author list: Prajongjai S., Triyason T., Mongkolnam P.
Publisher: Hindawi
Publication year: 2018
ISBN: 9781450365680
ISSN: 0146-9428
eISSN: 1745-4557
Languages: English-Great Britain (EN-GB)
View in Web of Science | View on publisher site | View citing articles in Web of Science
Abstract
Thai language is the official language of Thailand. At present, about 70 million speakers are located in Thailand and the southern parts of China, Yunnan, Guizhou, and Guangxi. The Thai language is a tonal language. Thai Language is a challenging language for speech processing technology. Because the Thai spoken language database is limited and also lacks a specific speech corpus, such as a children's speech database, elderly speech, accents spoken in each region, etc. This research develops the Thai elderly speech named Satja meaning is truth of speech. The content of this corpus is a voice command. There are 50 speakers, 24 males and 26 females, covering six regions in Thailand, aged 60-85 years. In addition, the database of elderly voice was compared to non-elderly voice. For a model training, we used CMUSphinx and tested with Sphinx4. We found that when the elderly speech was tested with the elderly model, it was more accurate when experimented than the model trained by the non-elderly people. ฉ 2018 Association for Computing Machinery.
Keywords
Speech corpus development, Speech recognition system