Satja: Thai elderly speech corpus for speech recognition

Conference proceedings article

Publication Details

Author list: Prajongjai S., Triyason T., Mongkolnam P.

Publisher: Association for Computing Machinery

Publication year: 2018

ISBN: 9781450365680

ISSN: 0146-9428

eISSN: 1745-4557

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85059878509&doi=10.1145%2f3291280.3291793&partnerID=40&md5=88b3f1ad36f8a6bf08f2bb18c257505b

Languages: English-Great Britain (EN-GB)

Abstract

Thai is the official language of Thailand. At present, it has about 70 million speakers, located in Thailand and in parts of southern China (Yunnan, Guizhou, and Guangxi). Thai is a tonal language and a challenging one for speech processing technology, because Thai spoken-language databases are limited and specific speech corpora, such as children's speech, elderly speech, and regional accents, are lacking. This research develops a Thai elderly speech corpus named Satja, meaning "truth of speech". The corpus consists of voice commands recorded by 50 speakers (24 male and 26 female) aged 60-85 years, covering six regions of Thailand. In addition, the elderly voice database was compared with non-elderly voice data. For model training we used CMUSphinx, and we tested with Sphinx4. We found that elderly speech was recognized more accurately by the model trained on elderly speech than by the model trained on non-elderly speech. © 2018 Association for Computing Machinery.

Keywords

Speech corpus development, Speech recognition system