Thai Question Text-To-SQL Parsing Using Transformer
Other
Authors/Editors
Strategic Research Themes
Publication Details
Author list: Tungruethaipak N., Prom-On S.
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication year (CE): 2024
Start page: 631
End page: 637
Number of pages: 7
ISBN: 979-835038176-4
Language: English-Great Britain (EN-GB)
Abstract
This paper introduces a novel approach for translating Thai natural language utterances into Structured Query Language (SQL) and establishes a baseline in this burgeoning field. SQL serves as a pivotal language for communicating with databases and executing diverse tasks within them. While prior research in text-to-SQL parsing has predominantly centered on English, with some exploration in Chinese, the absence of resources for low-resource languages like Thai presents a significant challenge. To address this gap, we constructed a Thai version of the Spider dataset (a benchmark dataset featuring cross-domain samples, multiple tables, and complex queries), specifically tailored for Thai language processing tasks. Challenges arise from Thai's unique word segmentation, coupled with the presence of SQL keywords and database table columns expressed in English. To establish a baseline, we leverage a fine-tuned mT5 [24], a transformer-based large language model developed by Google that inherently supports multiple languages. This study marks a pivotal step towards advancing natural language understanding and SQL translation for Thai, shedding light on critical research avenues in multilingual text-to-SQL parsing. The approach achieves significant performance improvements of at least 80% and up to 97% for different SQL components. © 2024 IEEE.
Keywords
mT5, Spider dataset, SQL, Text to SQL