AMERICAN SIGN LANGUAGE FINGERSPELLING RECOGNITION IN THE WILD WITH TEMPORAL AGGREGATION MODULE AND MULTITASK LEARNING
Conference proceedings article
Publication Details
Author list: พีระวัฒน์ พันธ์นัทธีร์, วุฒิพงษ์ คำวิลัยศักดิ์, ชัชวาลย์ หาญสกุลบรรเทิง, ณัฐนันท์ ทัดพิทักษ์กุล
Publication year: 2022
Start page: 670
End page: 689
Number of pages: 20
Languages: Thai (TH)
Abstract
This paper presents a neural network-based method for American Sign Language
fingerspelling recognition using a real-world fingerspelling video dataset with various signers and
highly dynamic environments collected from the internet. We propose a Temporal Aggregation
module (TAGG) in our neural network architecture. The TAGG allows the neural network model to
aggregate the temporal correlation between consecutive frames across multiple frame lengths. It
makes the model robust to variability in spelling speed across signers and the duration of each
fingerspelling gesture. We also introduce multitask learning for fingerspelling recognition that
combines learning from Connectionist Temporal Classification (CTC) and learning from an attention-based
decoder. As a result, the proposed model can improve performance in highly dynamic video
environments by sharing knowledge across the learning tasks. According to the experimental
results, our method outperforms prior methods in terms of character error rate.
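
The abstract does not describe how the TAGG is built internally. As an illustration only of the general idea of aggregating frame features across multiple frame lengths, the sketch below averages per-frame feature vectors over sliding windows of several lengths and concatenates the results. The function names, the use of mean pooling, and the window lengths (1, 3, 5) are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence

Frame = List[float]  # one feature vector per video frame (hypothetical representation)


def window_mean(frames: Sequence[Frame], t: int, length: int) -> Frame:
    """Mean of the feature vectors in a window of `length` frames centred on frame t.

    `length` is assumed to be odd; the window is clipped at the sequence boundaries.
    """
    half = length // 2
    start = max(0, t - half)
    end = min(len(frames), t + half + 1)
    window = frames[start:end]
    dim = len(frames[0])
    return [sum(f[d] for f in window) / len(window) for d in range(dim)]


def aggregate_multiscale(frames: Sequence[Frame],
                         lengths: Sequence[int] = (1, 3, 5)) -> List[Frame]:
    """For every frame, concatenate the window means taken at each window length.

    Assumption: mean pooling stands in for whatever learned aggregation
    the actual module uses.
    """
    out: List[Frame] = []
    for t in range(len(frames)):
        feats: Frame = []
        for length in lengths:
            feats.extend(window_mean(frames, t, length))
        out.append(feats)
    return out


if __name__ == "__main__":
    # Three frames with one-dimensional features, aggregated at lengths 1 and 3.
    demo = [[0.0], [3.0], [6.0]]
    for row in aggregate_multiscale(demo, lengths=(1, 3)):
        print(row)
```

Short windows preserve fast handshape changes, while longer windows smooth over slowly spelled letters, which is one way a model could tolerate differences in spelling speed between signers.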