AMERICAN SIGN LANGUAGE FINGERSPELLING RECOGNITION IN THE WILD WITH TEMPORAL AGGREGATION MODULE AND MULTITASK LEARNING

Conference proceedings article


Publication Details

Author list: พีระวัฒน์ พันธ์นัทธีร์, วุฒิพงษ์ คำวิลัยศักดิ์, ชัชวาลย์ หาญสกุลบรรเทิง, ณัฐนันท์ ทัดพิทักษ์กุล

Publication year: 2022

Start page: 670

End page: 689

Number of pages: 20

Languages: Thai (TH)


Abstract

This paper presents a neural network-based method for American Sign Language (ASL)
fingerspelling recognition using a real-world fingerspelling video dataset, collected from the
internet, that features various signers and highly dynamic environments. We propose a Temporal
Aggregation module (TAGG) for our neural network architecture. TAGG lets the model aggregate
temporal correlations between consecutive frames across multiple frame lengths, which makes it
robust to variation in spelling speed across signers and in the duration of each fingerspelling
gesture. We also introduce multitask learning for fingerspelling recognition, combining learning
from Connectionist Temporal Classification (CTC) with learning from an attention-based decoder.
Sharing knowledge across these learning tasks improves the model's performance in highly dynamic
video environments. In our experiments, the method outperforms prior methods in terms of
character error rate.
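
For readers unfamiliar with the evaluation metric, character error rate (CER) is commonly computed as the character-level edit distance between the predicted and reference letter sequences, divided by the total number of reference characters. The sketch below illustrates that common definition only. The function names `edit_distance` and `character_error_rate` are hypothetical, and the abstract does not state the exact evaluation protocol used in the paper.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Total edit distance divided by total reference length (assumed definition)."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of sequences")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference sequences contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_chars
```

Under this definition, a lower value is better, and the value can exceed 1.0 when a prediction contains many spurious characters.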




Last updated on 2022-02-08 at 23:05