IDF-Sign: Addressing Inconsistent Depth Features for Dynamic Sign Word Recognition

Journal article

Author/Editor

โกสินทร์ จำนงไทย

Strategic Research Themes

Digital Transformation (Strategic Research Theme)

Publication Details

Author list: Abdullahi S.B.; Chamnongthai K.

Publisher: Institute of Electrical and Electronics Engineers

Publication year (A.D.): 2023

Journal: IEEE Access (2169-3536)

Volume number: 11

Start page: 88511

End page: 88526

Number of pages: 16

ISSN: 2169-3536

eISSN: 2169-3536

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85168269670&doi=10.1109%2fACCESS.2023.3305255&partnerID=40&md5=065a2eaa4234a88cbcc57a945ed197ef

Language: English-Great Britain (EN-GB)


Abstract

Inconsistent hand and body features are a barrier to sign language recognition and translation, and lead to unsatisfactory models. Existing recognition models are built on the spatial-temporal depth Sp features. Finding suitable expert features for the Sp model is challenging, especially for dynamic sign words, because many inconsistent features exist across hand motions and shapes. In this article, we propose IDF-Sign: an efficient and consistent Sp model derived from a spatial-temporal multivariate pairwise consistency feature ranking (PairCFR) approach. The temporal features are obtained by computing the 3D position vector of skeletal hand joint coordinates, while the spatial features are obtained by averaging every ten spatial coordinates in the 3D video frames, repeating this until the last frame. PairCFR is used to rank and select the best Sp model features at different feature thresholds. We employ threshold selection to compute a mid-point value of each ranked feature according to its weight. The receiver operating characteristic (ROC) scheme is employed to identify the relationship between the sensitive parameters and the Sp features, and the obtained values are used as modeling inputs. To verify IDF-Sign, we design a real-life experiment with a leap motion sensor (LMS), involving ten signers and a total of ninety dynamic sign words. The LMS provides depth videos; since depth videos are too dense for the Sp model to process directly, we read them as comma-separated files in real time. Extensive IDF-Sign evaluations using machine learning on the ASL, GSL, DSG, and ASL-similar datasets show that the Optimized Forest achieved an average top-1 recognition performance of 95%, 78%, 65.07%, and 95%, respectively. © 2013 IEEE.
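
The spatial-feature step in the abstract (averaging every ten spatial coordinates across the frames) could be read as a non-overlapping block average. The sketch below is one possible interpretation only, not the authors' implementation: the function name `block_average`, the tuple layout `(x, y, z)`, and the handling of a shorter final window are all assumptions made for illustration.

```python
from statistics import fmean


def block_average(frames, window=10):
    """Average 3D coordinates over consecutive, non-overlapping windows.

    frames: list of (x, y, z) tuples, one per video frame (assumed layout).
    Returns one averaged (x, y, z) tuple per window. A final window with
    fewer than `window` frames is averaged over the frames it contains
    (an assumption; the abstract does not say how the remainder is treated).
    """
    averaged = []
    for start in range(0, len(frames), window):
        chunk = frames[start:start + window]
        # zip(*chunk) groups all x values, all y values, and all z values.
        averaged.append(tuple(fmean(axis) for axis in zip(*chunk)))
    return averaged


if __name__ == "__main__":
    # Hypothetical coordinates for 25 frames of a single hand joint.
    demo = [(float(i), 2.0 * i, 0.0) for i in range(25)]
    for row in block_average(demo):
        print(row)
```

With 25 input frames and a window of ten, this yields three averaged rows, the last one covering the five remaining frames.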

Keywords

3D video processing, Automatic sign language recognition, depth sensor, Hand gesture