Data Augmentation Pipeline for Text Recognition of Laser-Etched Serial Number Images
Conference proceedings article
Authors/Editors
Strategic Research Themes
Publication Details
Author list: Theerapat Niamhom, Worarat Krathu, Pornchai Mongkolnam
Publication year (A.D.): 2025
First page: 1
Last page: 6
Number of pages: 6
URL: https://rivf2025.org/?page_id=371
Language: English-United States (EN-US)
Abstract
This paper addresses the challenging task of recognizing severely degraded laser-etched serial numbers on copper surfaces, where conventional OCR and scene text recognition models struggle due to noise, incomplete characters, and limited training data. To overcome these challenges, we propose a data augmentation pipeline that leverages generative adversarial networks (GANs) to produce training data for fine-tuning a transformer-based text recognition model. The pipeline consists of three main modules: (1) a character-free background generation module that uses a LaMa inpainting model to remove existing characters from real images, (2) a character generation module that uses four CycleGAN models, each trained on a different appearance group, to synthesize character images with diverse textures, and (3) a text insertion module that combines the synthesized characters with the generated backgrounds. Fine-tuning the pretrained TrOCR model on our augmented dataset achieved an absolute accuracy of 88.5%, a 72.79% improvement over the baseline. This highlights the effectiveness of our approach under extreme degradation and its potential for broader industrial serial number recognition applications.
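The abstract does not say how the text insertion module composites characters, so the following is only a minimal sketch of one plausible approach: alpha-blending character patches onto a background using per-character masks. The function name `insert_text`, its parameters, and the mask-based blending are all assumptions made for illustration. The LaMa and CycleGAN stages are not modeled; their outputs are stood in for by plain NumPy arrays.

```python
import numpy as np

def insert_text(background, glyphs, masks, x0, y0, gap=2):
    """Paste character patches left to right onto a copy of the background.

    Hypothetical helper, not the paper's implementation.
    background: (H, W) float array in [0, 1], e.g. an inpainted image
    glyphs:     list of (h, w) float arrays, e.g. synthesized characters
    masks:      list of (h, w) float arrays in [0, 1]; 1 marks character pixels
    """
    out = background.copy()  # leave the caller's background untouched
    x = x0
    for glyph, mask in zip(glyphs, masks):
        h, w = glyph.shape
        if y0 + h > out.shape[0] or x + w > out.shape[1]:
            raise ValueError("glyph does not fit on the background")
        region = out[y0:y0 + h, x:x + w]
        # Alpha blend: background where mask is 0, glyph where mask is 1.
        out[y0:y0 + h, x:x + w] = mask * glyph + (1.0 - mask) * region
        x += w + gap  # advance the cursor for the next character
    return out

def demo():
    # Uniform grey stands in for a character-free background.
    background = np.full((8, 20), 0.5)
    # A bright vertical stroke stands in for a synthesized character.
    glyph = np.ones((4, 3))
    mask = np.zeros((4, 3))
    mask[:, 1] = 1.0
    return insert_text(background, [glyph, glyph], [mask, mask], x0=1, y0=2)
```

Calling `demo()` returns an 8 by 20 image with two strokes placed side by side. Because the text of each composite is known at insertion time, each generated image comes with its label for free, which is what makes this kind of synthesis useful for fine-tuning a recognizer.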
Keywords
Computer Vision, data augmentation, Generative Adversarial Networks (GAN), Text Recognition






