Data Augmentation Pipeline for Text Recognition of Laser-Etched Serial Number Images
Conference proceedings article
Publication Details
Author list: Theerapat Niamhom, Worarat Krathu, Pornchai Mongkolnam
Publication year: 2025
Start page: 1
End page: 6
Number of pages: 6
URL: https://rivf2025.org/?page_id=371
Languages: English-United States (EN-US)
Abstract
This paper addresses the challenging task of recognizing severely degraded laser-etched serial numbers on copper surfaces, where conventional OCR and scene text recognition models struggle due to noise, incomplete characters, and limited training data. To overcome these challenges, we propose a data augmentation pipeline that leverages generative adversarial networks (GANs) to produce training data for fine-tuning a transformer-based text recognition model. The pipeline consists of three main modules: (1) a character-free background generation module that uses a LaMa inpainting model to remove existing characters from real images, (2) a character generation module that uses four CycleGAN models, each trained on a different appearance group, to synthesize character images with diverse textures, and (3) a text insertion module that composites the synthesized characters onto the generated backgrounds. Fine-tuning the pretrained TrOCR model on our augmented dataset achieved an absolute accuracy of 88.5%, representing a 72.79% improvement over the baseline. This result highlights the effectiveness of our approach under extreme degradation and its potential for broader industrial serial number recognition applications.
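As an illustration of module (3), the following is a minimal sketch of how synthesized character patches might be alpha-blended onto a character-free background. It is not the authors' implementation: the function name `insert_text`, its parameters, the per-pixel alpha masks, the fixed left-to-right layout, and the use of plain Python lists in place of image arrays are all assumptions made for this example.

```python
def insert_text(background, glyphs, x0, y0, gap=2):
    """Alpha-blend glyph patches onto a copy of a grayscale background.

    background: list of rows of 0-255 ints (hypothetical stand-in for an
                inpainted, character-free image).
    glyphs:     list of (pixels, alpha) pairs; both are lists of rows with
                the same shape, alpha values in [0.0, 1.0] (hypothetical
                stand-in for GAN-synthesized characters and their masks).
    x0, y0:     top-left corner of the first glyph.
    gap:        horizontal spacing in pixels between consecutive glyphs.
    """
    out = [row[:] for row in background]  # leave the input untouched
    x = x0
    for pixels, alpha in glyphs:
        h, w = len(pixels), len(pixels[0])
        if y0 + h > len(out) or x + w > len(out[0]):
            raise ValueError("glyph does not fit inside the background")
        for r in range(h):
            for c in range(w):
                a = alpha[r][c]
                # alpha = 1 keeps the glyph pixel, alpha = 0 keeps the background
                blended = a * pixels[r][c] + (1.0 - a) * out[y0 + r][x + c]
                out[y0 + r][x + c] = int(round(blended))
        x += w + gap  # advance to the next character slot
    return out


# Toy usage: a flat 4x12 background and two identical 2x2 dark glyphs.
background = [[200] * 12 for _ in range(4)]
dark = ([[0, 0], [0, 0]], [[1.0, 1.0], [1.0, 0.5]])
composite = insert_text(background, [dark, dark], x0=1, y0=1, gap=3)
```

In a real setting the background would come from the inpainting step and the glyphs from the character generation step; the blending rule itself stays the same.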
Keywords
Computer Vision, Data Augmentation, Generative Adversarial Networks (GAN), Text Recognition






