Data Augmentation Pipeline for Text Recognition of Laser-Etched Serial Number Images

Conference proceedings article


Publication Details

Author list: Theerapat Niamhom, Worarat Krathu, Pornchai Mongkolnam

Publication year: 2025

Start page: 1

End page: 6

Number of pages: 6

URL: https://rivf2025.org/?page_id=371

Languages: English-United States (EN-US)


Abstract

This paper addresses the challenging task of recognizing severely degraded laser-etched serial numbers on copper surfaces, where conventional OCR and scene text recognition models struggle due to noise, incomplete characters, and limited training data. To overcome these challenges, we propose a data augmentation pipeline leveraging generative adversarial networks (GANs) for fine-tuning a transformer-based text recognition model. The pipeline consists of three main modules: (1) a character-free background generation module that uses a LaMa inpainting model to remove existing characters from real images, (2) a character generation module that uses four CycleGAN models trained on different appearance groups to synthesize diverse textured character images, and (3) a text insertion module that composites these synthesized characters onto the generated backgrounds. Fine-tuning the pretrained TrOCR model with our augmented dataset achieved an absolute accuracy of 88.5%, representing a 72.79% improvement over the baseline. This highlights the effectiveness of our approach under extreme degradation and its potential for broader industrial serial number recognition applications.
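The third module can be illustrated with a minimal, dependency-free sketch. The abstract does not say how characters are blended onto backgrounds, so per-pixel alpha blending is an assumption here. The function name `insert_text`, the `(label, pixels, alpha)` glyph layout, and the `top`, `left`, and `gap` parameters are all hypothetical. Images are plain nested lists of grayscale values in [0, 1]. In a real pipeline the background would come from the inpainting module and the glyphs from the character generation module.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]  # grayscale, row-major, values in [0, 1]
Glyph = Tuple[str, Image, Image]  # (label, pixels, alpha); hypothetical layout


def insert_text(
    background: Image,
    glyphs: Sequence[Glyph],
    top: int,
    left: int,
    gap: int = 1,
) -> Tuple[Image, str]:
    """Alpha-blend glyphs left to right onto a copy of the background.

    Returns the composited image and the ground-truth label string,
    which together form one synthetic training sample.
    """
    out = [row[:] for row in background]  # leave the input untouched
    height, width = len(out), len(out[0])
    x = left
    label_chars = []
    for label, pixels, alpha in glyphs:
        g_h, g_w = len(pixels), len(pixels[0])
        if top < 0 or x < 0 or top + g_h > height or x + g_w > width:
            raise ValueError(f"glyph {label!r} does not fit on the background")
        for r in range(g_h):
            for c in range(g_w):
                a = alpha[r][c]
                # Standard alpha blend: glyph over background.
                out[top + r][x + c] = (
                    a * pixels[r][c] + (1.0 - a) * out[top + r][x + c]
                )
        label_chars.append(label)
        x += g_w + gap  # advance the cursor past this glyph plus spacing
    return out, "".join(label_chars)


# Toy example: a flat gray 4x8 background and two 2x2 glyphs.
background = [[0.5] * 8 for _ in range(4)]
glyph_a = ("A", [[1.0, 1.0], [1.0, 1.0]], [[1.0, 0.5], [0.0, 1.0]])
glyph_7 = ("7", [[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]])
image, text = insert_text(background, [glyph_a, glyph_7], top=1, left=1)
```

Because the label string is built alongside the image, each synthetic sample comes with exact ground truth, which is what makes this kind of compositing usable for fine-tuning a recognizer.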




Keywords

Computer Vision, data augmentation, Generative Adversarial Networks (GAN), Text Recognition


Last updated on 2026-04-02 at 00:00