Presentation Information

[3Yin-A-13] Japanese License Plate Recognition Using a Lightweight VLM Trained on Synthetic Data and Constrained Decoding

〇Kota Shinjo1, Shin Suyama1, Shintaro Yoshizawa1 (1. TOYOTA MOTOR CORPORATION)

Keywords: License Plate Recognition, Vision Language Model, Computer Vision

Japanese license plate recognition is challenging because plate formats are complex and data is difficult to acquire. This study proposes a recognition method that fine-tunes a lightweight vision-language model (VLM) on synthetic data and controls the output format through constrained decoding. In an evaluation on 103 real images extracted from videos captured at Toyota Motor Corporation's Motomachi Plant, supervised fine-tuning (SFT) improved the exact match rate from 10.2% to 79.6%. In addition, 4-bit quantization reduced VRAM usage by 67% while maintaining accuracy, and constrained decoding raised the exact match rate further to 85.4%.
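
The idea behind constrained decoding can be illustrated with a minimal sketch. This is not the authors' implementation: the plate format used here (three classification digits, one hiragana character, four serial digits), the hiragana subset, and the names `allowed_chars` and `constrained_greedy_decode` are all simplifying assumptions made for illustration, and real Japanese plates also carry a region name and serial-number punctuation. At each step the decoder discards candidates that the format does not permit at that position, then takes the highest-scoring remaining one.

```python
# Hypothetical, heavily simplified plate format (an assumption for this sketch):
# positions 0-2: classification digits, position 3: hiragana, positions 4-7: serial digits.
DIGITS = set("0123456789")
HIRAGANA = set("あいうえかきくけこさすせそ")  # illustrative subset only
PLATE_LENGTH = 8


def allowed_chars(position):
    """Return the set of characters the assumed format permits at this position."""
    return HIRAGANA if position == 3 else DIGITS


def constrained_greedy_decode(step_scores):
    """Greedy decoding with a per-position mask.

    step_scores: list of dicts mapping a candidate character to a
    log-probability-like score, one dict per output position.
    """
    if len(step_scores) < PLATE_LENGTH:
        raise ValueError("not enough decoding steps for a full plate")
    output = []
    for position in range(PLATE_LENGTH):
        allowed = allowed_chars(position)
        # Mask: keep only candidates that the format allows here.
        candidates = {c: s for c, s in step_scores[position].items() if c in allowed}
        if not candidates:
            raise ValueError(f"no valid candidate at position {position}")
        output.append(max(candidates, key=candidates.get))
    return "".join(output)


# Toy scores: the unconstrained argmax would be wrong at positions 3 and 7.
scores = [
    {"3": -0.1, "さ": -2.0},
    {"0": -0.2},
    {"0": -0.1},
    {"5": -0.3, "さ": -0.9},   # "5" scores higher but is not a hiragana
    {"1": -0.1},
    {"2": -0.1},
    {"3": -0.1},
    {"4": -0.1, "あ": -0.05},  # "あ" scores higher but is not a digit
]
decoded = constrained_greedy_decode(scores)
print(decoded)
```

In this toy input, plain greedy decoding would emit an invalid string, whereas the masked version always yields a string that matches the assumed format. With an actual VLM the same masking would be applied to token logits rather than to per-character dictionaries.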