Presentation Information
[4Yin-A-57] Zero-Shot Triggering of Homograph Reading with Brackets using Pre-trained T5 Language Models
〇Tomoya Hirata1, Kazuhiro Takeuchi1 (1. Osaka Electro-Communication University)
Keywords:
Homograph, Reading Disambiguation, T5, Zero-Shot
Japanese homographs share the same orthography but take different readings depending on context. Most prior work formulates reading disambiguation as classification with word-specific label sets. This paper reformulates the task as constrained text generation and applies a pre-trained Japanese T5 model in a zero-shot setting. We mask only the reading span in bracket-style furigana annotations and decode the reading in hiragana with token-level vocabulary constraints. Evaluation on 33,877 Tatoeba examples over 248 homographs shows that zero-shot T5 reaches 0.863 micro accuracy, while GiNZA reaches 0.915. The analysis reveals clear word-level complementarity, where T5 improves context-driven cases but remains weak on lexicalized readings.
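The masking and constrained-decoding idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names (`mask_reading`, `is_hiragana_token`, `constrain_scores`, `greedy_pick`), the ASCII parentheses used as the bracket style, and the toy score table are all assumptions made for illustration. The only detail taken from T5 itself is the `<extra_id_0>` sentinel token. A real system would apply the same filter to the model's logits at each decoding step rather than to a hand-written dictionary.

```python
import math
from typing import Dict

def is_hiragana_token(token: str) -> bool:
    """True if every character is hiragana (U+3041..U+3096) or the long-vowel mark."""
    return bool(token) and all(
        0x3041 <= ord(ch) <= 0x3096 or ch == "ー" for ch in token
    )

def mask_reading(sentence: str, word: str, sentinel: str = "<extra_id_0>") -> str:
    """Attach a bracketed reading slot to the first occurrence of `word`,
    masking only the reading span with a T5 sentinel token.
    The bracket characters are an assumption, not taken from the paper."""
    end = sentence.index(word) + len(word)
    return f"{sentence[:end]}({sentinel}){sentence[end:]}"

def constrain_scores(scores: Dict[str, float]) -> Dict[str, float]:
    """Set the score of every non-hiragana token to -inf so it can never be chosen."""
    return {
        tok: (score if is_hiragana_token(tok) else -math.inf)
        for tok, score in scores.items()
    }

def greedy_pick(scores: Dict[str, float]) -> str:
    """Pick the highest-scoring token among those allowed by the constraint."""
    constrained = constrain_scores(scores)
    return max(constrained, key=constrained.get)

if __name__ == "__main__":
    # Hypothetical example sentence and hypothetical token scores.
    print(mask_reading("今日は暑い", "今日"))
    toy_scores = {"今日": 2.0, "kyou": 1.8, "きょう": 1.5, "こんにち": 0.3}
    print(greedy_pick(toy_scores))
```

In this toy example the unconstrained argmax would be the kanji token, while the constraint forces the choice onto the best hiragana candidate, which is the behavior the token-level vocabulary constraint is meant to guarantee.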
