Presentation Information

[4Yin-A-28]Factor Analysis of Contextual Hallucinations Classification by a Linear Classifier Using Lookback Ratio Features

〇Yuto Sasaki1, Kenya Jin'no1 (1. Tokyo City University)

Keywords:

Contextual Hallucinations, Lookback Ratio, Attention Head, Attention Head Pruning, Summarization Evaluation

Contextual Hallucinations (CH)—summaries containing information not supported by the input document—remain a key challenge in LLM-based summarization. Prior work showed that CH can be detected by a linear classifier using Lookback Ratio (LR) features computed solely from attention maps. This study analyzes why LR-based linear classification works by examining contribution patterns at the attention-head level. We first visualize logistic-regression coefficients mapped to layers and heads, observing that high-contribution heads are distributed across multiple layers rather than concentrated in a single layer. We then conduct head-pruning interventions using top/median/bottom head sets selected by absolute coefficient magnitude and compare CH rates under varying pruning sizes. Results show regimes where pruning a small number of selected heads reduces CH, whereas larger pruning sizes increase it, suggesting that changes in the generated outputs may confound CH-rate differences. CH labels are obtained via an LLM-as-a-judge procedure, and the findings clarify the relationship between LR-based detection and internal attention behavior.
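As a rough illustration of the feature described above, the per-head Lookback Ratio can be computed from an attention map as the fraction of attention mass a decoding step places on the input-context tokens versus the newly generated tokens. The sketch below is a minimal NumPy version under that assumption; the function name `lookback_ratio`, the toy shapes, and the small epsilon are illustrative choices, not the authors' implementation.

```python
import numpy as np

def lookback_ratio(attn, n_ctx):
    """Per-head Lookback Ratio at one decoding step.

    attn: (n_heads, seq_len) attention weights for the current query token
          (each row assumed to sum to 1, as after softmax).
    n_ctx: number of input-context tokens at the start of the sequence.
    Returns an (n_heads,) array: attention mass on the context divided by
    the mass on context plus newly generated tokens.
    """
    ctx_mass = attn[:, :n_ctx].sum(axis=1)   # mass on the input document
    new_mass = attn[:, n_ctx:].sum(axis=1)   # mass on generated tokens
    return ctx_mass / (ctx_mass + new_mass + 1e-12)

# Toy example: 4 heads, 6-token sequence, first 4 tokens are the context.
rng = np.random.default_rng(0)
attn = rng.random((4, 6))
attn /= attn.sum(axis=1, keepdims=True)  # normalize rows like softmax output
lr = lookback_ratio(attn, n_ctx=4)       # one ratio per head, each in [0, 1]
```

Stacking these per-head ratios across all layers and heads (and averaging over decoding steps) would yield the feature vector on which a logistic-regression classifier, as in the prior work referenced above, can be trained.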