The 40th Annual Conference of the Japanese Society for Artificial Intelligence, 2026

Presentation Information

[1Yin-B-33] Investigating Layer Skipping in Diffusion Language Models

〇Yuto Karashima¹, Hikari Otsuka¹, Tatsuya Kaneko¹, Masato Motomura¹, Daichi Fujiki¹ (1. Institute of Science Tokyo)

Keywords: generative model, diffusion language model, AI

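Diffusion language models have the potential to surpass conventional autoregressive models thanks to their ability to generate tokens in parallel. Unlike autoregressive models, however, diffusion language models must compute over every token at every layer, which makes inference very expensive. In this work, we attempt to reduce this computational cost by introducing a new layer-skipping mechanism extended for diffusion language models. In an evaluation on LLaDA 8B, a representative diffusion language model, we observed a clear trade-off between the layer reduction rate and accuracy.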
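
The cost argument in the abstract can be illustrated with a rough count of layer-by-token evaluations. The sketch below is not the authors' mechanism: it assumes a simplified cost model in which every denoising step evaluates every active layer on every token, and in which skipping a layer removes its cost entirely. The function names and the numbers (32 layers, 128 tokens, 64 denoising steps, 8 skipped layers) are hypothetical and chosen only for illustration.

```python
def layer_token_evals(num_layers, seq_len, num_steps, skipped_layers=0):
    """Count layer-by-token evaluations under a simplified cost model.

    Assumption: each denoising step runs every non-skipped layer
    over all seq_len tokens, and a skipped layer costs nothing.
    """
    if not 0 <= skipped_layers <= num_layers:
        raise ValueError("skipped_layers must be between 0 and num_layers")
    active_layers = num_layers - skipped_layers
    return num_steps * active_layers * seq_len


def layer_reduction_rate(num_layers, skipped_layers):
    """Fraction of layers that are skipped."""
    return skipped_layers / num_layers


# Hypothetical configuration, not taken from the paper.
full = layer_token_evals(num_layers=32, seq_len=128, num_steps=64)
reduced = layer_token_evals(num_layers=32, seq_len=128, num_steps=64,
                            skipped_layers=8)

print(layer_reduction_rate(32, 8))  # 0.25
print(reduced / full)               # 0.75
```

Under this model the compute saved scales linearly with the layer reduction rate. The effect on accuracy is not captured here and has to be measured empirically, which is the trade-off the abstract reports.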
