Presentation Information

[B-7-33] Chunked KV Cache Control for Efficient Vision-Language Model Inference

◎Kenshiro Wada1, Kenzo Okuda1, Hiroki Baba1, Naoki Kimishima1, Kentaro Hayashi1, Tomonori Takeda1 (1. NTT)

Keywords:

LLM, VLM, Transformer, KV Cache, Prefill, In-Network Computing
