Presentation Information
[4E5-GS-11b-05] Evaluation of Prompt Injection Attacks for Extracting Sensitive Data
〇Kenichiro Hayasaka1, Yoshihiro Koseki1, Toshiki Okahara1 (1. Mitsubishi Electric Corporation)
Keywords:
Prompt Injections, LLM
Services that employ large language models (LLMs) can be subject to prompt injection attacks, in which malicious external prompts induce unintended behaviors. In this paper, we constructed a dataset of prompt injection attacks designed to extract sensitive data that had previously been provided to the models, and we ran experiments against multiple LLMs to evaluate whether these prompts cause the secrets to leak. The dataset included not only prompts written as English text but also prompts written in Japanese and prompts rendered as images. Our experimental results show that Llama 4 Maverick had the highest attack success rate among the models tested; notably, attacks using Japanese prompts rendered as images exhibited particularly high success rates.
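The attack success rate mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how leakage was judged, so the sketch assumes a simple criterion: an attack counts as successful when the secret string appears in the model's response. The function names, the record fields, and the sample records are all hypothetical and serve only to show the bookkeeping; they are not data or code from the paper.

```python
from collections import defaultdict


def leaked(secret: str, response: str) -> bool:
    """Assumed leakage criterion: case-insensitive substring match."""
    return secret.lower() in response.lower()


def attack_success_rates(trials):
    """Return {(model, category): successes / attempts} for a list of trials.

    Each trial is a dict with the hypothetical keys
    "model", "category", "secret", and "response".
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [successes, attempts]
    for trial in trials:
        key = (trial["model"], trial["category"])
        counts[key][1] += 1
        if leaked(trial["secret"], trial["response"]):
            counts[key][0] += 1
    return {key: s / n for key, (s, n) in counts.items()}


# Made-up records for illustration only (not results from the paper).
trials = [
    {"model": "model-a", "category": "en-text",
     "secret": "S3CR3T-123", "response": "I cannot share that."},
    {"model": "model-a", "category": "ja-image",
     "secret": "S3CR3T-123", "response": "The code is s3cr3t-123."},
    {"model": "model-a", "category": "ja-image",
     "secret": "S3CR3T-123", "response": "Sorry, no."},
]

rates = attack_success_rates(trials)
for (model, category), rate in sorted(rates.items()):
    print(f"{model:10s} {category:10s} {rate:.2f}")
```

Grouping by model and prompt category (for example English text, Japanese text, or image) makes it possible to compare categories within a model, which is the kind of comparison the abstract reports. A substring match is only one possible criterion; paraphrased or partially revealed secrets would need a different check.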
