The 40th Annual Conference of the Japanese Society for Artificial Intelligence, 2026

Presentation Information

9:00 AM - 9:15 AM JST (12:00 AM - 12:15 AM UTC)

[2L1-GS-10t-01] Data augmentation using LLM for creating an advertising document classification model based on alpha diversity and beta diversity as indicators.

〇Satoshi Kawamoto¹,², Toshio Akimitsu², Kikuo Asai² (1. Engineering Div. i-mobile Co., Ltd., 2. The Graduate School of Arts and Sciences, The Open University of Japan)

Keywords: Internet Advertising, LLM, Data Augmentation

Text added to image and video ads can make them more persuasive by conveying a product's appeal more clearly, but it also raises the risk of distributing ads that contain legally inappropriate expressions.
Building interpretable machine-learning models to detect such ads requires sufficient training data; LLM-based data augmentation is promising, yet generated texts must be evaluated for validity.

We propose two indices to assess LLM outputs: alpha diversity, quantified by an IDF-weighted Hill number to capture expressive variety, and beta diversity, quantified by a logistic-regression similarity model that uses features such as contextual fidelity, sentiment intensity, and grammatical structure.
Experiments suggest that LLMs can generate documents with advertisement-specific characteristics, but whether the generated texts can be reliably assigned to the appropriate document categories remains uncertain and requires further discussion.
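
As an illustration of the alpha-diversity index, the following is a minimal sketch of one way an IDF-weighted Hill number could be computed over a set of tokenized documents. The abstract does not specify the weighting scheme, so several details here are assumptions: each term's weight is taken as term frequency times a common smoothed IDF, the weights are normalised into a distribution, and the function names (`idf_weights`, `hill_number`, `idf_weighted_hill`) are hypothetical. It should not be read as the authors' implementation.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Smoothed IDF per term: log((1 + N) / (1 + df)) + 1 (assumed variant)."""
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}


def hill_number(weights, q=1.0):
    """Hill number of order q for non-negative weights (normalised internally)."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    p = [w / total for w in weights if w > 0]
    if abs(q - 1.0) < 1e-12:
        # Limit q -> 1: exponential of the Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


def idf_weighted_hill(docs, q=1.0):
    """Assumed weighting: each term's share is proportional to tf * idf."""
    idf = idf_weights(docs)
    tf = Counter(t for tokens in docs for t in tokens)
    return hill_number([tf[t] * idf[t] for t in tf], q)
```

Under these assumptions, the index can be read as an effective number of distinct expressions: a generated set that repeats a single term scores 1, while a set whose terms carry equal weight scores the number of distinct terms.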
