Presentation Information
[2K4-GS-7b-05] An Image Style Transformation Method for Diverse “Kawaii” Expressions
〇Hiroto Doi1, Masanori Akiyoshi1 (1. Kanagawa University)
Keywords:
Image Generation, Kawaii, Stable Diffusion
This study proposes a system for giving concrete form to, and sharing, individuals’ ambiguous mental images of “Kawaii.” Five sub-genres, including “Kimokawa” (creepy-cute), were each trained as a LoRA (Low-Rank Adaptation) model, and users blend the strength of each model with sliders. An interactive interface built on a Large Language Model (LLM) translates natural-language instructions, such as “make it more eerie,” into generation parameters, including denoising strength and the degree of shape retention by ControlNet. A key feature is that all generation parameters are stored as a JSON-formatted “blueprint.” This lets third parties reproduce a result and transfer a “Kawaii” aesthetic to different subjects, addressing the transience of image generation. Part-aware control was also realized by applying styles only to user-specified regions obtained through semantic segmentation with DINOv2. A user survey was conducted to evaluate the system. Results showed that more than half of participants’ perceived styles matched the generated images, and a majority of participants rated aesthetic transfer via blueprints as valid. These findings suggest the potential of this method for sharing and reusing subjective sensibilities as structured data.
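
The abstract does not give the blueprint schema, so the following is only a minimal sketch of how such a JSON "blueprint" might be stored, restored, and reused on a different subject. Every field name (subject_prompt, lora_weights, denoising_strength, controlnet_scale, seed), the [0, 1] value ranges, and the placeholder sub-genre name "style_2" are assumptions made for illustration; only "Kimokawa", the LoRA sliders, denoising strength, and ControlNet shape retention come from the abstract.

```python
import json
from dataclasses import asdict, dataclass, replace


@dataclass
class Blueprint:
    """Hypothetical container for the generation parameters of one image."""

    subject_prompt: str          # what is drawn (assumed field)
    lora_weights: dict           # sub-genre name -> slider value, assumed in [0, 1]
    denoising_strength: float    # assumed in [0, 1]
    controlnet_scale: float      # shape-retention strength, assumed in [0, 1]
    seed: int                    # assumed to be needed for reproduction

    def __post_init__(self):
        # Reject values outside the assumed slider range.
        values = list(self.lora_weights.values())
        values += [self.denoising_strength, self.controlnet_scale]
        if any(not 0.0 <= v <= 1.0 for v in values):
            raise ValueError("all weights and strengths must lie in [0, 1]")

    def to_json(self) -> str:
        # Sorted keys keep the stored text stable across runs.
        return json.dumps(asdict(self), ensure_ascii=False, sort_keys=True, indent=2)

    @classmethod
    def from_json(cls, text: str) -> "Blueprint":
        return cls(**json.loads(text))

    def transfer_to(self, new_subject: str) -> "Blueprint":
        # Keep the style parameters, swap only the subject.
        return replace(
            self,
            subject_prompt=new_subject,
            lora_weights=dict(self.lora_weights),
        )


if __name__ == "__main__":
    original = Blueprint(
        subject_prompt="a cat plush toy",
        lora_weights={"kimokawa": 0.7, "style_2": 0.3},
        denoising_strength=0.55,
        controlnet_scale=0.8,
        seed=42,
    )
    shared = original.to_json()                # what a third party would receive
    restored = Blueprint.from_json(shared)     # reproduction from the blueprint
    print(restored.transfer_to("a teapot").to_json())
```

Keeping the subject separate from the style parameters is what would make the "transfer to a different subject" step a one-field change in this sketch; how the actual system separates the two is not stated in the abstract.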
