Presentation Information

[1Yin-B-29] Fine-grained Control of Diffusion Models via Soft Labels and Feature Space Coordinates for Effective Data Augmentation

〇Keisuke Hayata1 (1. KONICA MINOLTA, INC.)

Keywords:

Diffusion Models, Data Augmentation, Controllable Generation, Soft Labels, Active Learning

Conventional diffusion-based data augmentation fails to capture intra-class variability, and text- or reference-based control struggles with precise specifications such as samples near class boundaries. This study proposes a fine-grained generative control method that uses feature space coordinates to achieve efficient data augmentation.
Our method extracts features with DINOv3 and assigns soft labels to a large unlabeled dataset via label propagation from a small labeled subset. These soft labels and UMAP-derived coordinates are embedded independently and used as conditioning inputs. By combining soft labels, which express class uncertainty, with coordinates, which express geometric variation, the method achieves fine-grained control that is difficult for existing approaches. Entropy-based filtering further ensures label reliability and stabilizes training.
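As an illustration of the label propagation and entropy-based filtering steps described above, the following is a minimal sketch and not the authors' implementation. It assumes a k-nearest-neighbor graph with iterative propagation and clamped labeled points, uses synthetic 2-D features as a stand-in for DINOv3 features, and omits the UMAP coordinates and the diffusion model entirely. The function names `propagate_labels` and `entropy_filter`, the neighbor count `k`, and the threshold `max_entropy` are all hypothetical choices made for this example.

```python
import numpy as np

def propagate_labels(features, labeled_idx, labels, n_classes, k=5, n_iter=50):
    """Spread labels over a kNN graph; rows of the result are soft labels."""
    n = features.shape[0]
    # Pairwise squared Euclidean distances; exclude self-matches.
    sq = np.sum(features ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    np.fill_diagonal(d2, np.inf)
    # Symmetric kNN adjacency matrix.
    nn = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nn.ravel()] = 1.0
    W = np.maximum(W, W.T)
    # Row-normalize so each step averages the neighbors' label distributions.
    P = W / W.sum(axis=1, keepdims=True)
    # Unlabeled points start uniform; labeled points are one-hot and clamped.
    Y = np.full((n, n_classes), 1.0 / n_classes)
    onehot = np.eye(n_classes)[labels]
    Y[labeled_idx] = onehot
    for _ in range(n_iter):
        Y = P @ Y
        Y[labeled_idx] = onehot
    return Y

def entropy_filter(soft, max_entropy):
    """Return a keep-mask for rows whose Shannon entropy is at most max_entropy."""
    eps = 1e-12  # avoids log(0)
    H = -np.sum(soft * np.log(soft + eps), axis=1)
    return H <= max_entropy, H

# Toy data: two well-separated blobs standing in for extracted features.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(0.0, 0.5, size=(50, 2)),
    rng.normal(10.0, 0.5, size=(50, 2)),
])
labeled_idx = np.array([0, 50])   # one labeled sample per class
labels = np.array([0, 1])

soft = propagate_labels(features, labeled_idx, labels, n_classes=2)
keep, H = entropy_filter(soft, max_entropy=0.5)  # hypothetical threshold
print("kept", int(keep.sum()), "of", len(keep), "samples")
```

In this sketch the entropy of each soft label ranges from 0 (a confident one-hot label) to log(n_classes) (a uniform label), so the threshold controls how much label uncertainty is tolerated in the training set.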
Experiments on MNIST demonstrate that our method trains a high-quality generative model on 60,000 images starting from only 100 labeled samples (10 per class). Compared to conventional methods, the proposed approach achieves superior controllability and image quality. The method enables the generation of critical data near class boundaries and is expected to significantly improve the learning efficiency of downstream classification models.