[2K5-IS-1b-04] Synthetic Remote Sensing Images for Self-Supervised Pre-Training of Vision Transformers | The 39th Annual Conference of the Japanese Society for Artificial Intelligence, 2025

The 39th Annual Conference of the Japanese Society for Artificial Intelligence, 2025

May 27-30, 2025, Osaka International Convention Center + Online

[2K5-IS-1b-04] Synthetic Remote Sensing Images for Self-Supervised Pre-Training of Vision Transformers

〇Luiz Henrique Mormille¹, Iskandar Salama¹, Masayasu Atsumi¹ (1. Soka University)

Advancements in remote sensing image analysis often rely on high-quality datasets and robust model pre-training techniques. This work-in-progress explores the potential of synthetic remote sensing images for domain-specific pre-training of Vision Transformers (ViTs). Using textual inversion, we fine-tune a Stable Diffusion model to generate a large-scale dataset of 1 million high-quality synthetic remote sensing images. These images are then used to pre-train a Vision Transformer on a self-supervised learning task, so that the model learns domain-specific representations. The subsequent step is to transfer the knowledge from the pre-trained model to real-world remote sensing tasks. We hypothesize that pre-training on a large-scale, domain-specific dataset will improve the performance of Vision Transformers when they are fine-tuned for real-world applications, particularly in scenarios where labeled data is limited. In addition to evaluating the impact of domain-specific pre-training on downstream task performance, this study contributes to the research community by making its dataset publicly available, with the aim of facilitating research on the use of synthetic data for remote sensing applications.
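
The abstract does not name the self-supervised task used for pre-training. As an illustration only, the sketch below assumes a masked-image-modeling style objective, in which an image is cut into fixed-size patches (the input tokens of a ViT) and a random subset of patches is hidden from the encoder. The helper names `patchify` and `random_mask`, the patch size, and the 75% mask ratio are hypothetical choices made for this example, not details taken from the work.

```python
import random


def patchify(image, patch_size):
    """Split an H x W image (a list of rows) into non-overlapping square
    patches in row-major order; each patch is flattened to a list of pixels."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [image[top + dy][left + dx]
                     for dy in range(patch_size)
                     for dx in range(patch_size)]
            patches.append(patch)
    return patches


def random_mask(num_patches, mask_ratio, rng):
    """Randomly choose which patch indices are masked.
    Returns (visible_indices, masked_indices), each sorted ascending."""
    num_masked = int(num_patches * mask_ratio)
    shuffled = list(range(num_patches))
    rng.shuffle(shuffled)
    masked = sorted(shuffled[:num_masked])
    visible = sorted(shuffled[num_masked:])
    return visible, masked


# Toy 8 x 8 single-channel grid standing in for one synthetic image tile.
image = [[row * 8 + col for col in range(8)] for row in range(8)]
patches = patchify(image, patch_size=4)

# Assumed mask ratio; the encoder would see only the visible patches.
visible, masked = random_mask(len(patches), mask_ratio=0.75,
                              rng=random.Random(0))
```

In a full pipeline under this assumption, the visible patches would be embedded and passed to the ViT encoder, and a lightweight decoder would be trained to reconstruct the masked patches; no labels are needed, which is what makes a large synthetic dataset usable for pre-training.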
