Presentation Information
[4O4-IS-2b-03] Deep Learning with Reliable Sample Selection for Drosophila 3D Image Classification
〇Md. Humaun Kabir1,2, Md. Al Mehedi Hasan2, Walker Peterson5, Koji Tabata2,3, Masahiro Sonoshita4, Keisuke Goda4,5,6, Tamiki Komatsuzaki1,2,3 (1. Graduate School of Chemical Sciences and Engineering, Hokkaido University, Sapporo 060-8628, Japan, 2. Research Institute for Electronic Science, Hokkaido University, Sapporo 001-0020, Japan, 3. Institute for Chemical Reaction Design and Discovery, Hokkaido University, Sapporo 001-0021, Japan, 4. Division of Biomedical Oncology, Institute for Genetic Medicine, Hokkaido University, Sapporo 060-0815, Japan, 5. Department of Chemistry, The University of Tokyo, Tokyo 113-0033, Japan, 6. Department of Bioengineering, University of California, Los Angeles, CA 90095, USA)
regular
Keywords:
Reliable Sample Selection, Selective Classification, Label Noise, Deep Learning, RISAN
Accurate data quality assessment is essential for the reliable training of deep neural networks (DNNs) in 3D image-based biomedical research. Manual annotation of large-scale 3D datasets is time-consuming, subjective, and prone to label noise, which can significantly degrade model performance. To address these challenges, we develop an automated sample selection framework that identifies clean and reliable training data for DNNs. We evaluate the framework on a label-noisy CIFAR-10 benchmark dataset and on a domain-specific 3D image dataset of Drosophila larvae, labeled Good/Bad and intended for drug screening applications. Raw volumetric data are preprocessed by removing invalid samples, applying Z-score normalization, resizing inputs to match the network architecture, and organizing volumes into three-channel tensors. A pretrained ResNet-18 model serves as the baseline DNN and is integrated with state-of-the-art sample selection and rejection techniques, including TAkS, SelectiveNet, and RISAN. Experimental results demonstrate that the RISAN-based model outperforms the competing approaches in mitigating label noise and in accurately identifying reliable samples. By combining adaptive sample selection during training with rejection at inference, the framework reduces human labeling bias and manual curation effort while improving model stability and data reliability.
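The preprocessing steps named in the abstract (removing invalid samples, Z-score normalization, resizing, and three-channel organization) could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the validity criteria, the nearest-neighbour resize, and in particular the way a volume is mapped to three channels (mean projections of three depth slabs) are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def is_valid(volume):
    """Hypothetical validity check (assumption): a 3D array with at least
    three depth slices, only finite values, and non-constant intensity."""
    return (
        volume.ndim == 3
        and volume.shape[0] >= 3
        and bool(np.isfinite(volume).all())
        and volume.std() > 0
    )

def zscore(volume):
    """Z-score normalization over the whole volume: zero mean, unit variance."""
    return (volume - volume.mean()) / volume.std()

def resize_nearest(image, size):
    """Nearest-neighbour resize of a 2D image to (size, size)."""
    rows = np.arange(size) * image.shape[0] // size
    cols = np.arange(size) * image.shape[1] // size
    return image[rows[:, None], cols]

def to_three_channels(volume, size):
    """Assumed layout: split the depth axis into three slabs and use the
    mean projection of each slab as one channel."""
    slabs = np.array_split(volume, 3, axis=0)
    channels = [resize_nearest(slab.mean(axis=0), size) for slab in slabs]
    return np.stack(channels, axis=0).astype(np.float32)

def preprocess(volumes, size=224):
    """Drop invalid volumes, normalize, and return an (N, 3, size, size) batch."""
    kept = [
        to_three_channels(zscore(v.astype(np.float64)), size)
        for v in volumes
        if is_valid(v)
    ]
    if not kept:
        return np.empty((0, 3, size, size), dtype=np.float32)
    return np.stack(kept)
```

The default `size=224` is likewise an assumption, chosen because it is the input resolution commonly used with ImageNet-pretrained ResNet-18 weights; the abstract states only that inputs are resized to match the network architecture.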
