Presentation Information
[16a-M_110-6] CodeSegNet: Measurement-Domain Segmentation for Snapshot Hyperspectral Imaging Using Single Coded Frames
〇DEEPAK GANESH SHARMA1, RAHUL KUMAR1, HIROYUKI OKINO1 (1.Research and Development Group, Hitachi Ltd.)
Keywords:
Hyperspectral Imaging, Semantic Segmentation, Deep Learning
Hyperspectral imaging (HSI) provides rich spectral–spatial data for material characterization and scene analysis. Conventional segmentation frameworks, such as DSSNet (Dilated Semantic Segmentation Network), operate on fully reconstructed 3D HSI cubes, which demand substantial memory and computation and therefore yield slow inference even with compact models (~200K parameters). This bottleneck impedes practical adoption in time-sensitive applications.
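As a rough illustration of the data-volume gap between the two domains, consider the following back-of-the-envelope comparison (the scene dimensions and band count below are hypothetical, not taken from this work): a reconstructed HSI cube is roughly two orders of magnitude larger than the single coded frame it was recovered from.

# Hypothetical memory comparison: reconstructed HSI cube vs. single coded frame.
# H, W, and BANDS are assumed values chosen only to illustrate the scale.
import numpy as np

H, W, BANDS = 512, 512, 100                    # assumed spatial size and band count
bytes_per_value = np.dtype(np.float32).itemsize

cube_mb  = H * W * BANDS * bytes_per_value / 2**20   # full 3D cube
frame_mb = H * W * 1     * bytes_per_value / 2**20   # one 2D coded measurement

print(f"HSI cube:    {cube_mb:7.1f} MiB")   # 100.0 MiB
print(f"Coded frame: {frame_mb:7.1f} MiB")  #   1.0 MiB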
To address this limitation, CodeSegNet (Coded Image Segmentation Network) is introduced as a novel architecture that directly processes single coded measurements from snapshot-type HSI systems (e.g., CASSI [Coded-Aperture Snapshot Spectral Imager], Fabry–Perot, and metasurface cameras), eliminating the need for computationally intensive HSI cube reconstruction. CodeSegNet learns an end-to-end mapping from coded frames to segmentation masks by integrating spatial texture with filter-aware spectral priors. Because spectral information is already encoded in the measurement, dimensionality can be reduced early and features extracted efficiently: spectral channels are compressed at the input via 1×1 Conv2D, and spatial dimensions are reduced with global average pooling (GAP), maximizing GPU utilization. Processing large contiguous feature blocks accelerates inference relative to cube-based models, which maintain high-dimensional feature maps and rely on fine-grained operations such as dilated convolutions.
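The following minimal PyTorch sketch illustrates the measurement-domain idea described above. It is not the authors' CodeSegNet implementation; the layer widths, the strided encoder, and the way a GAP branch injects global spectral context are assumptions chosen only to show how a 1×1 Conv2D input compression and GAP-based reduction can map a coded frame directly to a segmentation mask.

# Illustrative sketch only; channel counts and structure are assumed, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MeasurementDomainSegNet(nn.Module):
    def __init__(self, in_channels=16, num_classes=5):
        super().__init__()
        # 1x1 Conv2D: compress filter-encoded spectral channels at the input.
        self.compress = nn.Conv2d(in_channels, 8, kernel_size=1)
        # Plain strided convolutions: large contiguous blocks, no dilation.
        self.encoder = nn.Sequential(
            nn.Conv2d(8, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # GAP branch: collapses spatial dims to a scene-level descriptor
        # that is broadcast back onto the feature map.
        self.gap_fc = nn.Linear(64, 64)
        self.head = nn.Conv2d(64, num_classes, kernel_size=1)

    def forward(self, x):                       # x: (B, C_coded, H, W) coded frame
        f = self.compress(x)                    # early spectral compression
        f = self.encoder(f)                     # (B, 64, H/4, W/4)
        g = F.adaptive_avg_pool2d(f, 1)         # GAP -> (B, 64, 1, 1)
        g = self.gap_fc(g.flatten(1)).unsqueeze(-1).unsqueeze(-1)
        f = f + g                               # inject global spectral context
        logits = self.head(f)                   # per-pixel class scores
        return F.interpolate(logits, scale_factor=4, mode="bilinear",
                             align_corners=False)  # back to input resolution

# Example: two 16-channel coded frames at 256x256 -> (2, 5, 256, 256) mask logits.
masks = MeasurementDomainSegNet()(torch.randn(2, 16, 256, 256))
print(masks.shape)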
Simulation-based evaluation demonstrates that CodeSegNet achieves ~93% segmentation accuracy at 4.2 s per frame, less than half the ~10 s per frame required by DSSNet (~98% accuracy). Despite a higher parameter count (~500K), the architectural choices deliver this speedup with minimal accuracy loss, validating measurement-domain segmentation for practical deployment. This paradigm shift from cube-first to measurement-domain inference offers a scalable solution for industrial, biomedical, and environmental monitoring, where rapid and accurate scene understanding is critical.
