Presentation Information
[1Yin-A-28] Memory-Efficient Algorithm for DualSoftmax-based Feature Matching
〇Yoshio Kato1, Shuhei Tarashima1 (1. NTT DOCOMO BUSINESS)
Keywords:
Image Matching, GPGPU, DualSoftmax
DualSoftmax-based Matching (DSM) is widely used in image matching to establish correspondences between local features. However, DSM inherently requires storing the full pairwise similarity matrix, so its memory consumption is O(H^2W^2) for a feature map of size H×W, i.e. quadratic in the number of feature locations HW. This quadratic cost becomes a severe bottleneck for high-precision semi-dense matching architectures such as LoFTR. Following memory-efficient attention implementations such as FlashAttention, we propose FlashDSM, a tiling-based algorithm that performs DSM inference without allocating the entire similarity matrix. FlashDSM reduces memory usage from O(H^2W^2) to O(HW) without any approximation. In our experiments at a resolution of 1280×720, it runs 13.5× faster and uses 50.5× less memory in evaluation than a naive PyTorch implementation.
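The tiling idea can be illustrated with a minimal NumPy sketch. This is not the authors' FlashDSM kernel; it is one possible two-pass formulation under stated assumptions: the dual-softmax score is taken to be the product of the row-wise and column-wise softmax of temperature-scaled similarities, and matches are mutual best pairs above a threshold. The function name `tiled_dual_softmax_match`, its parameters, and the default values are hypothetical and chosen for illustration only.

```python
import numpy as np


def tiled_dual_softmax_match(feat_a, feat_b, temperature=0.1, threshold=0.2, tile=64):
    """Return (i, j) index pairs that are mutual best matches under dual softmax.

    feat_a: (N, D) array, feat_b: (M, D) array. Only one tile x tile block of
    the similarity matrix exists at a time; all other state is O(N + M).
    """
    n, m = feat_a.shape[0], feat_b.shape[0]

    def blocks():
        # Recompute each similarity block on demand instead of storing the matrix.
        for i in range(0, n, tile):
            for j in range(0, m, tile):
                rows, cols = slice(i, i + tile), slice(j, j + tile)
                yield rows, cols, feat_a[rows] @ feat_b[cols].T / temperature

    def fold(run_max, run_sum, s, axis):
        # Online log-sum-exp update: rescale the old sum to the new running max.
        new_max = np.maximum(run_max, s.max(axis=axis))
        shifted = s - np.expand_dims(new_max, axis)
        return new_max, run_sum * np.exp(run_max - new_max) + np.exp(shifted).sum(axis=axis)

    def keep_best(best, arg, val, idx):
        # Keep the larger score and its index; strict ">" keeps the earliest on ties.
        better = val > best
        return np.where(better, val, best), np.where(better, idx, arg)

    # Pass 1: streaming log-sum-exp for every row and every column.
    row_max, row_sum = np.full(n, -np.inf), np.zeros(n)
    col_max, col_sum = np.full(m, -np.inf), np.zeros(m)
    for rows, cols, s in blocks():
        row_max[rows], row_sum[rows] = fold(row_max[rows], row_sum[rows], s, axis=1)
        col_max[cols], col_sum[cols] = fold(col_max[cols], col_sum[cols], s, axis=0)
    row_lse = row_max + np.log(row_sum)
    col_lse = col_max + np.log(col_sum)

    # Pass 2: log of the dual-softmax score per block, tracking best partners.
    row_best, row_arg = np.full(n, -np.inf), np.zeros(n, dtype=np.int64)
    col_best, col_arg = np.full(m, -np.inf), np.zeros(m, dtype=np.int64)
    for rows, cols, s in blocks():
        log_p = 2.0 * s - row_lse[rows, None] - col_lse[None, cols]
        row_best[rows], row_arg[rows] = keep_best(
            row_best[rows], row_arg[rows],
            log_p.max(axis=1), log_p.argmax(axis=1) + cols.start)
        col_best[cols], col_arg[cols] = keep_best(
            col_best[cols], col_arg[cols],
            log_p.max(axis=0), log_p.argmax(axis=0) + rows.start)

    # A pair (i, j) is kept if each is the other's best and the score passes the threshold.
    idx_a = np.arange(n)
    mutual = (col_arg[row_arg] == idx_a) & (row_best >= np.log(threshold))
    return np.stack([idx_a[mutual], row_arg[mutual]], axis=1)
```

The sketch trades compute for memory: every similarity block is computed twice, but the persistent state is a handful of length-N and length-M vectors. A GPU implementation would presumably fuse these loops into kernels operating on on-chip tiles, which this CPU sketch does not attempt to model.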
