Presentation Information
[6B-05] Incorporating Temporal Dynamics and Intricate Spatial-Temporal Dependencies for ActionFormer
*Zhao Kunpeng1, 大北 剛1, 宮崎 朝陽1 (1. Department of Artificial Intelligence, Kyushu Institute of Technology)
Presenter category: Student
Paper type: Long paper
Interactive presentation: Yes
Keywords:
Transformer, Human Activity Recognition, Deep Learning
Human Activity Recognition (HAR) relies on extracting meaningful patterns from sensor or visual data to classify activities. This paper introduces the CE-HAR framework, which incorporates channel-wise enhancement into Transformer-based architectures to improve HAR performance. Two modules, Adaptive Channel-wise Enhancement (AdapCE) and MaxPool Channel-wise Enhancement (MaxCE), address the high temporal dynamics of inertial features and the intricate spatial-temporal dependencies of visual features, respectively. Experiments on the WEAR dataset show that CE-HAR yields substantial gains over the baseline ActionFormer: average mAP improves by 16.01% on inertial data and by 7.8% on visual data. Additional evaluations on benchmark datasets confirm the robustness and effectiveness of the proposed approach. Our findings highlight the importance of channel-wise information processing and open new avenues for advancing HAR techniques.
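The abstract names the two modules but not their internals. As a rough illustration of what channel-wise enhancement over 1-D feature sequences can look like, the sketch below assumes a squeeze-and-excitation-style gate: pool each channel over the temporal axis, score it with a small bottleneck MLP, and rescale the input. The class names AdapCE and MaxCE follow the abstract's terminology, but every implementation detail (pooling choice, reduction ratio, tensor layout) is an assumption, not the authors' published method.

    # Hypothetical sketch of channel-wise enhancement, NOT the paper's code.
    # Assumes ActionFormer-style 1-D features of shape (batch, channels, time).
    import torch
    import torch.nn as nn

    class AdapCE(nn.Module):
        """Adaptive Channel-wise Enhancement (sketch): average-pool over time,
        then gate each channel with a small bottleneck MLP."""

        def __init__(self, channels: int, reduction: int = 4):
            super().__init__()
            self.gate = nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
                nn.Sigmoid(),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            w = self.gate(x.mean(dim=-1))   # squeeze time: (B, C, T) -> (B, C)
            return x * w.unsqueeze(-1)      # rescale each channel

    class MaxCE(AdapCE):
        """MaxPool Channel-wise Enhancement (sketch): same gating, but the
        temporal squeeze uses max pooling to keep peak activations."""

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            w = self.gate(x.amax(dim=-1))   # max over time: (B, C, T) -> (B, C)
            return x * w.unsqueeze(-1)

    # Usage: enhance 256-channel feature sequences of length 2048.
    feats = torch.randn(2, 256, 2048)
    print(AdapCE(256)(feats).shape)  # torch.Size([2, 256, 2048])
    print(MaxCE(256)(feats).shape)   # torch.Size([2, 256, 2048])

The only difference between the two sketches is the temporal squeeze: mean pooling tracks slowly varying channel statistics, while max pooling preserves peak responses, which loosely mirrors the inertial/visual split described in the abstract.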