Presentation Information
[5O3-IS-5b-02] Embedded Smart Voice Control System
HusanFu Wang2, KuanYu Lee2, 〇SULIN CHI1, ChinChin Lin2, YuJui Chen2, KaiHung Fang2 (1. Otemon Gakuin University, 2. National Formosa University)
regular
Keywords:
Mel-Frequency Cepstral Coefficients (MFCC)、Voiceprint Recognition、Deep Learning、Long Short-Term Memory (LSTM)、Human-Computer Interaction (HCI)
With the increasing demand from Internet of Things (IoT) and smart home technologies for advanced human–computer interaction, this study proposes a novel interaction interface based on user-specific clap sound-print recognition as an alternative to traditional voice wake-up words. The proposed method addresses privacy concerns and reduces the accidental triggering inherent in voice-based systems. A portable audio collection device is constructed using a low-cost microcontroller and a digital microphone, and a four-class dataset is collected. Raw audio signals are amplified and segmented into ten time steps, from which 72-dimensional Mel-Frequency Cepstral Coefficient (MFCC) features, including first- and second-order derivatives, are extracted to form a (10, 72) temporal feature sequence. An LSTM network is used for classification, achieving a recognition accuracy of 97.25%. A real-time recognition system is further implemented using the Flask framework, enabling user-specific task execution. Experimental results demonstrate the feasibility and practical value of the proposed system as a personalized smart control hub.
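A minimal NumPy sketch of the (10, 72) feature assembly described in the abstract. The 24/24/24 split into base MFCCs, first-order, and second-order derivatives is an assumption (the abstract only states the 72-dimensional total including derivatives), and the random placeholder stands in for base MFCCs that would in practice come from a library such as librosa:

```python
import numpy as np

# Assumption (not stated in the abstract): 72 = 24 base MFCCs
# + 24 first-order deltas + 24 second-order deltas.
TIME_STEPS = 10
N_MFCC = 24

def stack_mfcc_derivatives(mfcc: np.ndarray) -> np.ndarray:
    """Append first- and second-order derivatives along the time axis."""
    delta = np.gradient(mfcc, axis=0)    # first-order derivative
    delta2 = np.gradient(delta, axis=0)  # second-order derivative
    return np.concatenate([mfcc, delta, delta2], axis=1)

# Placeholder for the base MFCCs of one clap segment, shape (10, 24);
# in a real pipeline this would come from e.g. librosa.feature.mfcc.
base_mfcc = np.random.randn(TIME_STEPS, N_MFCC)
features = stack_mfcc_derivatives(base_mfcc)
print(features.shape)  # (10, 72) -- the temporal sequence fed to the LSTM
```

The resulting (10, 72) array matches the sequence shape the abstract feeds to the LSTM classifier, with each of the ten time steps carrying one 72-dimensional feature vector.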
