Presentation Information

10:35 AM - 11:05 AM JST (1:35 AM - 2:05 AM UTC)

[16a-K204-3] Speech Processing based on Deep Learning Technologies

〇Akinori Ito¹ (1.Tohoku Univ.)

Keywords: Speech processing, Deep learning, End-to-end models

The advent of deep learning has brought about a paradigm shift in speech processing. End-to-end models, capable of processing raw audio waveforms, have become the norm. These models have significantly improved the performance of tasks such as speech recognition and synthesis. However, challenges such as data scarcity and ethical concerns, particularly those related to deepfakes, persist. Future research should focus on developing models that are more robust to diverse data and on addressing the ethical implications of these technologies.
