LDCCRN: Robust deep learning-based speech enhancement

Chun-Yin YEUNG, Wai Yin Steve MUNG, Yat Sze CHOY, Daniel P. K. LUN

Research output: Chapter in Book/Report/Conference proceeding › Chapters

Abstract

Deep learning-based speech enhancement methods exploit their non-linearity to estimate the speech and noise signals, particularly non-stationary noise. DCCRN, in particular, achieves state-of-the-art performance in speech intelligibility. However, the non-linearity also raises concerns about the robustness of the method: novel and unexpected noises can be generated when the noisy input speech falls outside the operating conditions of the network. In this paper, we propose a hybrid framework called LDCCRN, which integrates a traditional speech enhancement method, LogMMSE-EM, with DCCRN. The proposed framework leverages the strengths of both approaches to improve robustness in speech enhancement. While DCCRN continues to remove the non-stationary noise in the speech, any novel noises it generates are effectively suppressed by LogMMSE-EM. As shown in our experimental results, the proposed method outperforms traditional approaches under standard evaluation metrics. Copyright © 2022 Society of Photo-Optical Instrumentation Engineers (SPIE).

Original language: English
Title of host publication: Proceedings of International Workshop on Advanced Imaging Technology (IWAIT) 2022
Publisher: SPIE
ISBN (Electronic): 9781510653313
DOI: 10.1117/12.2626108
Publication status: Published - Apr 2022

Citation

Yeung, C.-Y., Mung, S. W. Y., Choy, Y. S., & Lun, D. P. K. (2022). LDCCRN: Robust deep learning-based speech enhancement. In Proceedings of International Workshop on Advanced Imaging Technology (IWAIT) 2022. Retrieved from https://doi.org/10.1117/12.2626108
