AL-Lisaniyyat
Volume 30, Number 2, Pages 148-169
2024-12-30
Authors: Djeffal Noussaiba, Addou Djamel, Kheddar Hamza, Selouani Sid Ahmed
This paper presents a hybrid Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) approach to Automatic Speech Recognition (ASR), using deep learning techniques on the Aurora-2 dataset. The dataset provides clean and multi-condition modes; the multi-condition mode covers four noise scenarios (subway, babble, car, and exhibition hall), each evaluated at several signal-to-noise ratios (SNRs), in addition to the clean condition. The results are also compared with those obtained on the ASC-10 and ESC-10 datasets. The problem addressed is the need for ASR models that remain robust in both clean and noisy environments. The CNN-LSTM architecture aims to improve recognition performance by combining the strengths of CNNs and LSTMs rather than relying on either alone. Experimental results show that, in clean conditions on Aurora-2, the hybrid CNN-LSTM model attains an accuracy of 97.96%, surpassing the standalone CNN and LSTM models, which achieve 97.21% and 96.06%, respectively. In noisy conditions the hybrid model again outperforms the standalone models, with an accuracy of 90.72% compared with 90.12% for the CNN and 86.12% for the LSTM. These findings indicate that the CNN-LSTM model handles the tested noise conditions more effectively and improves overall ASR accuracy.
Keywords: ASR, CNN, LSTM, clean speech, noisy speech, CNN-LSTM, DNN, SNR.
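The accuracy figures quoted in the abstract can be put side by side with a few lines of arithmetic. The sketch below is illustrative only: the accuracies are those reported above, while the `reported` dictionary, the `margins` helper, and the relative error reduction metric are assumptions introduced here and are not taken from the paper.

```python
# Accuracies (%) as quoted in the abstract for the Aurora-2 dataset.
# The dictionary layout and helper name are hypothetical, for illustration only.
reported = {
    "clean": {"CNN-LSTM": 97.96, "CNN": 97.21, "LSTM": 96.06},
    "noisy": {"CNN-LSTM": 90.72, "CNN": 90.12, "LSTM": 86.12},
}


def margins(condition, hybrid="CNN-LSTM"):
    """Compare the hybrid model with each standalone baseline.

    Returns {baseline: (absolute gain in percentage points,
                        relative error-rate reduction in %)}.
    """
    scores = reported[condition]
    hybrid_err = 100.0 - scores[hybrid]  # error rate of the hybrid model
    result = {}
    for name, acc in scores.items():
        if name == hybrid:
            continue
        base_err = 100.0 - acc  # error rate of the baseline
        gain = scores[hybrid] - acc
        rel_reduction = 100.0 * (base_err - hybrid_err) / base_err
        result[name] = (round(gain, 2), round(rel_reduction, 1))
    return result


for cond in reported:
    print(cond, margins(cond))
```

The absolute gain is the simple difference between the reported accuracies. The relative error reduction is a derived quantity, not a figure stated by the authors; it expresses the same gap as a share of the baseline's error rate.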