Enhancing Speaker Recognition Robustness with Scalable Deep Learning Models and MFCC Features

Yasir Hussein Shakir
Eshaq Aziz Awadh AL Mandhari
Ali Alkhazraji
Reem Ali Mutlag

Abstract

Speaker recognition is the process of distinguishing individual speakers within sound recordings or streams. Several factors contribute to the task's complexity, including variations in structure, overlapping sound events, and the presence of multiple noise sources in the recorded audio. Despite the many algorithms developed to extract this information for identification purposes, capturing speaker-specific attributes from an often intricate sound mixture remains difficult for machines. Earlier methods used discriminative models to decode voice data, but with increasing computational capability, generative models are gaining ground. While these models work for various types of speech that lack clear transitions or clarity, their scalability is questionable. To address this issue, this paper trains deep learning models, namely the Feed Forward Neural Network (FFNN), Forward Cascade Back Propagation (FCBP), and Elman Propagation Neural Network (EPNN), on different databases in a way that addresses the models' scalability problems.

How to Cite

Enhancing Speaker Recognition Robustness with Scalable Deep Learning Models and MFCC Features. (2025). East Journal of Computer Science, 1(5), 1-16. https://doi.org/10.63496/ejcs.Vol1.Iss5.185
