Paper Details
Cite this
Jayson C. Sepeda, Louise Dawn F. Santos, and Prof. Leisyl Mahusay, 2025. An Enhancement of Gated Recurrent Unit (GRU) for Speech Emotion Recognition in the Implementation of Voice-Based Danger Recognition System. United International Journal for Research & Technology (UIJRT). 6(3), p17-25.
Abstract
As technology advances, fields such as Speech Emotion Recognition (SER) continue to develop. SER, which recognizes human emotions from speech signals, is used in a variety of services, and Gated Recurrent Units (GRU) are a prominent choice of model for it. However, GRU models are prone to overfitting: the model fits the training data too closely and fails to generalize, which causes poor performance on unseen data. This study aims to simulate overfitting in a GRU-based SER model and to resolve it effectively. Several optimizations were tested to determine which best mitigates overfitting and improves test accuracy, the measure of how well the model performs on unseen data. Of the solutions tested, the combination of Dropout (20%), Batch Normalization, and Xavier/Glorot Initialization produced the largest improvement. The base model reached only 63.21% accuracy on test data and exhibited overfitting, whereas the optimized model mitigated overfitting and raised test accuracy to 67.34%, a gain of 4.13 percentage points. These results indicate that the enhanced model is less prone to overfitting, which yields higher test accuracy. The enhanced model is further implemented in a danger recognition system that aims to strengthen safety, especially for the speech impaired.
Keywords: Speech Emotion Recognition (SER), Gated Recurrent Units (GRU), Overfitting, Danger Recognition System, GRU Optimization.
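The three techniques named in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation and it does not reproduce their GRU model. The function names (`glorot_uniform`, `dropout`, `batch_norm`) and the layer sizes used in the example are hypothetical, chosen only for illustration. The sketch shows the Glorot uniform bound, inverted dropout at the 20% rate mentioned above, and per-feature batch normalization without learned scale and shift parameters.

```python
import math
import random


def glorot_uniform(fan_in, fan_out, rng):
    """Xavier/Glorot uniform init: sample from U(-limit, limit),
    where limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


def dropout(values, rate, rng, training=True):
    """Inverted dropout: zero each value with probability `rate` and
    rescale survivors by 1 / (1 - rate) so the expected value is kept.
    At inference time (training=False) the input passes through unchanged."""
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def batch_norm(batch, eps=1e-5):
    """Normalize each feature column over the batch to roughly zero mean
    and unit variance (simplified: no learned scale/shift, no running stats)."""
    n = len(batch)
    cols = list(zip(*batch))
    means = [sum(c) / n for c in cols]
    variances = [sum((x - m) ** 2 for x in c) / n
                 for c, m in zip(cols, means)]
    return [[(x - m) / math.sqrt(v + eps)
             for x, m, v in zip(row, means, variances)]
            for row in batch]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical sizes: 40 input features, 64 hidden units.
    weights = glorot_uniform(40, 64, rng)
    dropped = dropout([1.0] * 10, rate=0.2, rng=rng)  # 20% dropout
    normed = batch_norm([[1.0, 2.0], [3.0, 6.0]])
    print(len(weights), len(weights[0]))
    print(dropped)
    print(normed)
```

In a deep learning framework these steps are normally provided as built-in layers and initializers; the plain-Python version is only meant to make the arithmetic behind each technique visible.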