Research Article Open Access

Deep Learning Models for Speech Emotion Recognition

V.M. Praseetha1 and Sangil Vadivel1
  • 1 Birla Institute of Technology and Science Pilani, United Arab Emirates

Abstract

Emotions play a vital role in efficient and natural human-computer interaction. Recognizing human emotions from speech is a challenging task when accuracy, robustness and latency are all considered. Recent advances in deep learning now make it possible to achieve better accuracy, greater robustness and low latency when modeling complex functions. In our experiments we developed two deep learning models for emotion recognition from speech. We compare the performance of a feed-forward Deep Neural Network (DNN) with that of a recently developed type of Recurrent Neural Network (RNN) known as the Gated Recurrent Unit (GRU). GRUs have not yet been explored for classifying emotions from speech. The DNN model achieves an accuracy of 89.96% and the GRU model achieves an accuracy of 95.82%. Our experiments show that the GRU model performs considerably better than the DNN model on emotion classification.
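For readers unfamiliar with the GRU, the following is a minimal sketch of a single GRU update applied frame by frame to an utterance. It is not the authors' implementation: the helper names (`gru_step`, `init_params`, `encode_utterance`), the layer sizes and the random initialization are all assumptions made for illustration, and the gate convention shown (the new state equals `(1 - z) * h + z * h_tilde`) is one of two equivalent conventions in common use.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def gru_step(x, h, p):
    """One GRU update. x: feature vector of one speech frame, h: previous state."""
    # Update gate z and reset gate r, each in (0, 1).
    z = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["Wz"], x), matvec(p["Uz"], h), p["bz"])]
    r = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["Wr"], x), matvec(p["Ur"], h), p["br"])]
    # Candidate state is computed from the reset-gated previous state.
    rh = [ri * hi for ri, hi in zip(r, h)]
    h_tilde = [math.tanh(a + b + c) for a, b, c in
               zip(matvec(p["Wh"], x), matvec(p["Uh"], rh), p["bh"])]
    # Interpolate between the previous state and the candidate.
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, h_tilde)]

def init_params(n_in, n_hid, seed=0):
    # Hypothetical small random initialization, for illustration only.
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    p = {}
    for g in "zrh":
        p["W" + g] = mat(n_hid, n_in)
        p["U" + g] = mat(n_hid, n_hid)
        p["b" + g] = [0.0] * n_hid
    return p

def encode_utterance(frames, p, n_hid):
    # Run the GRU over all frames; the final state summarizes the utterance.
    h = [0.0] * n_hid
    for x in frames:
        h = gru_step(x, h, p)
    return h
```

In a classifier of the kind the abstract describes, the final state returned by `encode_utterance` would typically be passed to a softmax layer over the emotion classes. A feed-forward DNN, by contrast, has no recurrent state and sees a fixed-size feature vector.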

Journal of Computer Science
Volume 14 No. 11, 2018, 1577-1587

DOI: https://doi.org/10.3844/jcssp.2018.1577.1587

Submitted On: 11 August 2018; Published On: 26 November 2018

How to Cite: Praseetha, V. & Vadivel, S. (2018). Deep Learning Models for Speech Emotion Recognition. Journal of Computer Science, 14(11), 1577-1587. https://doi.org/10.3844/jcssp.2018.1577.1587

Keywords

  • Deep Learning
  • Neural Network
  • Deep Neural Network
  • Recurrent Neural Network
  • Gated Recurrent Unit