Affective Model Based Speech Emotion Recognition Using Deep Learning Techniques

Authors

  •   D. Karthika Renuka, Department of IT, PSG College of Technology, Coimbatore 641 004, Tamil Nadu
  •   C. Akalya Devi, Department of IT, PSG College of Technology, Coimbatore 641 004, Tamil Nadu
  •   R. Kiruba Tharani, Department of IT, PSG College of Technology, Coimbatore 641 004, Tamil Nadu
  •   G. Pooventhiran, Department of IT, PSG College of Technology, Coimbatore 641 004, Tamil Nadu

DOI:

https://doi.org/10.17010/ijcs/2020/v5/i4-5/154783

Keywords:

Emotion Recognition, RNN, Speech, Neural Network.

Manuscript Received: May 22, 2020; Revised: August 10, 2020; Accepted: August 16, 2020. Date of Publication: September 5, 2020.

Abstract

Human beings express emotions in multiple ways, most commonly through writing, speech, facial expression, body language, and gesture. Emotions are generally understood to be, first and foremost, internal feelings and experiences. Speech is a powerful form of communication that carries the speaker's emotions along with the words. Specific prosodic cues, such as pitch variation, frequency, speech rate, rhythm, and voice quality, allow speakers to express, and listeners to interpret and decode, the full spoken message. This paper aims to establish an affective-model-based speech emotion recognition system using deep learning techniques, such as an RNN with LSTM, on German and English language datasets.


Published

2020-10-01

How to Cite

Karthika Renuka, D., Akalya Devi, C., Kiruba Tharani, R., & Pooventhiran, G. (2020). Affective Model Based Speech Emotion Recognition Using Deep Learning Techniques. Indian Journal of Computer Science, 5(4&5), 9–17. https://doi.org/10.17010/ijcs/2020/v5/i4-5/154783
