Paper title:

A Study of the Effect of Emotional State upon the Variation of the Fundamental Frequency of a Speaker

Published in: Issue 1, (Vol. 4) / 2010
Publishing date: 2010-03-30
Pages: 79-82
Author(s): Marius V. Giurcau, Adrian Lodin, Corneliu Rusu
Abstract. Telephone banking and brokering, building access systems, and forensics are some of the areas in which speaker recognition is continuously developing. The fundamental frequency is an important speech feature used in these applications. In this paper we present a study of the effect of the emotional state of a speaker upon the variation of the fundamental frequency of the speech signal. Human beings frequently experience various emotions, and most of the time these emotional states cannot really be controlled. For the purpose of our work we have used the Berlin emotional speech database, which contains utterances of 10 speakers in different emotional situations: happy, angry, fearful, bored and neutral. The mean fundamental frequency and the standard deviation were computed for every speaker in each emotional state. The results show a very strong influence of the emotional state upon the variation of the fundamental frequency.
Keywords: Speech Signal Processing, Speaker Recognition, Fundamental Frequency, Emotions.
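The per-speaker statistics described in the abstract (mean F0 and standard deviation across frames) can be illustrated with a minimal sketch. The paper does not state which pitch extractor it uses (WaveSurfer [15] is referenced), so this example assumes a basic autocorrelation estimator applied to synthetic sine-wave frames standing in for voiced speech; the frequencies 220 Hz and 330 Hz are hypothetical, chosen only to illustrate the computation.

```python
import numpy as np

def estimate_f0(frame, fs, fmin=50.0, fmax=500.0):
    """Estimate F0 of one voiced frame via autocorrelation
    (a common textbook method, assumed here for illustration)."""
    frame = frame - np.mean(frame)
    # Autocorrelation for non-negative lags only
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    # Restrict the lag search to the plausible pitch range
    lag_min = int(fs / fmax)
    lag_max = int(fs / fmin)
    lag = lag_min + np.argmax(corr[lag_min:lag_max])
    return fs / lag

fs = 16000
t = np.arange(0, 0.032, 1 / fs)  # one 32 ms analysis frame

# Hypothetical stand-ins for two voiced frames of one speaker
f0s = [estimate_f0(np.sin(2 * np.pi * f * t), fs) for f in (220.0, 330.0)]

# Per-speaker statistics as described in the abstract
mean_f0 = np.mean(f0s)
std_f0 = np.std(f0s)
```

In the study, the same mean/standard-deviation computation would be repeated over all voiced frames of each speaker in each of the five emotional states, and the resulting statistics compared across emotions.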
References:

1. T. D. Ganchev, Speaker Recognition, PhD thesis, University of Patras, Greece, 2005.

2. D. A. Reynolds, R. C. Rose, “Robust text-independent speaker identification using Gaussian mixture speaker models”, IEEE Trans. Speech and Audio Processing, vol. 3, no. 1, pp. 72-82, 1995.

3. R. Saeidi, H. R. Sadegh Mohammadi, R. D. Rodman, T. Kinnunen, “A new segmentation algorithm combined with transient frames power for text independent speaker verification”, in Proc. IEEE ICASSP ’07, Hawaii, US, 2007.

4. T. Kinnunen, E. Karpov, P. Franti, “Real-time speaker identification and verification”, IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 1, pp. 277-288, Jan. 2006.

5. M. Ghiurcau, C. Rusu, “A Study of the Effect of the Emotional State upon Text-Independent Speaker Identification”, unpublished.

6. W. Huang, J. Chao, Y. Zhang, “Combination of pitch and MFCC GMM supervectors for speaker verification”, ICALIP 2008, July 2008, Shanghai.

7. H. Ezzaidi, J. Rouat, “Pitch and MFCC dependent models for speaker identification systems”, CCECE ’04, May 2004, Canada.

8. T. Kinnunen, R. G. Hautamaki, “Long-Term F0 Modeling for Text-Independent Speaker Recognition”.

9. F. Nolan, “The phonetic bases of Speaker recognition”, Cambridge University Press, Cambridge, 1983.

10. P. Rose, “Forensic Speaker Identification”, Taylor & Francis, New York, 2002.

11. E. G. Hautamaki, “Fundamental Frequency Estimation and Modeling for Speaker Recognition”, Master’s Thesis, University of Joensuu, Finland, 2005.

12. I. R. Titze, Principles of Voice Production, Prentice Hall, 1994.

13. http://pascal.kgw.tu-berlin.de/emodb/start.html.

14. F. Burkhardt, A. Paeschke, M. Rolfes, W. Sendlmeier, “A Database of German Emotional Speech”, Interspeech, 2005.

15. http://www.speech.kth.se/wavesurfer/.

This article is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.