54 / 2023-08-30 12:38:24
Improved Vocal Tract Length Perturbation for Improving Child Speech Emotion Recognition
child speech emotion recognition, vocal tract length normalisation, vocal tract length perturbation, data augmentation
Final draft
Shenkang Qu / Dongguan University of Technology
Yong Qin / Dongguan University of Technology
Xuwen Qin / Shenzhen International Exchange College
Ziliang Ren / Dongguan University of Technology
Lei Chen / Dongguan University of Technology
Due to the variability of children's acoustic features during growth and the lack of a publicly available corpus of children's speech emotion data, training a robust Speech Emotion Recognition (SER) system for children is more challenging than for adults. In this paper we develop a new SER learning solution for children, which generates a child-like speech spectrum from an adult's speech spectrum based on vocal tract length normalization (VTLN) and vocal tract length perturbation (VTLP). The transformed adult spectrum is then used as augmented data to improve the child speech emotion recognition system. To validate the effectiveness of this data augmentation method, we test it on two mainstream SER networks, based on Compact Convolutional Transformers (CCT) and Convolutional Neural Networks (CNN), respectively. On the FAU Aibo children's emotion corpus, the proposed method improves accuracy and unweighted average recall (UAR) by 0.32% and 0.22%, respectively, over the VTLP baseline system when trained with CCT. When trained with CNN, accuracy and UAR improve by 0.39% and 0.26%, respectively, over the VTLP baseline system.
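For readers unfamiliar with VTLP, the following is a minimal sketch of the commonly used piecewise-linear frequency warp applied to a magnitude spectrogram. It illustrates the baseline technique only and is not the authors' improved method, which the abstract does not specify. The function names, the boundary frequency `f_hi=4800.0`, and the reading that a warp factor `alpha` above 1 (formants shifted upward) gives a more child-like spectrum are assumptions made for illustration.

```python
import numpy as np


def vtlp_warp_frequencies(freqs, alpha, sample_rate, f_hi=4800.0):
    """Piecewise-linear VTLP warp of frequencies in Hz (hypothetical helper).

    Below a boundary frequency the axis is scaled by alpha; above it, a second
    linear segment maps the remainder so that Nyquist stays fixed.
    Assumes f_hi is below the Nyquist frequency.
    """
    nyquist = sample_rate / 2.0
    freqs = np.asarray(freqs, dtype=float)
    boundary = f_hi * min(alpha, 1.0) / alpha
    low = freqs * alpha
    slope = (nyquist - f_hi * min(alpha, 1.0)) / (nyquist - boundary)
    high = nyquist - slope * (nyquist - freqs)
    return np.where(freqs <= boundary, low, high)


def warp_spectrogram(spec, alpha, sample_rate, f_hi=4800.0):
    """Warp a (n_freq_bins, n_frames) spectrogram along its frequency axis.

    Energy originally at frequency f is moved to warp(f); the result is
    resampled onto the original uniform frequency grid by linear interpolation.
    """
    n_bins = spec.shape[0]
    grid = np.linspace(0.0, sample_rate / 2.0, n_bins)
    warped = vtlp_warp_frequencies(grid, alpha, sample_rate, f_hi)
    out = np.empty_like(spec, dtype=float)
    for t in range(spec.shape[1]):
        # warped is monotonically increasing, so it is a valid x-axis for interp
        out[:, t] = np.interp(grid, warped, spec[:, t])
    return out


# Toy example: a single spectral peak at 1000 Hz in a 16 kHz signal
demo = np.zeros((161, 1))
demo[20, 0] = 1.0  # bin spacing is 50 Hz, so bin 20 is 1000 Hz
shifted = warp_spectrogram(demo, alpha=1.2, sample_rate=16000)
```

In an augmentation pipeline, `alpha` would typically be drawn at random per utterance from a small range around the target value; the range itself is a tuning choice not given in the abstract.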

 
Important dates
  • Conference dates: November 2, 2023 to November 4, 2023
  • December 15, 2023: draft submission deadline
  • December 20, 2023: registration deadline

Organizers
IEEE Instrumentation and Measurement Society
Xidian University