ZHONG Shunming, KUANG Peng, ZHUANG Haoshuang, FENG Hande, WANG Jianying, ZHANG Han. A Robust Gender Recognition Scheme for Telephone Speech Based on PNCC and Fundamental Frequency[J]. Journal of South China Normal University (Natural Science Edition), 2019, 51(6): 118-122. doi: 10.6054/j.jscnun.2019111

A Robust Gender Recognition Scheme for Telephone Speech Based on PNCC and Fundamental Frequency

  • Abstract: To address the low accuracy of gender detection on telephone speech, an effective gender detection scheme (CNN+SVM) is proposed. First, a Convolutional Neural Network (CNN) extracts the useful information from Power-Normalized Cepstral Coefficients (PNCC). Then, these CNN-derived features are combined with optimized fundamental frequency features, and a Support Vector Machine (SVM) performs the gender classification. The scheme effectively fuses the differences between male and female speech in both pronunciation and auditory perception characteristics, while exploiting the feature extraction ability of the CNN and the robust classification ability of the SVM. Simulation results show that the CNN+SVM scheme achieves higher gender recognition accuracy than traditional recognition methods on a telephone speech data set collected in practical scenarios.
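
The abstract does not say how the fundamental frequency feature is computed or optimized, so the following is only a minimal sketch of one common way to estimate it: picking the lag that maximizes the autocorrelation of a voiced frame. It is not the authors' method. The function names (`estimate_f0`, `sine_frame`), the 8 kHz sample rate (typical of telephone speech), and the 60 to 400 Hz search range are all assumptions made for illustration.

```python
import math


def estimate_f0(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of one voiced frame.

    Generic autocorrelation pitch estimator (illustrative assumption,
    not the scheme described in the paper): the lag with the largest
    autocorrelation inside the search range is taken as the pitch period.
    """
    lag_min = int(sample_rate / f_max)  # shortest period considered
    lag_max = int(sample_rate / f_min)  # longest period considered
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def sine_frame(freq, sample_rate=8000, duration=0.04):
    """Synthesize a pure tone as a stand-in for a voiced speech frame."""
    n = int(sample_rate * duration)
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]


if __name__ == "__main__":
    # Hypothetical test tones, one in a lower and one in a higher pitch range.
    for tone in (125.0, 200.0):
        print(tone, estimate_f0(sine_frame(tone), 8000))
```

In a pipeline like the one summarized above, a per-utterance statistic of such frame-level estimates could be appended to the CNN-derived PNCC features before classification; how the paper actually performs this fusion is not stated in the abstract.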

     
