Journal of Lanzhou University of Technology ›› 2021, Vol. 47 ›› Issue (5): 93-98.

• Automation Technology and Computer Technology •

Speaker recognition algorithm based on multi-featured I-Vector

ZHAO Hong* (赵宏), YUE Lu-peng (岳鲁鹏), CHANG Zhao-bin (常兆斌), WANG Wei-jie (王伟杰)

  1. College of Computer and Communication, Lanzhou Univ. of Tech., Lanzhou 730050, Gansu, China
  • Received: 2019-12-19 Online: 2021-10-28 Published: 2021-11-18
  • About the author: ZHAO Hong (born 1971), male, from Xihe, Gansu; Ph.D., professor, doctoral supervisor. Email: 594286500@qq.com
  • Funding:
    National Natural Science Foundation of China (51668043); CERNET Next Generation Internet Technology Innovation Project (NGII20160311, NGII20160112)

Abstract: A single acoustic feature cannot identify a speaker accurately and efficiently. To address this problem, a speaker recognition algorithm based on multi-featured I-Vector is proposed. First, different acoustic features are extracted and combined into a high-dimensional feature vector. Then, principal component analysis (PCA) is used to remove the correlation among the components of this vector, so that the features become mutually orthogonal. Finally, probabilistic linear discriminant analysis (PLDA) is used for modeling and scoring, which also reduces the spatial dimension to a certain degree. Experiments were carried out on the TIMIT corpus with the Kaldi speech recognition toolkit. Compared with the currently popular I-Vector systems based on a single feature, namely Mel-frequency cepstral coefficients (MFCC) or perceptual linear predictive (PLP) coefficients, the proposed algorithm improves the equal error rate (EER) by 8.18% and 1.71%, respectively, and reduces the model training time by 60.4% and 47.5%, respectively. The proposed algorithm therefore offers better recognition performance and efficiency.

Key words: speaker recognition algorithm, multi-featured I-Vector, principal component analysis, probabilistic linear discriminant analysis, Kaldi speech recognition toolkit
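
The feature-fusion and PCA steps summarized in the abstract, together with the equal error rate used for evaluation, can be illustrated with a minimal NumPy sketch. This is not the authors' Kaldi pipeline: the function names `fuse_and_decorrelate` and `equal_error_rate` are hypothetical, the "MFCC-like" and "PLP-like" matrices are synthetic random stand-ins rather than real acoustic features, and I-Vector extraction and PLDA scoring are omitted entirely. The sketch only shows that projecting concatenated, correlated features onto the PCA eigenvectors yields components with (numerically) zero cross-covariance, and how an EER can be approximated from two sets of trial scores.

```python
import numpy as np


def fuse_and_decorrelate(feature_sets, n_components):
    """Concatenate frame-aligned feature matrices and decorrelate them with PCA.

    feature_sets: list of arrays, each of shape (n_frames, dim_i).
    Returns the projected features of shape (n_frames, n_components).
    """
    fused = np.hstack(feature_sets)          # high-dimensional fused vector per frame
    centered = fused - fused.mean(axis=0)    # PCA operates on zero-mean data
    cov = np.cov(centered, rowvar=False)     # (D, D) covariance of the fused features
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest components
    return centered @ eigvecs[:, order]


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER as the point where false-accept and false-reject rates are closest."""
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))
    far = np.array([(nontarget_scores >= t).mean() for t in thresholds])
    frr = np.array([(target_scores < t).mean() for t in thresholds])
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


# Synthetic stand-ins (assumption): two correlated 13-dimensional feature streams.
rng = np.random.default_rng(0)
mfcc_like = rng.normal(size=(500, 13))
plp_like = 0.8 * mfcc_like + 0.2 * rng.normal(size=(500, 13))

projected = fuse_and_decorrelate([mfcc_like, plp_like], n_components=13)

# After PCA the retained components should be uncorrelated with each other.
proj_cov = np.cov(projected, rowvar=False)
off_diag = proj_cov - np.diag(np.diag(proj_cov))
print("max off-diagonal covariance:", np.abs(off_diag).max())

# Toy trial scores (assumption): perfectly separable, so the EER is zero.
print("EER:", equal_error_rate([1.0, 2.0, 3.0], [-1.0, 0.0, 0.5]))
```

In a real system the projected features would feed the I-Vector extractor, and the trial scores passed to the EER computation would come from PLDA rather than from hand-written lists.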

CLC Number: