Why DNN online decoding does not use CMVN


When I traced the online2/online-feature-pipeline.cc source code, I found that the feature extraction pipeline in the GMM decoder looks like this (assuming no pitch):
OnlineMfcc -> OnlineCmvn -> OnlineSpliceFrames -> OnlineTransform

But in the online2/online-nnet2-feature-pipeline.cc source code, the feature extraction pipeline in the DNN decoder looks like this (assuming no pitch and no i-vector):
OnlineMfcc

My questions are:
1. Why is online CMVN not applied in the feature extraction pipeline of the DNN decoder?
2. How can online CMVN be applied in the feature extraction pipeline of the DNN decoder?

I tried applying CMVN in the DNN decoder's feature extraction, GMM-style, but this reduced the accuracy.
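For readers unfamiliar with what the CMVN stage contributes, here is a minimal sketch of sliding-window cepstral mean normalization, the mean-only part of CMVN. It is an illustration under stated assumptions, not the OnlineCmvn implementation: the function name `sliding_cmn`, the window size, and the toy feature values are all hypothetical. It shows that a constant per-dimension offset (such as a channel effect in the cepstral domain) disappears after normalization.

```python
from collections import deque


def sliding_cmn(frames, window=600):
    """Subtract from each frame the mean of the most recent `window` frames.

    Hypothetical helper for illustration; the window includes the current frame.
    """
    dim = len(frames[0]) if frames else 0
    history = deque()            # frames currently inside the window
    running_sum = [0.0] * dim    # per-dimension sum over the window
    out = []
    for frame in frames:
        history.append(frame)
        for d in range(dim):
            running_sum[d] += frame[d]
        if len(history) > window:
            old = history.popleft()
            for d in range(dim):
                running_sum[d] -= old[d]
        n = len(history)
        out.append([frame[d] - running_sum[d] / n for d in range(dim)])
    return out


# Toy 2-dimensional "features" (made-up values) and the same features
# shifted by a constant offset in every frame.
clean = [[1.0, -2.0], [3.0, 0.0], [2.0, 4.0], [0.0, 2.0]]
offset = [5.0, -1.5]
shifted = [[x + o for x, o in zip(f, offset)] for f in clean]

norm_clean = sliding_cmn(clean, window=3)
norm_shifted = sliding_cmn(shifted, window=3)
```

After normalization the two sequences agree up to floating-point rounding, which is why a model trained on normalized features is insensitive to such offsets, and also why a model trained without normalization sees mismatched inputs if it is added only at test time.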


Because there is i-vector adaptation going on, the idea is for the i-vector to learn any offset of the features, so you don't have to
apply that normalization to the features. Also the test condition needs to be matched to training, so to change this you'd have to
change it in training too (and the training-time feature extraction is done at the script level).
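The idea of letting the network see the offset instead of removing it can be sketched as follows. This is only an illustration: a real i-vector comes from a trained extractor, and the per-utterance mean used here is a hypothetical stand-in that merely carries the same kind of offset information. The function names are assumptions, not Kaldi APIs.

```python
def mean_vector(frames):
    """Per-dimension mean over all frames (stand-in for an i-vector)."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def append_speaker_vector(frames, speaker_vector):
    """Append the same fixed vector to every frame of the utterance."""
    return [list(frame) + list(speaker_vector) for frame in frames]


# Toy 2-dimensional features (made-up values).
frames = [[1.0, 2.0], [3.0, 6.0]]
speaker_vector = mean_vector(frames)
augmented = append_speaker_vector(frames, speaker_vector)
```

The raw features are left unnormalized; the network receives the extra dimensions as input and can learn to compensate for the offset itself. The same augmentation has to be applied in training and at test time.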


In our case, the environment (channel) differs between the training and test speech, so we got much better LVCSR performance with CMVN than without it.

Do you mean that we do not need to apply CMVN in DNN decoding if the features include an i-vector? Does that mean LVCSR performance is comparable between using an additional i-vector and using CMVN for feature extraction?



The i-vector method typically works well, but it doesn't always work well if there is a very big train/test mismatch. We found, for
instance, that our models aren't always robust to differences in volume because training data tends to be carefully volume normalized.
In future we'll do volume perturbation during training.
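Volume perturbation of the kind mentioned above can be sketched as scaling each training waveform by a random gain. This is a minimal sketch under stated assumptions: the function name `perturb_volume`, the gain range, and the sample values are hypothetical and not taken from any Kaldi recipe.

```python
import random


def perturb_volume(samples, rng, low=0.5, high=2.0):
    """Scale all samples of one utterance by a single random gain.

    `low` and `high` are illustrative bounds, not recommended values.
    Returns the scaled samples and the gain that was drawn.
    """
    gain = rng.uniform(low, high)
    return [s * gain for s in samples], gain


rng = random.Random(0)                  # fixed seed for reproducibility
samples = [0.0, 0.25, -0.5, 0.125]      # made-up waveform samples
scaled, gain = perturb_volume(samples, rng)
```

Drawing one gain per utterance (rather than per sample) changes the loudness without distorting the signal, so the model sees the same speech at many volume levels during training.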


Sorry, I want to ask another question. You use i-vectors in the DNN and CMVN in the GMM. Why not use i-vectors in the GMM instead of CMVN?

A GMM classifier is not very good at combining inputs of different types and classifying them. It cannot learn complex dependencies between i-vector values and features. It can only approximate a simple convex distribution well, and even that task is somewhat difficult because of GMM inefficiency. Deep neural networks are much better classifiers of complex functions: they can classify non-convex objects and even learn complex dependencies between features. That is why i-vectors can be used within the DNN framework to augment the features.

