锐英源软件
Making SingleUtteranceNnet2Decoder work with an nnet1 model


Hey,

I have a trained nnet1 model.
I am currently trying to use that model in the new online decoding implementation (online2 and online2bin).
Since nnet1 and nnet2 models are structurally different, simply passing the nnet1 model instead of an nnet2 one doesn't work.

Can you suggest how to do this?

The decoder (SingleUtteranceNnet2Decoder) uses an nnet2::DecodableNnet2Online decodable object, so I am thinking that using an nnet1-compatible decodable might solve this.
Another point I am wondering about is that the new online implementation uses OnlineNnet2FeaturePipeline. Going through its source code, it doesn't look like it would break if an nnet1 model were used instead of nnet2. But I'm not sure.


The way to do this is to convert the model.
There is an example script in the RM setup, somewhere in egs/rm/s5/local/online, that demonstrates this.

 

Thanks for the quick response.
I was able to successfully convert the nnet1 model to nnet2.

In my case (implementing online decoding), I don't want to use only the raw MFCC features for feature extraction. I want to do CMVN computation, splicing, and transformation as well.
Since the example program (online2-wav-nnet2-latgen-faster) uses only raw MFCCs, I am currently modifying online-nnet2-feature-pipeline to include these other feature extraction steps as well. For this, I am using the classes defined in online-feature.h and then using OnlineAppendFeature to append all the features.

So, I wanted to ask three questions:
1.) Am I on the right path?
2.) Is the usage of these other online feature extraction steps already demonstrated in any of the examples? (I tried to find one, but couldn't.)
3.) From a performance perspective, is there any specific reason to use only raw MFCCs in online2-wav-nnet2-latgen-faster? (Apart from iVectors, which I don't want to use in my case.)
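[Editor's note: the chaining described above can be sketched roughly as follows. The class names and constructor shapes are those of Kaldi's feat/online-feature.h; the option values, variable names, and glue code are illustrative only, and this will not compile outside a Kaldi build.]

```cpp
// Sketch: chaining online feature modules from feat/online-feature.h.
// Options shown are defaults/placeholders, not recommended values.
#include "feat/online-feature.h"

using namespace kaldi;

void BuildPipelineSketch(const Matrix<BaseFloat> &lda_transform) {
  MfccOptions mfcc_opts;                  // raw MFCC configuration
  OnlineMfcc mfcc(mfcc_opts);             // base feature extractor

  OnlineCmvnOptions cmvn_opts;
  OnlineCmvnState cmvn_state;             // may carry prior speaker stats
  OnlineCmvn cmvn(cmvn_opts, cmvn_state, &mfcc);

  OnlineSpliceOptions splice_opts;        // left/right frame context
  OnlineSpliceFrames splice(splice_opts, &cmvn);

  // Apply an LDA/MLLT-style transform supplied by the caller.
  OnlineTransform transform(lda_transform, &splice);

  // OnlineAppendFeature concatenates two feature streams frame by frame
  // (output dim = sum of the two input dims), e.g.:
  //   OnlineAppendFeature appended(&transform, &other_stream);

  // Audio is fed to the base extractor; frames are read from the
  // end of the chain:
  //   mfcc.AcceptWaveform(sample_rate, waveform);
  //   transform.GetFrame(t, &frame);
}
```

Each module wraps the previous one through the OnlineFeatureInterface pointer it takes in its constructor, which is what makes this kind of chaining possible.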



If you want to do CMVN and also want online decoding, then you would need to do online CMVN. But this would require you to train using online CMVN, because it is significantly different from per-side CMVN.

The decision to use raw MFCCs in online2-wav-nnet2-latgen-faster was made to enable online decoding without the hassles of online CMVN (however, it does use online CMVN for the Gaussian posterior computation in the iVector extractor).

If what you really want to do is online decoding, and to get good ASR results, your best bet is to use the online-nnet2 setup. I think you are getting into a much bigger project than you realize by trying to do it yourself.
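[Editor's note: for reference, the online-nnet2 flow recommended here looks roughly like the sketch below. The classes and methods are those declared in online2/online-nnet2-decoding.h; the loading of models and configs is elided, the variable names are illustrative, and this will not compile outside a Kaldi build.]

```cpp
// Sketch: single-utterance decoding with the online-nnet2 setup.
#include "online2/online-nnet2-decoding.h"
#include "online2/online-nnet2-feature-pipeline.h"

using namespace kaldi;

void DecodeUtteranceSketch(const OnlineNnet2FeaturePipelineInfo &feature_info,
                           const OnlineNnet2DecodingConfig &decode_config,
                           const TransitionModel &trans_model,
                           const nnet2::AmNnet &am_nnet,
                           const fst::Fst<fst::StdArc> &decode_fst,
                           const Vector<BaseFloat> &waveform,
                           BaseFloat sample_rate) {
  // Per-utterance feature pipeline (MFCC/PLP, optional iVectors).
  OnlineNnet2FeaturePipeline feature_pipeline(feature_info);

  SingleUtteranceNnet2Decoder decoder(decode_config, trans_model,
                                      am_nnet, decode_fst,
                                      &feature_pipeline);

  // A real application would feed audio chunk by chunk and call
  // AdvanceDecoding() after each chunk.
  feature_pipeline.AcceptWaveform(sample_rate, waveform);
  feature_pipeline.InputFinished();

  decoder.AdvanceDecoding();
  decoder.FinalizeDecoding();

  CompactLattice clat;
  bool end_of_utterance = true;
  decoder.GetLattice(end_of_utterance, &clat);
  // ... extract the best path / word sequence from clat ...
}
```

This is the structure that online2-wav-nnet2-latgen-faster wraps; once an nnet1 model has been converted to nnet2 format, it can be used here directly.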



Copyright (c) 2004-2021 锐英源软件