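Lately a huge FST graph has been giving me a headache: building it takes a very long time and never finishes. I found this thread while searching and it helped; the tools and the recomposition approach it describes are good, but applying them in a project requires a working environment, and without hands-on practice the ideas probably will not sink in. I would also urge everyone to build up the fundamentals; see http://kaldi-asr.org/doc/graph.html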
Thanks for sharing the code! I am wondering whether you could share how you build the HCL and G graphs. I guess the most important part is how and when you determinize; I found that it affects the decoded results a lot. I got on-the-fly composition working using your table compose; however, on my graph the RTF for on-the-fly composition of HCL and G is significantly worse than for the statically composed HCLG: 0.48 vs. 0.28. Here is how I prepare the HCL graph:
fstdeterminize ${lang}/L_disambig.fst | fstarcsort --sort_type=ilabel > ${dir}/det.L.fst

fstcomposecontext \
  --context-size=$N --central-position=$P \
  --read-disambig-syms=${lang}/phones/disambig.int \
  --write-disambig-syms=${lang}/disambig_ilabels_${N}_${P}.int \
  ${dir}/ilabels_${N}_${P} ${dir}/det.L.fst | fstarcsort > ${dir}/CL.fst

make-h-transducer \
  --disambig-syms-out=${dir}/h.disambig.int \
  --transition-scale=$tscale \
  ${dir}/ilabels_${N}_${P} \
  ${tree} \
  ${model} > ${dir}/Ha.fst

fstconvert \
  --fst_type=olabel_lookahead \
  --save_relabel_opairs=${dir}/cl.irelabel ${dir}/det.Ha.fst | \
  fstarcsort --sort_type=olabel > ${dir}/la.Ha.fst

fstrelabel --relabel_ipairs=${dir}/cl.irelabel ${dir}/CL.fst | \
  fstarcsort --sort_type=ilabel | \
  fstcompose ${dir}/la.Ha.fst - > ${dir}/det.HaCL.fst

fstrmsymbols ${dir}/h.disambig.int ${dir}/det.HaCL.fst | \
  fstrmepslocal | \
  add-self-loops --self-loop-scale=$loopscale --reorder=true ${model} - | \
  fstarcsort --sort_type=olabel > ${dir}/HCL.fst
I am also wondering whether you tried to use LabelLookAheadMatcher instead of ArcLookAheadMatcher at kaldi-decoders/bin/latgen-lazylm-faster-mapped.cc, line 41 in af1efd2:
typedef ArcLookAheadMatcher<SM> LA_SM;
I cannot make it work.
Thanks for your interest. If you are able to compose HCLG statically, it is going to be faster than dynamic composition, for sure. The latgen-lazylm-faster-mapped tool was created for cases where static composition is infeasible because of the huge graph it would produce (think of very high-order language models, such as 10-gram models). If you are able to create HCLG statically, you should.

The accuracy results should be very similar; at least, I always obtained the same results in my experiments when comparing the static HCLG with the on-the-fly composition of HCL and G that this tool performs.

You can check this script, which creates the HCL and G transducers for a handwritten text recognition experiment: https://github.com/jpuigcerver/Laia/blob/87ed2e7157879ede84012c5ba6fae12f8afb5f42/egs/iam/utils/build_word_fsts.sh

Your steps to create HCL seem correct. You follow some steps that I don't, but I guess that is because you have to deal with triphones and other particularities of ASR.

I could never make LabelLookAheadMatcher work either, but that's probably because of my then-limited knowledge of Kaldi. At some point I should try again to make it work, but it will take time.
Nice work, Joan. I should have talked to you a few weeks ago when I started implementing dynamic decoding for ASR; I ended up doing pretty much what you had already done :p

You may relabel the whole HCL at the end of the graph creation:
fstconvert --fst_type=olabel_lookahead --save_relabel_opairs=$graph_dir/g.irelabel $graph_dir/HCL.fst >$graph_dir/left.fst

Don't forget to relabel your G.fst before composing:
fstrelabel --relabel_ipairs=$graph_dir/g.irelabel $graph_dir/G.fst | fstarcsort >$graph_dir/right.fst

And by the way, OpenFst should be able to choose the best filters depending on the properties of the FST (see the end of this file):
// Specializes for StdArc to allow weight and label pushing.
template <>
class DefaultLookAhead<StdArc, MATCH_OUTPUT> {
 public:
  using M = LookAheadMatcher<Fst<StdArc>>;
  using SF = AltSequenceComposeFilter<M>;
  using LF = LookAheadComposeFilter<SF, M>;
  using WF = PushWeightsComposeFilter<LF, M>;
  using ComposeFilter = PushLabelsComposeFilter<WF, M>;
  using FstMatcher = M;
};

Regarding the RTF, I had some issues too. Profiling showed that the slowdown was mainly due to the weight pushing, but also to some extent to the on-demand composition, which cannot come for free.
Your response was very helpful. In the end, I managed to get LabelLookAheadMatcher to work. My setup is mostly based on the code and examples in opendcd, e.g. https://github.com/opendcd/opendcd/blob/master/script/makegraphotf.sh. Here is how I build and prepare the HCL and G. In fact, my final solution is very similar to what was suggested by @tbluche.
#---------------
fstdeterminize ${lang}/L_disambig.fst | fstarcsort > ${dir}/det.L.fst
#---------------
fstcomposecontext \
  --context-size=$N --central-position=$P \
  --read-disambig-syms=${lang}/phones/disambig.int \
  --write-disambig-syms=${lang}/disambig_ilabels_${N}_${P}.int \
  ${dir}/ilabels_${N}_${P} ${dir}/det.L.fst | \
  fstarcsort > ${dir}/CL.fst
#---------------
make-h-transducer \
  --disambig-syms-out=${dir}/h.disambig.int \
  --transition-scale=$tscale \
  ${dir}/ilabels_${N}_${P} \
  ${tree} \
  ${model} > ${dir}/Ha.fst
cat ${dir}/Ha.fst > ${dir}/det.Ha.fst
#---------------
fstconvert \
  --fst_type=ilabel_lookahead \
  --save_relabel_ipairs=${dir}/h.orelabel ${dir}/CL.fst | \
  fstarcsort --sort_type=ilabel > ${dir}/la.CL.fst
fstrelabel --relabel_opairs=${dir}/h.orelabel ${dir}/det.Ha.fst | \
  fstarcsort --sort_type=olabel | \
  fstcompose - ${dir}/la.CL.fst > ${dir}/det.HaCL.fst
#---------------
fstdeterminize ${dir}/det.HaCL.fst | \
  fstrmsymbols ${dir}/h.disambig.int | \
  fstrmepslocal | \
  fstpushspecial | \
  fstminimizeencoded | \
  add-self-loops --self-loop-scale=$loopscale --reorder=true ${model} - | \
  fstarcsort --sort_type=olabel | \
  fstconvert --fst_type=const > ${dir}/HCL.fst
#-----------------------------
fstconvert --fst_type=olabel_lookahead --save_relabel_opairs=${dir}/g.irelabel ${dir}/HCL.fst > ${dir}/HCLr.fst
fstrelabel --relabel_ipairs=${dir}/g.irelabel ${lang}/G.fst | \
  fstarcsort | fstconvert --fst_type=const > ${dir}/Gr.fst
fstcompose ${dir}/HCLr.fst ${dir}/Gr.fst | \
  fstconvert --fst_type=const > ${dir}/HCLrGr.fst
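To make the relabeling step more concrete: converting HCL to the olabel_lookahead type renumbers its output labels and saves the mapping as pairs in g.irelabel, and fstrelabel --relabel_ipairs then applies the same mapping to the input labels of G so that both sides agree. The sketch below only emulates that mapping with awk on a text-format FST with integer labels. The function name relabel_ipairs, the toy pairs, and the toy arcs are made up for illustration; the real tools work on binary FSTs.

```shell
# Emulate what `fstrelabel --relabel_ipairs=PAIRS` does to a text-format FST:
# rewrite the input label (column 3) of each arc according to "old new" pairs.
# Labels that have no pair are left unchanged. Illustrative helper only.
relabel_ipairs() {
  awk -v pairs="$1" '
    BEGIN {
      # Load the "old new" pairs into a lookup table.
      while ((getline line < pairs) > 0) {
        split(line, f, " ")
        map[f[1]] = f[2]
      }
    }
    # Arc lines have at least 4 fields: src dst ilabel olabel [weight].
    NF >= 4 && ($3 in map) { $3 = map[$3] }
    { print }
  '
}

# Toy example: hypothetical pairs and a two-arc FST with one final state.
pairs_file=$(mktemp)
printf '3 1\n1 2\n2 3\n' > "$pairs_file"
printf '0 1 3 3\n1 2 0 1\n2\n' | relabel_ipairs "$pairs_file"
rm -f "$pairs_file"
```

Each label is looked up once, so a pair list such as 3→1, 1→2 does not chain: the arc with input label 3 ends up with label 1, not 2.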
The composed HCLG for the Kaldi decoder is created as follows:
ComposeFst<StdArc>* OTFComposeFst(
    const StdFst &ifst1, const StdFst &ifst2,
    const CacheOptions& cache_opts = CacheOptions()) {
  typedef LookAheadMatcher<StdFst> M;
  typedef AltSequenceComposeFilter<M> SF;
  typedef LookAheadComposeFilter<SF, M> LF;
  typedef PushWeightsComposeFilter<LF, M> WF;
  typedef PushLabelsComposeFilter<WF, M> ComposeFilter;
  typedef M FstMatcher;

  ComposeFstOptions<StdArc, FstMatcher, ComposeFilter> opts(cache_opts);
  return new ComposeFst<StdArc>(ifst1, ifst2, opts);
}
My observation is that when I want the same WER, I must reduce pruning for the OTF-composed HCL and G, which results in about a 20% increase in RTF. If I fix the RTF instead, my WER is about 20% worse in relative terms for the OTF-composed HCL and G. So there is some cost to OTF composition, though it is not that bad. It is usable.
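For comparison with the figures quoted at the start of the thread (RTF 0.28 for the static HCLG vs. 0.48 for the first on-the-fly attempt), the relative slowdown can be computed as below. This is only a small arithmetic helper; the function name rel_change is made up for illustration.

```shell
# Relative change, in percent, from a reference value to a new value.
rel_change() {
  awk -v ref="$1" -v val="$2" 'BEGIN { printf "%.1f\n", 100 * (val - ref) / ref }'
}

# Static HCLG RTF (0.28) vs. first on-the-fly RTF (0.48) from the thread.
rel_change 0.28 0.48
```

That first setup was roughly 71% slower than static decoding, so a cost of about 20% is a substantial improvement.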
Please note that my preparation of HCL and G is a bit different from the one in https://github.com/opendcd/opendcd/blob/master/script/makegraphotf.sh. For example, I could not determinize Ha.fst, as it appeared to be non-functional. Also, the determinization of L is important; otherwise the final HCL graph will not be "small enough" and the OTF composition will not be that efficient.
I have built HaCL.fst, but the command "fstdeterminize --use-log=true HaCL.fst det.HaCL.fst" fails because of loops in the FST, such as self-loops. How can I solve this?
I too get stuck, with fstdeterminize ${dir}/det.HaCL.fst not terminating.

Make sure that you have replaced the epsilons on the input side of your LM FST (G.fst) with a disambiguation symbol (i.e. #0), and that you have produced a deterministic lexicon, also by introducing the appropriate disambiguation symbols. Please refer to Kaldi's documentation to understand how the WFST construction works: http://kaldi-asr.org/doc/graph.html
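As a rough illustration of the epsilon replacement mentioned above, the sketch below rewrites `<eps>` input labels to `#0` in a text-format FST before it is compiled. It assumes the Kaldi naming convention for both symbols and that `#0` is already present in the word symbol table; the function name eps2disambig and the toy G arcs are made up for this example.

```shell
# Replace <eps> on the input side of a text-format FST with the #0 symbol.
# Arc lines look like: src dst ilabel olabel [weight]. Final-state lines have
# only one or two fields and pass through unchanged. Illustrative helper only.
eps2disambig() {
  awk '
    NF >= 4 && $3 == "<eps>" { $3 = "#0" }
    { print }
  '
}

# Toy G with one backoff arc that has an epsilon input label.
printf '%s\n' \
  '0 1 hello hello 0.5' \
  '1 0 <eps> <eps> 1.2' \
  '0 2 world world 0.7' \
  '2 0.0' |
  eps2disambig
```

Only the input label is changed; the output side of the backoff arc stays `<eps>`. Modified lines are re-emitted with single spaces between fields.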
Unlike NFAs, which can always be determinized, not all WFSTs have an equivalent deterministic WFST. This happens when the WFST does not satisfy the twins property needed for determinization. You could all refer to "An Approximate Determinization Algorithm for Weighted Finite-State Automata".
Thanks for the info. I'm quite happy to have been able to get it to work by skipping the step that determinizes HaCL.fst. Now I'm just curious how @jurcicek was apparently able to determinize HaCL.fst in Kaldi, but the nondeterminism isn't a showstopper for me. By the way, thanks for the interesting paper link.
It is true that, in general, not all WFSTs admit determinization. However, if you are using traditional models for G (n-grams) and L, and you have introduced the appropriate disambiguation symbols, the resulting WFST does admit determinization (please refer to Kaldi's documentation).
I am using a simple non-n-gram G, but that shouldn't be able to affect the determinization of HaCL.fst (there's no G there yet!). I think I'm using a traditional L and have introduced the appropriate disambiguation symbols (I'm following the recipe from @jurcicek). However, I've even tried a toy L (below), and its HaCL.fst also failed to determinize.
!SIL SIL
<UNK> SPN
ache EY K
Not a big deal, just curious.
When I run the command "fstconvert --fst_type=olabel_lookahead --save_relabel_opairs=${dir}/g.irelabel ${dir}/HCL.fst > ${dir}/HCLr.fst", the error "FATAL: IntervalReachVisitor: cyclic input" occurs. How can I solve it?