I'm trying to build a model for the Czech language on our training data, but I recently hit a wall with an error in mkgraph. Everything up to that point seems to work without any real problem.
Here's the log:
fstminimizeencoded
fstdeterminizestar --use-log=true
fsttablecompose data/lang/dict100k_test/L_disambig.fst data/lang/dict100k_test/G.fst
fstisstochastic data/lang/dict100k_test/tmp/LG.fst
0.665573 -1.268
[info]: LG not stochastic.
fstcomposecontext --context-size=1 --central-position=0 --read-disambig-syms=data/lang/dict100k_test/phones/disambig.int --write-disambig-syms=data/lang/dict100k_test/tmp/disambig_ilabels_1_0.int data/lang/dict100k_test/tmp/ilabels_1_0
fstisstochastic data/lang/dict100k_test/tmp/CLG_1_0.fst
0.665573 -1.268
[info]: CLG not stochastic.
make-h-transducer --disambig-syms-out=/media/cz1shark1/056ADF990F8A4A1F/kaldi/exp/mono/trainFull/graph_bg_100k/disambig_tid.int --transition-scale=1.0 data/lang/dict100k_test/tmp/ilabels_1_0 /media/cz1shark1/056ADF990F8A4A1F/kaldi/exp/mono/trainFull/tree /media/cz1shark1/056ADF990F8A4A1F/kaldi/exp/mono/trainFull/final.mdl
fstminimizeencoded
fstrmsymbols /media/cz1shark1/056ADF990F8A4A1F/kaldi/exp/mono/trainFull/graph_bg_100k/disambig_tid.int
fstrmepslocal
fsttablecompose /media/cz1shark1/056ADF990F8A4A1F/kaldi/exp/mono/trainFull/graph_bg_100k/Ha.fst data/lang/dict100k_test/tmp/CLG_1_0.fst
fstdeterminizestar --use-log=true
ERROR: FstHeader::Read: Bad FST header: -
ERROR (fstdeterminizestar:ReadFstKaldi():fstext/fstext-utils-inl.h:1184) Reading FST: error reading FST header from standard input
ERROR (fstdeterminizestar:ReadFstKaldi():fstext/fstext-utils-inl.h:1184) Reading FST: error reading FST header from standard input
[stack trace: ]
kaldi::KaldiGetStackTrace()
kaldi::KaldiErrorMessage::~KaldiErrorMessage()
fst::ReadFstKaldi(std::string)
fstdeterminizestar(main+0x37c) [0x4a4d05]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fc69c8caec5]
fstdeterminizestar() [0x4a48a9]
ERROR: FstHeader::Read: Bad FST header: -
ERROR (fstrmsymbols:ReadFstKaldi():fstext/fstext-utils-inl.h:1184) Reading FST: error reading FST header from standard input
ERROR (fstrmsymbols:ReadFstKaldi():fstext/fstext-utils-inl.h:1184) Reading FST: error reading FST header from standard input
[stack trace: ]
kaldi::KaldiGetStackTrace()
kaldi::KaldiErrorMessage::~KaldiErrorMessage()
fst::ReadFstKaldi(std::string)
fstrmsymbols(main+0x1f8) [0x443c7e]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f0bdb166ec5]
fstrmsymbols() [0x4439b9]
ERROR: FstHeader::Read: Bad FST header: -
ERROR (fstrmepslocal:ReadFstKaldi():fstext/fstext-utils-inl.h:1184) Reading FST: error reading FST header from standard input
ERROR (fstrmepslocal:ReadFstKaldi():fstext/fstext-utils-inl.h:1184) Reading FST: error reading FST header from standard input
[stack trace: ]
kaldi::KaldiGetStackTrace()
kaldi::KaldiErrorMessage::~KaldiErrorMessage()
fst::ReadFstKaldi(std::string)
fstrmepslocal(main+0x27b) [0x4532f1]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f0a3afafec5]
fstrmepslocal() [0x452fa9]
ERROR: FstHeader::Read: Bad FST header: -
ERROR (fstminimizeencoded:ReadFstKaldi():fstext/fstext-utils-inl.h:1184) Reading FST: error reading FST header from standard input
ERROR (fstminimizeencoded:ReadFstKaldi():fstext/fstext-utils-inl.h:1184) Reading FST: error reading FST header from standard input
[stack trace: ]
kaldi::KaldiGetStackTrace()
kaldi::KaldiErrorMessage::~KaldiErrorMessage()
fst::ReadFstKaldi(std::string)
fstminimizeencoded(main+0x1bf) [0x472465]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f09f94d9ec5]
fstminimizeencoded() [0x4721d9]
This kind of thing happens when you run out of memory, so that might be the cause (although it would usually print "std::bad_alloc" to the screen).
Run utils/validate_lang.sh on the lang directory to make sure there are no other obvious problems. You could monitor 'top' while running mkgraph to see if any processes are getting large.
It might help to modify the script so that temporary files are used instead of pipes, or to use a smaller G.fst.
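As a rough alternative to watching 'top' by hand, a small wrapper can poll the memory of one command while it runs. This is only a sketch under assumptions: watch_peak_rss is a hypothetical helper name, it relies on a ps that accepts "-o rss= -p PID", and it measures only the process it launches, so it is meant for a single tool invocation rather than for the whole mkgraph script.

```shell
#!/usr/bin/env bash
# watch_peak_rss CMD [ARGS...]   (hypothetical helper, not part of Kaldi)
# Runs CMD in the background, samples its resident set size (RSS) a few
# times per second, and reports the largest value seen on stderr.
watch_peak_rss() {
  "$@" &
  local pid=$! peak=0 rss status=0
  while kill -0 "$pid" 2>/dev/null; do
    # "rss=" suppresses the column header; the value is in kilobytes.
    rss=$(ps -o rss= -p "$pid" 2>/dev/null | tr -d ' ')
    if [ -n "$rss" ] && [ "$rss" -gt "$peak" ]; then
      peak=$rss
    fi
    sleep 0.2
  done
  wait "$pid" || status=$?
  echo "peak RSS: ${peak} KB (exit status ${status})" >&2
  return "$status"
}

# Example (placeholder paths, adjust to your own setup):
#   watch_peak_rss fsttablecompose Ha.fst CLG_1_0.fst HaCLG.fst
```

A peak close to the machine's physical memory, followed by a non-zero exit status, would be consistent with the out-of-memory explanation above. Sampling can miss short spikes, so treat the number as a lower bound.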
Note: I could not find validate_lang.sh on my machine, but there is a validate_lang.pl; if Perl is installed, it can be run directly.
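The suggestion to use temporary files instead of pipes could look roughly like the sketch below. With pipes, one stage dying makes every later stage print its own "Bad FST header" error, as in the log; running the stages one at a time makes the first failure visible. The helper names run_stage and build_hclga are hypothetical, and the stage order is my reading of how mkgraph chains these five tools, so check it against your copy of the script. This does not lower the peak memory of any single stage.

```shell
#!/usr/bin/env bash
# run_stage NAME OUTFILE CMD [ARGS...]   (hypothetical helper)
# Runs CMD with stdout redirected to OUTFILE and fails if CMD exits
# non-zero or leaves OUTFILE empty.
run_stage() {
  local name=$1 out=$2
  shift 2
  if ! "$@" > "$out"; then
    echo "stage '$name' failed; see its messages above" >&2
    return 1
  fi
  if [ ! -s "$out" ]; then
    echo "stage '$name' exited cleanly but wrote nothing to $out" >&2
    return 1
  fi
  echo "stage '$name' ok: $out" >&2
}

# build_hclga HA_FST CLG_FST DISAMBIG_TID_INT OUT_FST   (hypothetical helper)
# Uses the same five tools that appear in the log, but through temporary
# files; the chain stops at the first stage that fails.
build_hclga() {
  local ha=$1 clg=$2 disambig=$3 out=$4 tmp
  tmp=$(mktemp -d) || return 1
  run_stage compose     "$tmp/1.fst" fsttablecompose "$ha" "$clg" &&
  run_stage determinize "$tmp/2.fst" fstdeterminizestar --use-log=true "$tmp/1.fst" &&
  run_stage rmsymbols   "$tmp/3.fst" fstrmsymbols "$disambig" "$tmp/2.fst" &&
  run_stage rmepslocal  "$tmp/4.fst" fstrmepslocal "$tmp/3.fst" &&
  run_stage minimize    "$out"       fstminimizeencoded "$tmp/4.fst"
  local status=$?
  rm -rf "$tmp"
  return "$status"
}

# Example (placeholder paths, adjust to your own setup):
#   build_hclga graph/Ha.fst lang/tmp/CLG_1_0.fst graph/disambig_tid.int graph/HCLGa.fst
```

The intermediate files need disk space roughly comparable to the FSTs themselves, and they are deleted when the function returns.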
I was indeed running out of memory. I tried using temporary files, but it did not help. I ended up pruning the LM.
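The post does not say which tool or threshold was used for pruning. One common way, shown here purely as an assumption, is SRILM's ngram with its -prune option on an ARPA-format LM. The helper names prune_arpa and arpa_counts are hypothetical, and the threshold in the example is only illustrative.

```shell
#!/usr/bin/env bash
# arpa_counts FILE   (hypothetical helper)
# Prints one "order=count" line per n-gram order from the ARPA header.
arpa_counts() {
  awk '$0 == "\\data\\"                       { in_header = 1; next }
       in_header && $1 == "ngram"             { print $2; next }
       in_header && substr($0, 1, 1) == "\\"  { exit }' "$1"
}

# prune_arpa SRC DST THRESHOLD [ORDER]   (hypothetical helper)
# Assumes SRILM's "ngram" binary is on PATH. ORDER defaults to 3, which is
# also ngram's own default; pass it explicitly for higher-order models.
prune_arpa() {
  local src=$1 dst=$2 threshold=$3 order=${4:-3}
  ngram -order "$order" -lm "$src" -prune "$threshold" -write-lm "$dst" || return 1
  echo "before: $(arpa_counts "$src" | tr '\n' ' ')" >&2
  echo "after:  $(arpa_counts "$dst" | tr '\n' ' ')" >&2
}

# Example (illustrative threshold and file names):
#   prune_arpa lm.arpa lm_pruned.arpa 1e-7
```

The pruned ARPA file then has to be converted into a new G.fst before rerunning mkgraph. Comparing the n-gram counts before and after gives a quick idea of how much smaller the resulting graph is likely to be.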