Hi! I am trying to create a DNN model following the run_dnn.sh script. I have created the fMLLR features. After that I ran the command:
steps/nnet/pretrain_dbn.sh --rbm-iter 3 $data_fmllr/train $dir
The training set contains 73 hours of Voxforge data. I'm using Mint 17 in VirtualBox, and I think that's the main reason the script has trained only 3 layers after a week of computing. I had no problems of that kind when creating models from the Voxforge recipe.
I tried to change the parameters in the script to use the CPU instead of the GPU:
nnet-forward --use-gpu=no
But it is still too slow.
How can I speed up the computation? And where can I find the mistake?
If you don't have GPUs, there is not much point trying this; it will be too slow. The nnet2 setup is faster if you have a lot of CPUs, as it supports multi-threaded and multi-machine training, but it's still best if you have GPUs.
There are some tricks for getting the GPU (if you have one) into the virtual machine. It wasn't straightforward (last time I checked), but it was possible. Perhaps someone on the list running a GPU in a virtual machine could give some advice?
Now I'm trying to run run_5d.sh from nnet2. To use the CPU, I changed the parameters in cmd.sh (train_cmd=run.pl, decode_cmd=run.pl). Everything is OK, but my virtual machine crashes when njobs is more than 4 at the get-egs stage (heavy load on the disk). I tried to rerun train_pnorm_fast with the parameters (train_stage=-2 and njobs=8), when all the previous steps had been performed with 4 jobs, but got an error: run.pl: 4/8 failed, and the log file says "Error constructing table reader". So, can I use more jobs for the train stage?
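For reference, the cmd.sh change mentioned above amounts to something like this (a minimal sketch; run.pl executes every job on the local machine, so parallelism is limited only by local CPU and disk):

# cmd.sh -- run everything locally instead of through a grid engine
export train_cmd=run.pl
export decode_cmd=run.pl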
That script is kind of out of date; train_pnorm_simple2.sh will spend less time dumping egs. But make sure your Kaldi is up to date. By default get_egs.sh passes the option "-tc 5" while dumping the egs, so no more than 5 of those jobs can run at a time, but that option has only recently been supported by run.pl (previously it was ignored).
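If the disk in the VM is the bottleneck, that limit can be tightened further. The sketch below is only an illustration: the --io-opts option name is an assumption based on the io_opts variable in recent versions of steps/nnet2/get_egs.sh and train_pnorm_fast.sh, and the directory names are placeholders, so check your copies of the scripts before relying on it.

# hypothetical invocation: allow at most 3 concurrent I/O-heavy egs-dumping jobs
steps/nnet2/train_pnorm_fast.sh --io-opts "-tc 3" --stage -2 ... data/train data/lang exp/tri_ali exp/nnet5d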
After the svn update I have another problem.
The command "make depend -j 8" succeeds, but at the make stage I got errors:
/home/pittman/kaldi-trunk/src/gmmbin/gmm-align.cc:136: undefined reference to `kaldi::AlignUtteranceWrapper(kaldi::AlignConfig const&, std::string const&, float, fst::VectorFst<fst::ArcTpl<fst::TropicalWeightTpl<float> > >, kaldi::DecodableInterface, kaldi::TableWriter<kaldi::BasicVectorHolder<int> >, kaldi::TableWriter<kaldi::BasicHolder<float> >, int, int, int, double, long*)'
collect2: error: ld returned 1 exit status
sgmm-align-compiled.o: In function `main':
/home/pittman/kaldi-trunk/src/sgmmbin/sgmm-align-compiled.cc:164: undefined reference to `kaldi::AlignUtteranceWrapper(kaldi::AlignConfig const&, std::string const&, float, fst::VectorFst<fst::ArcTpl<fst::TropicalWeightTpl<float> > >, kaldi::DecodableInterface, kaldi::TableWriter<kaldi::BasicVectorHolder<int> >, kaldi::TableWriter<kaldi::BasicHolder<float> >, int, int, int, double, long*)'
collect2: error: ld returned 1 exit status
Is it all about the gcc version? (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1))
Did you do "make clean" after the svn update or after "make depend"?
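For reference, the order that usually avoids stale object files after an update is roughly this (a sketch; the -j values are just the ones used above):

svn update
make clean        # throw away objects built against the old headers/libraries
make depend -j 8
make -j 8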
Sorry, I jumped to conclusions. I did run the "make clean" command before make depend:
align-mapped.o: In function `main':
/home/pittman/kaldi-trunk/src/bin/align-mapped.cc:129: undefined reference to `kaldi::AlignUtteranceWrapper(kaldi::AlignConfig const&, std::string const&, float, fst::VectorFst<fst::ArcTpl<fst::TropicalWeightTpl<float> > >, kaldi::DecodableInterface, kaldi::TableWriter<kaldi::BasicVectorHolder<int> >, kaldi::TableWriter<kaldi::BasicHolder<float> >, int, int, int, double, long*)'
collect2: error: ld returned 1 exit status
That's odd. The symbol should be defined in decoder/kaldi-decoder.a; it comes from decoder-wrappers.cc in that directory. Try to do "make" in decoder/, then "make" in bin/, and make sure decoder/kaldi-decoder.a is on the linking line. I don't understand what's happening.
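In shell terms, that check would be roughly the following (a sketch, assuming it is run from kaldi-trunk/src):

cd decoder && make && cd ..    # rebuilds kaldi-decoder.a, which contains decoder-wrappers.o
cd bin && make && cd ..        # relinks align-mapped and the other binaries
# while bin/ relinks, the g++ command printed by make should list
# ../decoder/kaldi-decoder.a on the linking line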