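Kaldi is one of the most comprehensive open-source speech recognition platforms. Its recipes run smoothly on the well-curated public datasets they were written for, but custom data often triggers Kaldi errors, especially training errors and runtime errors.
Here is one such training error: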
steps/make_mfcc.sh: Succeeded creating MFCC features for train_si284_sp
steps/compute_cmvn_stats.sh data/train_si284_sp
Succeeded creating CMVN stats for train_si284_sp
local/nnet3/run_ivector_common.sh: fixing input data-dir to remove nonexistent features, in case some
.. speed-perturbed segments were too short.
fix_data_dir.sh: kept all 14997 utterances.
fix_data_dir.sh: old files are kept in data/train_si284_sp/.backup
local/nnet3/run_ivector_common.sh: alignments in exp/tri4b_ali_train_si284_sp appear to already exist. Please either remove them
... or use a later --stage option.
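Errors like this can be resolved just by reading the message: here, either remove the existing alignments in exp/tri4b_ali_train_si284_sp or rerun with a later --stage option. Other Kaldi training errors are harder, such as the following: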
steps/train_mono.sh --boost-silence 1.25 --nj 4 --cmd run.pl data/train_si84_2kshort data/lang_nosp exp/mono0a
steps/train_mono.sh: Initializing monophone system.
feat-to-dim 'ark,s,cs:apply-cmvn --utt2spk=ark:data/train_si84_2kshort/split4/1/utt2spk scp:data/train_si84_2kshort/split4/1/cmvn.scp scp:data/train_si84_2kshort/split4/1/feats.scp ark:- | add-deltas ark:- ark:- |' -
apply-cmvn --utt2spk=ark:data/train_si84_2kshort/split4/1/utt2spk scp:data/train_si84_2kshort/split4/1/cmvn.scp scp:data/train_si84_2kshort/split4/1/feats.scp ark:-
WARNING (apply-cmvn[5.5.839~8-0c6a]:ReadScriptFile():kaldi-table.cc:34) Error opening script file: data/train_si84_2kshort/split4/1/cmvn.scp
ERROR (apply-cmvn[5.5.839~8-0c6a]:RandomAccessTableReader():util/kaldi-table-inl.h:2512) Error opening RandomAccessTableReader object (rspecifier is: scp:data/train_si84_2kshort/split4/1/cmvn.scp)
[ Stack-Trace: ]
/home/server/kaldi-trunk/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0xb42) [0x7fe1cf528692]
apply-cmvn(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x21) [0x55faa22a7f4f]
apply-cmvn(kaldi::RandomAccessTableReader<kaldi::KaldiObjectHolder<kaldi::Matrix<double> > >::RandomAccessTableReader(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xd3) [0x55faa22b054d]
apply-cmvn(kaldi::RandomAccessTableReaderMapped<kaldi::KaldiObjectHolder<kaldi::Matrix<double> > >::RandomAccessTableReaderMapped(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x2c) [0x55faa22b1df2]
apply-cmvn(main+0x8a7) [0x55faa22a5ab1]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7fe1ce992bf7]
apply-cmvn(_start+0x2a) [0x55faa22a512a]
kaldi::KaldiFatalError
add-deltas ark:- ark:-
ERROR (feat-to-dim[5.5.839~8-0c6a]:main():feat-to-dim.cc:58) Could not read any features (empty archive?)
[ Stack-Trace: ]
/home/server/kaldi-trunk/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0xb42) [0x7fa2caba5692]
feat-to-dim(kaldi::MessageLogger::LogAndThrow::operator=(kaldi::MessageLogger const&)+0x21) [0x558d99e2b401]
feat-to-dim(main+0x2e9) [0x558d99e2a7f3]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7fa2ca00fbf7]
feat-to-dim(_start+0x2a) [0x558d99e2a42a]
kaldi::KaldiFatalError
error getting feature dimension
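An error like this is much harder to resolve. The immediate cause is a missing file (cmvn.scp), but if you do not know what kind of file it is, where it is supposed to come from, or what it is used for, you cannot easily fix it. Open-source projects generally validate only the main workflow and include little supplementary handling, which makes them unfriendly to beginners.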
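As a hedged illustration of how a missing-file failure like the one above might be caught before training starts, the following Python sketch checks a Kaldi data directory for the three files that the failing apply-cmvn command in the log reads: feats.scp, cmvn.scp, and utt2spk. The helper names `missing_files` and `check_data_dir` are hypothetical and not part of Kaldi; the split<nj>/<job> layout is inferred from the paths in the log.

```python
from pathlib import Path

# The three inputs named on the failing apply-cmvn command line in the log.
REQUIRED = ("feats.scp", "cmvn.scp", "utt2spk")


def missing_files(data_dir, required=REQUIRED):
    """Return the required files that are absent or empty in data_dir."""
    root = Path(data_dir)
    missing = []
    for name in required:
        path = root / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


def check_data_dir(data_dir, nj=0):
    """Check data_dir and, if nj > 0, each existing split<nj>/<job> subdir.

    Returns a dict mapping directory path -> list of missing files.
    Split subdirectories that do not exist yet are skipped, on the
    assumption that the training script will create them later.
    """
    root = Path(data_dir)
    dirs = [root]
    dirs += [root / f"split{nj}" / str(job) for job in range(1, nj + 1)]
    report = {}
    for d in dirs:
        if d != root and not d.is_dir():
            continue
        gone = missing_files(d)
        if gone:
            report[str(d)] = gone
    return report
```

For the failing run above, a call such as `check_data_dir("data/train_si84_2kshort", nj=4)` would be expected to report cmvn.scp as missing. Since the first log shows steps/compute_cmvn_stats.sh producing CMVN stats for a data directory, one plausible cause (an assumption, not confirmed by the log) is that this script was never run on the directory in question.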
Kaldi also includes recognition server code, and errors can occur at recognition time as well, for example:
ASSERTION_FAILED (extend-wav-with-silence[5.5.920~1-b22b8]:SubVector():matrix/kaldi-vector.h:512) Assertion failed: (static_cast(origin)+ static_cast (length) <= static_cast (t.Dim()))
[ Stack-Trace: ]
extend-wav-with-silence(kaldi::MessageLogger::LogMessage() const+0xb42) [0x55b14bbf1aca]
extend-wav-with-silence(kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)+0x6e) [0x55b14bbf27c6]
extend-wav-with-silence(kaldi::FindQuietestSegment(kaldi::Vector const&, float, kaldi::Vector *, float, float, float)+0x1a5) [0x55b14bb6b5af]
extend-wav-with-silence(kaldi::ExtendWaveWithSilence(kaldi::Vector const&, float, kaldi::Vector *, float, float, float)+0x3f) [0x55b14bb6b97a]
extend-wav-with-silence(main+0x542) [0x55b14bb6c2a5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7f1ee0ffeb97]
extend-wav-with-silence(_start+0x2a) [0x55b14bb6b32a]
Aborted (core dumped)
AMI_ES2011a_H00_FEE041_0008364_0008924 I FORGOT TO SAY I'M THE PROJECT MANAGER BUT I FIGURED JOHN KNEW THAT ALREADY
LOG (online2-wav-nnet3-latgen-faster[5.5.920~1-b22b8]:main():online2-wav-nnet3-latgen-faster.cc:296) Decoded utterance AMI_ES2011a_H00_FEE041_0008364_0008924
WARNING (online2-wav-nnet3-latgen-faster[5.5.920~1-b22b8]:main():online2-wav-nnet3-latgen-faster.cc:210) Did not find audio for utterance AMI_ES2011a_H00_FEE041_0009602_0009635
WARNING (online2-wav-nnet3-latgen-faster[5.5.920~1-b22b8]:main():online2-wav-nnet3-latgen-faster.cc:210) Did not find audio for utterance AMI_ES2011a_H00_FEE041_0009826_0010223
Assertion failures like this can only be resolved with a detailed understanding of the code.
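To make the assertion a little more concrete: the failed condition in the log says that a requested sub-vector (origin plus length) must fit inside the vector it is taken from, and the stack trace shows it was raised while FindQuietestSegment was slicing the waveform. One plausible trigger, which is an assumption and not something the log confirms, is an input recording shorter than the segment the tool tries to extract. The Python sketch below restates the bounds condition and flags very short WAV files before they reach the tool. The function names and the 0.5-second threshold are hypothetical, and the sketch only handles plain PCM WAV files on disk, not piped commands.

```python
import wave


def subvector_in_bounds(origin, length, dim):
    """Restate the failed assertion: origin + length must not exceed dim."""
    return origin + length <= dim


def wav_duration_seconds(path):
    """Duration of a PCM WAV file, computed from its header."""
    with wave.open(str(path), "rb") as w:
        return w.getnframes() / w.getframerate()


def find_short_wavs(paths, min_seconds=0.5):
    """Return the paths whose audio is shorter than min_seconds.

    min_seconds is an illustrative threshold, not a value taken from Kaldi.
    """
    return [p for p in paths if wav_duration_seconds(p) < min_seconds]
```

Files returned by `find_short_wavs` could then be inspected or excluded by hand; whether they are the real cause still has to be confirmed against the Kaldi source.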
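If developers meet the easy errors first, they can build up experience gradually and cope with what follows. If the very first error they hit is a hard one, they are left stunned and stuck. Errors of this kind take an expert with deep experience in C++ and in speech recognition data structures. 锐英源软件 has that experience: it has trained on more than a hundred datasets on high-end GPU servers, across tens of thousands of training runs, and has run into a great many strange problems along the way.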
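锐英源软件's one-stop speech recognition platform was selected for 郑州双创大会 (the Zhengzhou mass entrepreneurship and innovation conference). 锐英源软件 has extensive hands-on experience in artificial intelligence, speech recognition, machine learning, and deep learning, and welcomes cooperation of all kinds.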