锐英源软件
espeak parameters and callbacks


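Summary

The asker appears to be familiar with other audio code but not with how espeak works. espeak streams synthesized audio out and does not analyze it, whereas the question seems to be about analysis. The espeak callback is where the output buffer can be reached.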

I am trying to build some functionality with espeak, but I am missing some parameters (I don't know where to find them). I am working in Code::Blocks on Linux. The following code runs well and reads Arabic text:

#include <string.h>
#include <malloc.h>
#include </usr/local/include/espeak/speak_lib.h>

int main(int argc, char* argv[])
{
    char text[] = {"الله لطيف "};
    espeak_Initialize(AUDIO_OUTPUT_PLAYBACK, 0, NULL, 0);
    espeak_SetVoiceByName("ar");
    unsigned int size = 0;
    while (text[size] != '\0') size++;
    unsigned int flags = espeakCHARS_AUTO | espeakENDPAUSE;
    espeak_Synth(text, size + 1, 0, POS_CHARACTER, 0, flags, NULL, NULL);
    espeak_Synchronize();
    return 0;
}

Now, could you help us find these parameters in espeak?
1. A function that returns the generated wave so it can be stored in a variable
2. Frequency
3. Number of channels
4. Sample size
5. A buffer in which we store the samples


If you can't find a suitable example, you will have to read the documentation in the header file. I haven't used it, but it looks pretty comprehensible:
http://espeak.sourceforge.net/speak_lib.h
When you called espeak_Initialize you passed in AUDIO_OUTPUT_PLAYBACK. You will need to pass in AUDIO_OUTPUT_RETRIEVAL instead, and then it looks like you must call espeak_SetSynthCallback with a function of your own to accept the samples.
Your adapted code would look something like this (UNTESTED):

#include <string.h>
#include <vector>
#include </usr/local/include/espeak/speak_lib.h>

int samplerate; // determined by espeak, will be in Hertz (Hz)
const int buflength = 200; // passed to espeak, in milliseconds (ms)

std::vector<short> sounddata;

int SynthCallback(short *wav, int numsamples, espeak_EVENT *events) {
    if (wav == NULL)
        return 1; // NULL means done.

    /* process your samples here, let's just gather them */
    sounddata.insert(sounddata.end(), wav, wav + numsamples);
    return 0; // 0 continues synthesis, 1 aborts
}

int main(int argc, char* argv[]) {
    char text[] = {"الله لطيف "};
    samplerate = espeak_Initialize(AUDIO_OUTPUT_RETRIEVAL, buflength, NULL, 0);
    espeak_SetSynthCallback(&SynthCallback);
    espeak_SetVoiceByName("ar");
    unsigned int flags = espeakCHARS_AUTO | espeakENDPAUSE;
    size_t size = strlen(text);
    espeak_Synth(text, size + 1, 0, POS_CHARACTER, 0, flags, NULL, NULL);
    espeak_Synchronize();

    /* in theory sounddata holds your samples now... */
    return 0;
}
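
One way to check what actually ended up in sounddata is to wrap the samples in a WAV header and open the result in an audio player. The following is only a sketch under assumptions: it takes the answer's guesses at face value (16-bit signed samples, one channel, the rate returned by espeak_Initialize), and buildWav, putLE and putTag are hypothetical helper names, not part of espeak.

```cpp
#include <cstdint>
#include <vector>

// Append `value` to `out` as a little-endian integer of `bytes` bytes.
static void putLE(std::vector<unsigned char>& out, std::uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i)
        out.push_back(static_cast<unsigned char>((value >> (8 * i)) & 0xFF));
}

// Append a four-character chunk tag such as "RIFF".
static void putTag(std::vector<unsigned char>& out, const char* tag) {
    out.insert(out.end(), tag, tag + 4);
}

// Build a complete WAV image: 44-byte header followed by the sample data.
// Assumption: samples are 16-bit signed PCM, mono, at sampleRate Hz.
std::vector<unsigned char> buildWav(const std::vector<short>& samples, int sampleRate) {
    const std::uint32_t channels = 1;
    const std::uint32_t bitsPerSample = 16;
    const std::uint32_t blockAlign = channels * bitsPerSample / 8;
    const std::uint32_t rate = static_cast<std::uint32_t>(sampleRate);
    const std::uint32_t dataBytes =
        static_cast<std::uint32_t>(samples.size()) * blockAlign;

    std::vector<unsigned char> out;
    out.reserve(44 + dataBytes);
    putTag(out, "RIFF");
    putLE(out, 36 + dataBytes, 4);   // size of everything after this field
    putTag(out, "WAVE");
    putTag(out, "fmt ");
    putLE(out, 16, 4);               // fmt chunk size for plain PCM
    putLE(out, 1, 2);                // audio format 1 = PCM
    putLE(out, channels, 2);
    putLE(out, rate, 4);
    putLE(out, rate * blockAlign, 4); // bytes per second
    putLE(out, blockAlign, 2);
    putLE(out, bitsPerSample, 2);
    putTag(out, "data");
    putLE(out, dataBytes, 4);
    for (short s : samples)
        putLE(out, static_cast<std::uint16_t>(s), 2);
    return out;
}
```

With the variables from the code above, this would be called as buildWav(sounddata, samplerate) after espeak_Synchronize() returns, and the returned bytes written to a file opened in binary mode. If the file plays at the wrong speed or sounds like noise, one of the assumptions about rate, channel count or sample size is wrong.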

So for your questions:

  • Function which returns the generated wave to store it in a variable - You write a callback function, and that function is handed small pieces of the wave, each about buflength milliseconds long, to process. If you are going to accumulate the data into a larger buffer, I've shown how you could do that yourself.
  • Frequency - Through this API it doesn't look like you pick it; espeak does. It's in Hz and is returned as samplerate above.
  • Number of Channels - There's no mention of it, and voice synthesis is generally mono, one would think. (Vocals are mixed center by default in most stereo mixes, so you'd take the mono data you got back and play the same synthesized data on the left and right channels.)
  • Sample Size - You get shorts. Those are signed integers, 2 bytes, with a range of -32,768 to 32,767. It probably uses the entire range and doesn't seem to be configurable, but you could test and see what you get out.
  • A Buffer In Which We Store Samples - The synthesis buffer appears to belong to espeak, which handles allocating and freeing it. I've shown an example of using a std::vector to gather chunks from multiple calls.
  • Number of Samples - Each call to your SynthCallback will get a potentially different number of samples. You might get 0 for that number, and that does not necessarily mean synthesis has reached the end.
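
The channel and sample-count points above can be made concrete with two small helpers. This is a minimal sketch that assumes the data really is mono 16-bit, as the answer guesses; monoToStereo and durationSeconds are hypothetical names introduced here for illustration and are not espeak functions.

```cpp
#include <cstddef>
#include <vector>

// Turn a mono stream into interleaved stereo (L, R, L, R, ...) by copying
// each sample to both channels, as suggested for playback on stereo devices.
std::vector<short> monoToStereo(const std::vector<short>& mono) {
    std::vector<short> stereo;
    stereo.reserve(mono.size() * 2);
    for (short s : mono) {
        stereo.push_back(s); // left channel
        stereo.push_back(s); // right channel
    }
    return stereo;
}

// Length of a mono stream in seconds: number of samples divided by the
// sample rate in Hz. Returns 0 for a non-positive rate.
double durationSeconds(std::size_t numSamples, int sampleRate) {
    return sampleRate > 0 ? static_cast<double>(numSamples) / sampleRate : 0.0;
}
```

Applied to the answer's variables, durationSeconds(sounddata.size(), samplerate) gives a quick sanity check: a short sentence should come out at a second or two, not a fraction of a millisecond.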

  • I have added the following code: cout<<"Freqency :"<<samplerate <<endl; cout<<" Sample Size :"<<sounddata.size() <<endl; cout<<"Number of Samples :"<<(&SynthCallback)<<endl; but Number of Samples always returns 1, even when I try different sentences of different lengths.
  • Well, perhaps you should post a new question that shows the complete code you wrote using my guesses. The code you give here prints "Number of Samples" as a function pointer, so I assume you mean that "Sample Size" is coming back as 1. Without a complete program it's hard to tell whether you did everything, such as calling espeak_Synchronize. And if you haven't put a print message in the callback, you don't know whether it is ever being called, which is another issue.
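
The constant 1 has a plain C++ explanation: a function pointer streamed with << is converted to bool, and a non-null pointer prints as 1. The sketch below shows that, together with a call counter of the kind the reply suggests. It is written to compile without espeak, so the third parameter is a void* stand-in for espeak_EVENT*; callbackCalls and printedFunctionPointer are hypothetical names used only for this illustration.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

std::vector<short> sounddata;
std::size_t callbackCalls = 0; // how many times the callback has run

// Same shape as the answer's callback. Assumption: void* stands in for
// espeak_EVENT* so that this sketch builds without the espeak headers.
int SynthCallback(short* wav, int numsamples, void* /*events*/) {
    ++callbackCalls;
    if (wav == nullptr)
        return 1; // mirrors the answer: NULL means synthesis is done
    // numsamples may be 0 on some calls; insert() then adds nothing.
    sounddata.insert(sounddata.end(), wav, wav + numsamples);
    return 0;
}

// What `cout << (&SynthCallback)` produces: the function pointer converts
// to bool, and a non-null pointer is streamed as "1".
std::string printedFunctionPointer() {
    std::ostringstream os;
    os << (&SynthCallback);
    return os.str();
}
```

The total number of samples is sounddata.size(), read after espeak_Synchronize() returns. If callbackCalls is still 0 at that point, the callback was never registered or never invoked, which is a different problem from an empty buffer.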
Copyright (c) 2004-2021 锐英源软件