C# SpeechSynthesizer的SpeakProgressEventArgs是否不准确?
使用.Net 3.5中的System.Speech.Synthesis.SpeechSynthesizer类,SpeakProgressEventArgs的AudioPosition属性似乎不准确 以下代码生成以下输出: 代码: 但是,生成的.wav文件的持续时间为15.69秒。如果输出到流或null,则会发生相同的行为 属性的表示该属性是“一个TimeSpan对象,表示事件在音频输出流中的时间位置”C# SpeechSynthesizer的SpeakProgressEventArgs是否不准确?,c#,.net-3.5,text-to-speech,speechsynthesizer,C#,.net 3.5,Text To Speech,Speechsynthesizer,使用.Net 3.5中的System.Speech.Synthesis.SpeechSynthesizer类,SpeakProgressEventArgs的AudioPosition属性似乎不准确 以下代码生成以下输出: 代码: 但是,生成的.wav文件的持续时间为15.69秒。如果输出到流或null,则会发生相同的行为 属性的表示该属性是“一个TimeSpan对象,表示事件在音频输出流中的时间位置” 它应该是一个准确的时间,指示单词在输出文件中开始或结束说话的时间,还是我误解了它?音频位置取决
它应该是一个准确的时间,指示单词在输出文件中开始或结束说话的时间,还是我误解了它?音频位置取决于语音合成器的选定语音。对于一些Microsoft语音,如Anna、Zira、David、Hazel,正如我所经历的那样,支持的音频格式是16000Hz PCM。因此,以下解决方案可以纠正auido位置:
var format =
new System.Speech.AudioFormat.SpeechAudioFormatInfo(EncodingFormat.Pcm,
16000, 16, 1, 32000, 2, null);
synthesizer.SetOutputToWaveFile("Test.wav", format);
如果您注意到,
SetOutputToWaveFile
的默认采样率为22050,正确时间(15.69)与AudipPosition
(20.25)显示的时间的比率约为0.77。如果将该比率乘以22050,则得到大约16000,这是正确的采样率。所选语音是多少?
SpeakProgress: AudioPosition=00:00:00.0043750, CharacterPosition=0, CharacterCount=4, Text=This
SpeakProgress: AudioPosition=00:00:00.2925625, CharacterPosition=5, CharacterCount=7, Text=holiday
SpeakProgress: AudioPosition=00:00:00.9086250, CharacterPosition=13, CharacterCount=6, Text=season
SpeakProgress: AudioPosition=00:00:01.9421250, CharacterPosition=21, CharacterCount=7, Text=support
SpeakProgress: AudioPosition=00:00:02.5621250, CharacterPosition=29, CharacterCount=3, Text=the
SpeakProgress: AudioPosition=00:00:02.6760625, CharacterPosition=33, CharacterCount=5, Text=music
SpeakProgress: AudioPosition=00:00:03.2648125, CharacterPosition=39, CharacterCount=3, Text=you
SpeakProgress: AudioPosition=00:00:03.5199375, CharacterPosition=43, CharacterCount=4, Text=love
SpeakProgress: AudioPosition=00:00:03.8435625, CharacterPosition=48, CharacterCount=2, Text=by
SpeakProgress: AudioPosition=00:00:04.0701875, CharacterPosition=51, CharacterCount=8, Text=shopping
SpeakProgress: AudioPosition=00:00:04.6840625, CharacterPosition=60, CharacterCount=2, Text=at
SpeakProgress: AudioPosition=00:00:04.8036250, CharacterPosition=63, CharacterCount=4, Text=Made
SpeakProgress: AudioPosition=00:00:05.0698125, CharacterPosition=68, CharacterCount=2, Text=in
SpeakProgress: AudioPosition=00:00:05.2521250, CharacterPosition=71, CharacterCount=10, Text=Washington
SpeakProgress: AudioPosition=00:00:06.2961875, CharacterPosition=83, CharacterCount=6, Text=online
SpeakProgress: AudioPosition=00:00:07.0540625, CharacterPosition=90, CharacterCount=3, Text=and
SpeakProgress: AudioPosition=00:00:07.3331250, CharacterPosition=94, CharacterCount=2, Text=at
SpeakProgress: AudioPosition=00:00:07.6818750, CharacterPosition=97, CharacterCount=3, Text=one
SpeakProgress: AudioPosition=00:00:08.0598750, CharacterPosition=101, CharacterCount=2, Text=of
SpeakProgress: AudioPosition=00:00:08.2163750, CharacterPosition=104, CharacterCount=4, Text=five
SpeakProgress: AudioPosition=00:00:08.5971875, CharacterPosition=109, CharacterCount=5, Text=local
SpeakProgress: AudioPosition=00:00:09.0243750, CharacterPosition=115, CharacterCount=6, Text=stores
SpeakProgress: AudioPosition=00:00:10.5325625, CharacterPosition=123, CharacterCount=4, Text=Made
SpeakProgress: AudioPosition=00:00:10.7700625, CharacterPosition=128, CharacterCount=2, Text=in
SpeakProgress: AudioPosition=00:00:10.9377500, CharacterPosition=131, CharacterCount=10, Text=Washington
SpeakProgress: AudioPosition=00:00:11.6708125, CharacterPosition=142, CharacterCount=10, Text=chocolates
SpeakProgress: AudioPosition=00:00:12.9798750, CharacterPosition=154, CharacterCount=9, Text=bountiful
SpeakProgress: AudioPosition=00:00:13.6303125, CharacterPosition=164, CharacterCount=4, Text=gift
SpeakProgress: AudioPosition=00:00:14.0959375, CharacterPosition=169, CharacterCount=7, Text=baskets
SpeakProgress: AudioPosition=00:00:14.7848125, CharacterPosition=177, CharacterCount=3, Text=and
SpeakProgress: AudioPosition=00:00:15.0507500, CharacterPosition=181, CharacterCount=9, Text=ornaments
SpeakProgress: AudioPosition=00:00:15.7195000, CharacterPosition=191, CharacterCount=3, Text=are
SpeakProgress: AudioPosition=00:00:15.9872500, CharacterPosition=195, CharacterCount=3, Text=the
SpeakProgress: AudioPosition=00:00:16.1488750, CharacterPosition=199, CharacterCount=7, Text=perfect
SpeakProgress: AudioPosition=00:00:16.7275000, CharacterPosition=207, CharacterCount=7, Text=holiday
SpeakProgress: AudioPosition=00:00:17.3336875, CharacterPosition=215, CharacterCount=5, Text=gifts
SpeakProgress: AudioPosition=00:00:17.9813125, CharacterPosition=221, CharacterCount=3, Text=for
SpeakProgress: AudioPosition=00:00:18.2216875, CharacterPosition=225, CharacterCount=6, Text=family
SpeakProgress: AudioPosition=00:00:19.0973750, CharacterPosition=233, CharacterCount=7, Text=friends
SpeakProgress: AudioPosition=00:00:19.7726250, CharacterPosition=241, CharacterCount=3, Text=and
SpeakProgress: AudioPosition=00:00:19.9655625, CharacterPosition=245, CharacterCount=10, Text=co-workers
SpeakProgress: AudioPosition=00:00:20.2518750, CharacterPosition=245, CharacterCount=10, Text=co-workers
var format =
new System.Speech.AudioFormat.SpeechAudioFormatInfo(EncodingFormat.Pcm,
16000, 16, 1, 32000, 2, null);
synthesizer.SetOutputToWaveFile("Test.wav", format);