Popping sound at the start and end of playback when using Microsoft Azure Text to Speech in Unity


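I am using Microsoft Azure Text to Speech in Unity, but there is a popping sound at the beginning and end of playback. Is this normal, or is result.AudioData corrupted? Here is the code: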

    public AudioSource audioSource;
    void Start()
    {
        SynthesisToSpeaker("你好世界");
    }
    public void SynthesisToSpeaker(string text)
    {
        var config = SpeechConfig.FromSubscription("[redacted]", "southeastasia");
        config.SpeechSynthesisLanguage = "zh-CN";
        config.SpeechSynthesisVoiceName = "zh-CN-XiaoxiaoNeural";

        // Creates a speech synthesizer.
        // Make sure to dispose the synthesizer after use!       
        SpeechSynthesizer synthesizer = new SpeechSynthesizer(config, null);
        Task<SpeechSynthesisResult> task = synthesizer.SpeakTextAsync(text);
        StartCoroutine(CheckSynthesizer(task, config, synthesizer));
    }
    private IEnumerator CheckSynthesizer(Task<SpeechSynthesisResult> task,
        SpeechConfig config,
        SpeechSynthesizer synthesizer)
    {
        yield return new WaitUntil(() => task.IsCompleted);
        var result = task.Result;
        // Checks result.
        string newMessage = string.Empty;
        if (result.Reason == ResultReason.SynthesizingAudioCompleted)
        {
            var sampleCount = result.AudioData.Length / 2;
            var audioData = new float[sampleCount];
            for (var i = 0; i < sampleCount; ++i)
            {
                audioData[i] = (short)(result.AudioData[i * 2 + 1] << 8
                        | result.AudioData[i * 2]) / 32768.0F;
            }
            // The default output audio format is 16K 16bit mono
            var audioClip = AudioClip.Create("SynthesizedAudio", sampleCount,
                    1, 16000, false);
            audioClip.SetData(audioData, 0);
            audioSource.clip = audioClip;
            audioSource.Play();

        }
        else if (result.Reason == ResultReason.Canceled)
        {
            var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
        }
        synthesizer.Dispose();
    }

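The default output audio format is Riff16Khz16BitMonoPcm, which puts a RIFF header at the beginning of result.AudioData. If you pass that data to the AudioClip as-is, the header bytes are played as if they were samples, and that is the noise you hear.

You can switch to a raw format with no header:

    speechConfig.SetSpeechSynthesisOutputFormat(SpeechSynthesisOutputFormat.Raw16Khz16BitMonoPcm);

For details, see …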
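If changing the output format is not an option, another approach is to skip the header before converting the bytes to floats. The following is only a minimal sketch, not part of the original answer: `PcmUtil`, `FindPcmStart` and `ToFloatSamples` are hypothetical names, and it assumes 16-bit little-endian mono PCM, as in the question. It walks the RIFF chunks to locate the `data` chunk instead of assuming a fixed header size.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of the Speech SDK or Unity).
public static class PcmUtil
{
    // Returns the byte offset of the first PCM sample.
    // Returns 0 when the buffer has no RIFF/WAVE header, i.e. it is already raw.
    public static int FindPcmStart(byte[] audio)
    {
        if (audio.Length < 12
            || Encoding.ASCII.GetString(audio, 0, 4) != "RIFF"
            || Encoding.ASCII.GetString(audio, 8, 4) != "WAVE")
        {
            return 0;
        }

        int pos = 12; // first chunk follows "RIFF" + size + "WAVE"
        while (pos + 8 <= audio.Length)
        {
            string id = Encoding.ASCII.GetString(audio, pos, 4);
            if (id == "data")
            {
                return pos + 8; // samples start right after the chunk header
            }

            // Chunk size is a little-endian 32-bit integer.
            int size = audio[pos + 4] | audio[pos + 5] << 8
                     | audio[pos + 6] << 16 | audio[pos + 7] << 24;
            if (size < 0 || size > audio.Length - pos - 8)
            {
                break; // malformed chunk size
            }
            pos += 8 + size + (size & 1); // chunks are padded to an even length
        }
        throw new FormatException("RIFF header found, but no data chunk.");
    }

    // Converts 16-bit little-endian PCM to floats in [-1, 1), skipping any RIFF header.
    public static float[] ToFloatSamples(byte[] audio)
    {
        int start = FindPcmStart(audio);
        int sampleCount = (audio.Length - start) / 2;
        var samples = new float[sampleCount];
        for (int i = 0; i < sampleCount; ++i)
        {
            int lo = audio[start + i * 2];
            int hi = audio[start + i * 2 + 1];
            samples[i] = unchecked((short)(hi << 8 | lo)) / 32768.0f;
        }
        return samples;
    }
}
```

In the coroutine from the question, the conversion loop would then become `var audioData = PcmUtil.ToFloatSamples(result.AudioData);`, with `audioData.Length` passed to `AudioClip.Create` as the sample count, so that the header bytes are not counted as samples.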