Asynchronous issues with the Google Cloud Speech API (Google Cloud Platform)
I'm trying to get final speech transcription/recognition results from a streaming WebSocket audio feed. The OnOpen method runs code when the WebSocket connection is first established, and the OnBinary method runs code when binary data is received from the client. I tested the WebSocket by echoing the speech back through it, writing the same binary data back at the same rate. That test worked, so I know the binary data is being sent correctly (640-byte messages with a 20 ms frame size).

It follows that my code is failing, not the service. My goal is to do the following:
1. When the WebSocket connection is created, send the initial audio config request with singleUtterance == true
2. When binary audio data is received, write it to the streaming request
3. When a result with isFinal == true is received, print the transcription
4. When isFinal == true, stop the current streaming request and create a new request, repeating steps 1 through 4

Here is my code:

socket.OnOpen = () =>
{
    firstMessage = true;
};
socket.OnBinary = async binary =>
{
    var speech = SpeechClient.Create();
    var streamingCall = speech.StreamingRecognize();
    if (firstMessage == true)
    {
        await streamingCall.WriteAsync(
            new StreamingRecognizeRequest()
            {
                StreamingConfig = new StreamingRecognitionConfig()
                {
                    Config = new RecognitionConfig()
                    {
                        Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
                        SampleRateHertz = 16000,
                        LanguageCode = "en",
                    },
                    SingleUtterance = true,
                }
            });
        Task getUtterance = Task.Run(async () =>
        {
            while (await streamingCall.ResponseStream.MoveNext(
                default(CancellationToken)))
            {
                foreach (var result in streamingCall.ResponseStream.Current.Results)
                {
                    if (result.IsFinal == true)
                    {
                        Console.WriteLine("This test finally worked");
                    }
                }
            }
        });
        firstMessage = false;
    }
    else if (firstMessage == false)
    {
        streamingCall.WriteAsync(new StreamingRecognizeRequest()
        {
            AudioContent = Google.Protobuf.ByteString.CopyFrom(binary, 0, 640)
        }).Wait();
    }
};
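In outline, the flow I'm after looks like this (a sketch with a stubbed stream factory; the helper names are mine, not a real API):

```javascript
// Sketch of the four steps above with a stubbed stream factory; the names
// (makeUtteranceManager, openStream) are illustrative, not a real API.
// Rule being illustrated: the first request on each stream carries only
// the config, later ones carry only audio, and an isFinal result tears
// the stream down and starts a fresh one.
function makeUtteranceManager(openStream) {
  let stream = null;
  const manager = {
    onOpen() {
      stream = openStream();                                  // step 1
      stream.write({ streamingConfig: { singleUtterance: true } });
    },
    onBinary(chunk) {
      stream.write({ audioContent: chunk });                  // step 2
    },
    onResult(result) {
      if (result.isFinal) {                                   // step 3
        stream.end();
        manager.onOpen();                                     // step 4: restart
      }
    },
  };
  return manager;
}
```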
The main problem is setting up a stream over which to send the speech requests. I found some code that can help you with the WebSocket and speech integration; take a look at the function that manages the Google Speech request:
function startRecognitionStream(client, data) {
    recognizeStream = speechClient.streamingRecognize(request)
        .on('error', console.error)
        .on('data', (data) => {
            process.stdout.write(
                (data.results[0] && data.results[0].alternatives[0])
                    ? `Transcription: ${data.results[0].alternatives[0].transcript}\n`
                    : `\n\nReached transcription time limit, press Ctrl+C\n`);
            client.emit('speechData', data);

            // if end of utterance, let's restart stream
            // this is a small hack. After 65 seconds of silence, the stream will still throw an error for speech length limit
            if (data.results[0] && data.results[0].isFinal) {
                stopRecognitionStream();
                startRecognitionStream(client);
                // console.log('restarted stream serverside');
            }
        });
}
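The excerpt relies on a request object and a stopRecognitionStream helper that are not shown. A plausible reconstruction looks like this (the exact values are assumptions; the encoding and sample rate are carried over from the question's C# config):

```javascript
// Reconstruction of the pieces the excerpt does not show; the config
// values are assumptions taken from the question (LINEAR16, 16 kHz).
const request = {
  config: {
    encoding: 'LINEAR16',
    sampleRateHertz: 16000,
    languageCode: 'en-US',
  },
  interimResults: true, // stream partial results so isFinal can be observed
};

let recognizeStream = null; // assigned by startRecognitionStream()

function stopRecognitionStream() {
  if (recognizeStream) {
    recognizeStream.end(); // half-close the request stream toward the API
  }
  recognizeStream = null;
}
```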
Keep in mind that poor audio quality gives poor results; try to follow the recommendations about audio.
Credit should go to the developer (Vinzenz Aubry), since his/her program works nicely.