Python 3.x: error with the enable_speaker_diarization flag in Google Cloud Speech-to-Text


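Using Google Speech-to-Text, I can transcribe an audio clip with the default parameters. However, when I use the enable_speaker_diarization flag to identify the individual speakers in the clip, I get an error message. Google documents this flag. The clip is long, so I am using the asynchronous request that Google recommends.

My code: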

def transcribe_gcs(gcs_uri):
    from google.cloud import speech
    from google.cloud import speech_v1 as speech
    from google.cloud.speech import enums
    from google.cloud.speech import types
    client = speech.SpeechClient()
    audio = types.RecognitionAudio(uri=gcs_uri)
    config = speech.types.RecognitionConfig(encoding=speech.enums.RecognitionConfig.AudioEncoding.FLAC,
                                            sample_rate_hertz=16000,
                                            language_code='en-US',
                                            enable_speaker_diarization=True,
                                            diarization_speaker_count=2)

    operation = client.long_running_recognize(config, audio)
    print('Waiting for operation to complete...')
    response = operation.result(timeout=3000)
    result = response.results[-1]

    words_info = result.alternatives[0].words

    for word_info in words_info:
        print("word: '{}', speaker_tag: {}".format(word_info.word, word_info.speaker_tag))

When I call it with:

transcribe_gcs('gs://bucket_name/filename.flac')

I get this error:

ValueError: Protocol message RecognitionConfig has no "enable_speaker_diarization" field.

I am sure this has to do with the library. I have tried every import variant I could find, such as:

from google.cloud import speech_v1p1beta1 as speech
from google.cloud import speech

But I always get the same error.

Note: I authenticated with the JSON key file before running this code.

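Currently, the enable_speaker_diarization=True parameter of speech.types.RecognitionConfig is available only in the speech_v1p1beta1 library, so you need to import that library, not the default speech one, to use the parameter. I made some modifications to your code and it works fine. Note that you need to use a service account to run this code.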

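def transcribe_gcs(gcs_uri):
    from google.cloud import speech_v1p1beta1 as speech
    from google.cloud.speech_v1p1beta1 import enums
    from google.cloud.speech_v1p1beta1 import types
    client = speech.SpeechClient()
    audio = types.RecognitionAudio(uri=gcs_uri)
    config = speech.types.RecognitionConfig(language_code='en-US',
                                            enable_speaker_diarization=True,
                                            diarization_speaker_count=2)
    operation = client.long_running_recognize(config, audio)
    print('Waiting for operation to complete...')
    response = operation.result(timeout=3000)
    # With diarization enabled, the last result holds all words with speaker tags.
    result = response.results[-1]

    words_info = result.alternatives[0].words

    tag = 1
    speaker = ""

    for word_info in words_info:
        if word_info.speaker_tag == tag:
            # Same speaker: keep appending words to the current turn.
            speaker = speaker + " " + word_info.word
        else:
            # Speaker changed: print the finished turn and start a new one.
            print("speaker {}: {}".format(tag, speaker))
            tag = word_info.speaker_tag
            speaker = "" + word_info.word

    # Print the final turn, which the loop never flushes.
    print("speaker {}: {}".format(tag, speaker))

The result should be one line per speaker turn, in the form "speaker N: words".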
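The grouping step in the answer can be tried without calling the API. The following is a minimal sketch, not the answerer's code: `Word` is a hypothetical stand-in that models only the `word` and `speaker_tag` attributes used above, and `group_by_speaker` is an illustrative helper name. It uses `itertools.groupby`, which emits the last run of words on its own, so no trailing print is needed after the loop.

```python
from collections import namedtuple
from itertools import groupby

# Hypothetical stand-in for the API's word objects; only the two
# attributes used in the answer (word, speaker_tag) are modelled.
Word = namedtuple("Word", ["word", "speaker_tag"])


def group_by_speaker(words_info):
    """Return a list of (speaker_tag, text) pairs, one per speaker turn."""
    turns = []
    # groupby yields one group per run of consecutive equal speaker tags,
    # including the final run, so nothing is left unflushed at the end.
    for tag, run in groupby(words_info, key=lambda w: w.speaker_tag):
        turns.append((tag, " ".join(w.word for w in run)))
    return turns


# Made-up sample data for illustration only.
sample = [
    Word("hello", 1), Word("there", 1),
    Word("hi", 2),
    Word("how", 1), Word("are", 1), Word("you", 1),
]

for tag, text in group_by_speaker(sample):
    print("speaker {}: {}".format(tag, text))
```

With a real response, the same helper could presumably be passed `result.alternatives[0].words` in place of `sample`, assuming those objects expose `word` and `speaker_tag` as in the answer.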

The cause of the error is similar for Node.js users. Import the beta features with this call, then use the speaker diarization feature:

const speech = require('@google-cloud/speech').v1p1beta1;

The error occurs because you have not imported the right modules. To fix it, use the following imports:

from google.cloud import speech_v1p1beta1 as speech
from google.cloud.speech_v1p1beta1 import enums
from google.cloud.speech_v1p1beta1 import types

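This works fine, but the output is not what I want. Can you help me get the response in the following format? Speaker 1: words spoken by speaker 1. Speaker 2: words spoken by speaker 2.

@AsifShaikh I edited my answer to meet your format requirements. Let me know if this works for you.

Works perfectly. Thanks!

Using your code, @AlexRiquelme, I only get the first 2 lines of output, and at other times I only get "speaker 1:" without any words. I am using commercial_mono.wav from the Google sample code, with the same code as above.

This code needs an additional print("speaker {}: {}".format(tag, speaker)) after the loop. Otherwise it will not print the last sentence.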