How to set Google Cloud Speech's single_utterance to true for the Speech-to-Text API using Python SpeechRecognition

Tags: python, google-api, python-3.6, speech-recognition, speech-to-text

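I am writing a simple calculator using the Google API and need to store the text recognized from streaming audio input so that I can process it further. The API has a single_utterance configuration option that looks very useful, but I cannot find a way to set it.

The Cloud Speech-to-Text documentation says:

Set the single_utterance field to true in the StreamingRecognitionConfig object.

I am working from the SpeechRecognition sample code and the Recognizer class from the same repository. Both are listed at the end of this post.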
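
For orientation, here is a minimal sketch of what setting that field looks like with Google's own google-cloud-speech client, which is a separate package from SpeechRecognition. The helper name build_streaming_config and the LINEAR16 / 16 kHz settings are assumptions chosen for illustration and do not come from the question; the package must be installed for the function to run.

```python
def build_streaming_config(language_code="en-US", sample_rate_hertz=16000):
    """Build a streaming config with single_utterance enabled (illustrative helper)."""
    # Imported lazily: this needs the separate google-cloud-speech package,
    # not the SpeechRecognition package used elsewhere in this post.
    from google.cloud import speech

    recognition_config = speech.RecognitionConfig(
        # Assumed audio format: 16-bit linear PCM at the given sample rate.
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=sample_rate_hertz,
        language_code=language_code,
    )
    return speech.StreamingRecognitionConfig(
        config=recognition_config,
        single_utterance=True,  # the field the documentation refers to
    )
```

The resulting object is what the client's streaming_recognize call expects as its configuration, alongside an iterator of audio requests. Nothing here goes through recognize_google().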

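The recognize_google() function in the Recognizer class does not seem to accept any parameter that would set the single_utterance field to true.

Apart from the GitHub repository, Google's documentation does not really cover the SpeechRecognition library.

I tried changing the key so that I connect with my own API key and can watch the requests from the console, but I ran into problems there as well.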
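
The sample code's comments say a custom key is passed as `r.recognize_google(audio, key=...)`. A minimal sketch of doing that without hard-coding the key follows; the helper name and the environment variable name are hypothetical, and `recognizer` is assumed to be an `sr.Recognizer` instance.

```python
import os


def recognize_with_own_key(recognizer, audio,
                           env_var="GOOGLE_SPEECH_RECOGNITION_API_KEY"):
    """Call recognize_google() with a key read from the environment.

    `env_var` is a hypothetical variable name picked for this sketch.
    """
    key = os.environ.get(env_var)
    if not key:
        # Fail loudly instead of silently falling back to the generic key.
        raise RuntimeError("set the {} environment variable first".format(env_var))
    return recognizer.recognize_google(audio, key=key)
```

Failing when the variable is missing makes it obvious whether requests are really going out under your own key.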

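@Belegnar, here it is on GitHub. This is part of the recognize_google() function: url = "http://www.google.com/speech-api/v2/recognize?{}".format(urlencode({"client": "chromium", "lang": language, "key": key, "pFilter": pfilter})). In the documentation I found StreamingRecognitionConfig, which includes single_utterance. It looks like both the URL and the configuration in the library would have to change. Question: what would the URL encoding look like for streaming recognition with single_utterance? Do I really need to change the library code?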
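
To see what that URL carries, the query string can be rebuilt with the standard library alone. This is only a sketch: build_recognize_url is a hypothetical helper that mirrors the four parameters in the quoted library code, and YOUR_API_KEY is a placeholder.

```python
from urllib.parse import urlencode


def build_recognize_url(key, language="en-US", pfilter=0):
    """Rebuild the request URL the way the quoted recognize_google() code does."""
    params = {
        "client": "chromium",
        "lang": language,
        "key": key,
        "pFilter": pfilter,
    }
    return "http://www.google.com/speech-api/v2/recognize?{}".format(urlencode(params))


# Placeholder key, for illustration only.
print(build_recognize_url("YOUR_API_KEY"))
```

The query string holds only client, lang, key and pFilter. The single_utterance field belongs to the StreamingRecognitionConfig message of Cloud Speech-to-Text, so it is presumably not something this URL can express.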

The SpeechRecognition sample code:

#!/usr/bin/env python3

# NOTE: this example requires PyAudio because it uses the Microphone class

from threading import Thread
try:
    from queue import Queue  # Python 3 import
except ImportError:
    from Queue import Queue  # Python 2 import

import speech_recognition as sr


r = sr.Recognizer()
audio_queue = Queue()


def recognize_worker():
    # this runs in a background thread
    while True:
        audio = audio_queue.get()  # retrieve the next audio processing job from the main thread
        if audio is None: break  # stop processing if the main thread is done

        # received audio data, now we'll recognize it using Google Speech Recognition
        try:
            # for testing purposes, we're just using the default API key
            # to use another API key, use `r.recognize_google(audio, key="GOOGLE_SPEECH_RECOGNITION_API_KEY")`
            # instead of `r.recognize_google(audio)`
            print("Google Speech Recognition thinks you said " + r.recognize_google(audio))
        except sr.UnknownValueError:
            print("Google Speech Recognition could not understand audio")
        except sr.RequestError as e:
            print("Could not request results from Google Speech Recognition service; {0}".format(e))

        audio_queue.task_done()  # mark the audio processing job as completed in the queue


# start a new thread to recognize audio, while this thread focuses on listening
recognize_thread = Thread(target=recognize_worker)
recognize_thread.daemon = True
recognize_thread.start()
with sr.Microphone() as source:
    try:
        while True:  # repeatedly listen for phrases and put the resulting audio on the audio processing job queue
            audio_queue.put(r.listen(source))
    except KeyboardInterrupt:  # allow Ctrl + C to shut down the program
        pass

audio_queue.join()  # block until all current audio processing jobs are done
audio_queue.put(None)  # tell the recognize_thread to stop
recognize_thread.join()  # wait for the recognize_thread to actually stop
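
Since the goal is to store the recognized text for further processing rather than print it, the worker above can be adapted to push transcripts onto a second queue. A minimal sketch under that assumption: start_transcript_worker, transcribe and results are names invented here, and transcribe can be any callable, for example r.recognize_google.

```python
from queue import Queue
from threading import Thread


def start_transcript_worker(transcribe, audio_queue, results, ignored_errors=()):
    """Run `transcribe` on queued audio in the background and store each transcript."""
    def worker():
        while True:
            audio = audio_queue.get()
            if audio is None:  # sentinel: the producer is done
                audio_queue.task_done()
                break
            try:
                results.put(transcribe(audio))
            except ignored_errors:
                pass  # skip phrases that could not be transcribed
            finally:
                audio_queue.task_done()

    thread = Thread(target=worker, daemon=True)
    thread.start()
    return thread


# With the SpeechRecognition library this could be started as, for example:
#   start_transcript_worker(r.recognize_google, audio_queue, results,
#                           ignored_errors=(sr.UnknownValueError, sr.RequestError))
```

The calculator logic can then read finished transcripts from results with results.get() while the microphone loop keeps feeding audio_queue.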

The recognize_google() method of the Recognizer class, from the same repository:

    def recognize_google(self, audio_data, key=None, language="en-US", pfilter=0, show_all=False):
        """
        Performs speech recognition on ``audio_data`` (an ``AudioData`` instance), using the Google Speech Recognition API.
        The Google Speech Recognition API key is specified by ``key``. If not specified, it uses a generic key that works out of the box. This should generally be used for personal or testing purposes only, as it **may be revoked by Google at any time**.
        To obtain your own API key, simply following the steps on the `API Keys <http://www.chromium.org/developers/how-tos/api-keys>`__ page at the Chromium Developers site. In the Google Developers Console, Google Speech Recognition is listed as "Speech API".
        The recognition language is determined by ``language``, an RFC5646 language tag like ``"en-US"`` (US English) or ``"fr-FR"`` (International French), defaulting to US English. A list of supported language tags can be found in this `StackOverflow answer <http://stackoverflow.com/a/14302134>`__.
        The profanity filter level can be adjusted with ``pfilter``: 0 - No filter, 1 - Only shows the first character and replaces the rest with asterisks. The default is level 0.
        Returns the most likely transcription if ``show_all`` is false (the default). Otherwise, returns the raw API response as a JSON dictionary.
        Raises a ``speech_recognition.UnknownValueError`` exception if the speech is unintelligible. Raises a ``speech_recognition.RequestError`` exception if the speech recognition operation failed, if the key isn't valid, or if there is no internet connection.
        """
        assert isinstance(audio_data, AudioData), "``audio_data`` must be audio data"
        assert key is None or isinstance(key, str), "``key`` must be ``None`` or a string"
        assert isinstance(language, str), "``language`` must be a string"

        flac_data = audio_data.get_flac_data(
            convert_rate=None if audio_data.sample_rate >= 8000 else 8000,  # audio samples must be at least 8 kHz
            convert_width=2  # audio samples must be 16-bit
        )
        if key is None: key = "AIzaSyBOti4mM-6x9WDnZIjIeyEU21OpBXqWBgw"
        url = "http://www.google.com/speech-api/v2/recognize?{}".format(urlencode({
            "client": "chromium",
            "lang": language,
            "key": key,
            "pFilter": pfilter
        }))
        request = Request(url, data=flac_data, headers={"Content-Type": "audio/x-flac; rate={}".format(audio_data.sample_rate)})

        # obtain audio transcription results
        try:
            response = urlopen(request, timeout=self.operation_timeout)
        except HTTPError as e:
            raise RequestError("recognition request failed: {}".format(e.reason))
        except URLError as e:
            raise RequestError("recognition connection failed: {}".format(e.reason))
        response_text = response.read().decode("utf-8")

        # ignore any blank blocks
        actual_result = []
        for line in response_text.split("\n"):
            if not line: continue
            result = json.loads(line)["result"]
            if len(result) != 0:
                actual_result = result[0]
                break

        # return results
        if show_all: return actual_result
        if not isinstance(actual_result, dict) or len(actual_result.get("alternative", [])) == 0: raise UnknownValueError()

        if "confidence" in actual_result["alternative"]:
            # return alternative with highest confidence score
            best_hypothesis = max(actual_result["alternative"], key=lambda alternative: alternative["confidence"])
        else:
            # when there is no confidence available, we arbitrarily choose the first hypothesis.
            best_hypothesis = actual_result["alternative"][0]
        if "transcript" not in best_hypothesis: raise UnknownValueError()
        return best_hypothesis["transcript"]