How to use pocketsphinx (5prealpha) with gstreamer-1.0 in Python?


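I'm trying to create a small Python script that receives an audio stream over the network, feeds it through pocketsphinx to convert speech to text, and runs some commands based on the pocketsphinx output.

I have installed sphinxbase and pocketsphinx (5prealpha) on an Ubuntu 15.10 VM, and I can correctly process the contents of the example audio files (part of the pocketsphinx installation) in Python, so I'm reasonably sure my sphinx installation works. Unfortunately, the test Python scripts use the native pocketsphinx API and can't handle continuous audio. According to the cmusphinx website, I should use gstreamer for continuous recognition. Unfortunately, information on how to use pocketsphinx with gstreamer in Python is fairly limited. Based on the examples I could find, I pieced together the following script: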

import gi
gi.require_version('Gst', '1.0')
from gi.repository import GObject, Gst
GObject.threads_init()
Gst.init(None)

def element_message(bus, msg):
    # Only element messages posted by pocketsphinx carry a hypothesis.
    structure = msg.get_structure()
    if structure.get_name() != 'pocketsphinx':
        return
    print("hypothesis= '%s'  confidence=%s\n" % (structure.get_value('hypothesis'), structure.get_value('confidence')))

pipeline = Gst.parse_launch('udpsrc port=3000 name=src caps=application/x-rtp ! rtppcmadepay name=rtpp ! alawdec name=decoder ! queue ! pocketsphinx name=asr ! fakesink')

asr = pipeline.get_by_name("asr")
asr.set_property("configured", "true")

bus = pipeline.get_bus()
bus.add_signal_watch()
bus.connect('message::element', element_message)

pipeline.set_state(Gst.State.PLAYING)

# enter into a mainloop
loop = GObject.MainLoop()
loop.run()

The sending side looks like this:

import gobject, pygst
pygst.require("0.10")
import gst

pipeline = gst.parse_launch('alsasrc ! audioconvert ! audioresample ! alawenc ! rtppcmapay ! udpsink  port=3000 host=192.168.13.120')
pipeline.set_state(gst.STATE_PLAYING)
loop = gobject.MainLoop()
loop.run()

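This should receive the UDP stream from the network, feed it into pocketsphinx, and print the output to the terminal. If I replace the 'queue ! pocketsphinx ! fakesink' part with 'wavenc ! filesink', I do get a valid audio file with the correct content, so I know the network sending part works. (My test machine has no audio, so I can't test with a local audio source.)

When I start the script, I see the pocketsphinx configuration scroll by, but after that the script no longer seems to do anything. When I start the script with GST_DEBUG=*:4, I see the following output: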

0:00:04.789157687  2220      0x86fff70 INFO               GST_EVENT gstevent.c:760:gst_event_new_segment: creating segment event time segment start=0:00:00.000000000, offset=0:00:00.000000000, stop=99:99:99.999999999, rate=1.000000, applied_rate=1.000000, flags=0x00, time=0:00:00.000000000, base=0:00:00.000000000, position 0:00:00.000000000, duration 99:99:99.999999999
0:00:04.789616981  2220      0x86fff70 INFO                 basesrc gstbasesrc.c:2838:gst_base_src_loop:<src> marking pending DISCONT
0:00:04.789995780  2220      0x86fff70 INFO               GST_EVENT gstevent.c:760:gst_event_new_segment: creating segment event time segment start=0:00:00.000000000, offset=0:00:00.000000000, stop=99:99:99.999999999, rate=1.000000, applied_rate=1.000000, flags=0x00, time=0:00:00.000000000, base=0:00:00.000000000, position 0:00:04.079311489, duration 99:99:99.999999999
0:00:04.790420834  2220      0x86fff70 INFO               GST_EVENT gstevent.c:679:gst_event_new_caps: creating caps event audio/x-raw, format=(string)S16LE, layout=(string)interleaved, rate=(int)8000, channels=(int)1
0:00:04.790851965  2220      0x86fff70 WARN                GST_PADS gstpad.c:3989:gst_pad_peer_query:<decoder:src> could not send sticky events
0:00:04.791258320  2220      0x86fff70 WARN                 basesrc gstbasesrc.c:2943:gst_base_src_loop:<src> error: Internal data flow error.
0:00:04.791572605  2220      0x86fff70 WARN                 basesrc gstbasesrc.c:2943:gst_base_src_loop:<src> error: streaming task paused, reason not-negotiated (-4)
0:00:04.791917073  2220      0x86fff70 INFO        GST_ERROR_SYSTEM gstelement.c:1837:gst_element_message_full:<src> posting message: Internal data flow error.
0:00:04.792305347  2220      0x86fff70 INFO        GST_ERROR_SYSTEM gstelement.c:1860:gst_element_message_full:<src> posted error message: Internal data flow error.
0:00:04.792633841  2220      0x86fff70 INFO                    task gsttask.c:315:gst_task_func:<src:src> Task going to paused
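The telling lines in a log like this are the WARN entries: the caps event announces rate=(int)8000, and right after it the stream stops with not-negotiated, which means a downstream element refused those caps. As a rough sketch for picking such lines out of a longer GST_DEBUG dump, the following filters by log level. The helper name `find_problems` is hypothetical, and the regular expression assumes plain, uncolored output in the column layout shown above.

```python
import re

# Leading columns of a GST_DEBUG line: timestamp, pid, thread, level, category.
LINE_RE = re.compile(
    r"^(?P<ts>\d+:\d{2}:\d{2}\.\d+)\s+(?P<pid>\d+)\s+(?P<thread>0x[0-9a-fA-F]+)"
    r"\s+(?P<level>[A-Z]+)\s+(?P<category>\S+)\s+(?P<rest>.*)$"
)


def find_problems(log_text, levels=("WARN", "ERROR")):
    """Return (category, message) pairs for log lines at the given levels."""
    problems = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("level") in levels:
            problems.append((match.group("category"), match.group("rest")))
    return problems


# Two lines copied from the log above: one INFO, one WARN.
SAMPLE = """\
0:00:04.790420834  2220      0x86fff70 INFO               GST_EVENT gstevent.c:679:gst_event_new_caps: creating caps event audio/x-raw, format=(string)S16LE, layout=(string)interleaved, rate=(int)8000, channels=(int)1
0:00:04.791572605  2220      0x86fff70 WARN                 basesrc gstbasesrc.c:2943:gst_base_src_loop:<src> error: streaming task paused, reason not-negotiated (-4)
"""

for category, message in find_problems(SAMPLE):
    print("%s: %s" % (category, message))
```

Only the WARN line survives the filter, which points straight at the failed negotiation after the 8000 Hz caps.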


Based on the information and examples I found on Google, I don't understand what is going wrong.

Any help would be appreciated.

Nico

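The GStreamer element requires 16000 Hz audio, and you are trying to push 8000 Hz through it. To enable 8000 Hz in the pocketsphinx element you would have to modify the pocketsphinx source: you need to update the rate in the element spec, the samprate configuration parameter of pocketsphinx, and the acoustic model.

Alternatively, you need to send wideband audio over the network. In that case you should not use the alaw codec.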
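To get a feel for what switching from A-law to wideband audio costs on the network, here is a back-of-the-envelope sketch. It counts only the raw audio payload and ignores RTP/UDP/IP header overhead. The function name `payload_bitrate` is made up for this illustration, and the wideband case assumes uncompressed 16-bit linear PCM (L16) in mono.

```python
def payload_bitrate(sample_rate_hz, bits_per_sample, channels=1):
    """Raw audio payload rate in bits per second, excluding packet headers."""
    return sample_rate_hz * bits_per_sample * channels


# A-law (PCMA): 8000 Hz, 8 bits per sample, mono.
narrowband = payload_bitrate(8000, 8)
# Uncompressed 16-bit PCM at 16000 Hz, mono (an assumed wideband format).
wideband = payload_bitrate(16000, 16)

print("PCMA  8000 Hz: %d bit/s" % narrowband)
print("L16  16000 Hz: %d bit/s" % wideband)
print("ratio: %.1fx" % (wideband / float(narrowband)))
```

Under these assumptions the wideband stream carries four times the payload of the A-law stream, which is usually not a concern on a local network.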

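Setting up the gstreamer pipeline correctly turned out to be a bit of a hassle. As Nikolay pointed out, pocketsphinx 5 requires 16000 Hz audio by default, and unfortunately sending 16000 Hz audio over the network with gstreamer was not straightforward for me. So in case you happen to be looking for something similar, here is what finally worked for me:

Sender: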

import gi
gi.require_version('Gst', '1.0')
from gi.repository import GObject, Gst
GObject.threads_init()
Gst.init(None)

pipeline = Gst.parse_launch('alsasrc ! audioconvert ! audio/x-raw,channels=1,depth=16,width=16,rate=16000 ! rtpL16pay  ! udpsink port=3000 host=192.168.13.120')    
pipeline.set_state(Gst.State.PLAYING)

loop = GObject.MainLoop()
loop.run()

Receiver:

import gi
gi.require_version('Gst', '1.0')
from gi.repository import GObject, Gst
GObject.threads_init()
Gst.init(None)

def element_message(bus, msg):
    # Ignore element messages that do not come from pocketsphinx.
    structure = msg.get_structure()
    if structure.get_name() != 'pocketsphinx':
        return
    print("hypothesis= '%s'  confidence=%s\n" % (structure.get_value('hypothesis'), structure.get_value('confidence')))

pipeline = Gst.parse_launch('udpsrc port=3000 name=src ! application/x-rtp,media=(string)audio, clock-rate=(int)16000, width=16, height=16, encoding-name=(string)L16, encoding-params=(string)1, channels=(int)1, channel-positions=(int)1, payload=(int)96 ! rtpL16depay ! audioconvert ! pocketsphinx name=asr ! fakesink')
asr = pipeline.get_by_name("asr")
asr.set_property("dict", "/usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict")
asr.set_property("lm","/usr/local/share/pocketsphinx/model/en-us/en-us.lm.bin")
asr.set_property("hmm","/usr/local/share/pocketsphinx/model/en-us/en-us/")
asr.set_property("configured", "true")

bus = pipeline.get_bus()
bus.add_signal_watch()
bus.connect('message::element', element_message)

pipeline.set_state(Gst.State.PLAYING)

loop = GObject.MainLoop()
loop.run()
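The question also asks about running commands based on the recognized text. One possible way to do that, sketched here under assumptions, is a lookup from spoken phrases to command lines. The `COMMANDS` table, its phrases, and the helper `command_for` are all hypothetical placeholders, not part of pocketsphinx or gstreamer.

```python
import shlex

# Hypothetical phrase-to-command table; phrases and commands are placeholders.
COMMANDS = {
    "lights on": "echo turning lights on",
    "lights off": "echo turning lights off",
}


def command_for(hypothesis, commands=COMMANDS):
    """Return an argv list for the first phrase found in the hypothesis, or None."""
    # Normalize case and whitespace so that "LIGHTS   on" still matches.
    text = " ".join((hypothesis or "").lower().split())
    for phrase, command in commands.items():
        if phrase in text:
            return shlex.split(command)
    return None


print(command_for("turn the LIGHTS   on"))
print(command_for("hello world"))
```

Inside `element_message`, the hypothesis string could be passed to `command_for` and a non-None result handed to `subprocess.call`. Returning an argv list rather than a shell string keeps recognized speech from ever being interpreted by a shell.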

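Doh, I was sure pocketsphinx used 8000 Hz. Unfortunately, rtppcmapay doesn't seem to support anything other than 8000 Hz. But at least I got pocketsphinx to do something using the following pipelines: on the sending side 'alsasrc ! audioconvert ! audio/x-raw,rate=16000 ! rtpL16pay ! udpsink port=3000 host=192.168.13.120', and on the receiving side 'udpsrc port=3000 name=src caps=application/x-rtp ! rtpL16depay ! audioconvert ! audioresample ! pocketsphinx name=asr ! fakesink'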