Python pyaudio: "listen" until voice is detected, then record to a .wav file
I'm having some problems and I can't seem to get my head around the concept.

What I'm trying to do is this: have the microphone "listen" for voice (above a particular threshold), and then start recording to a .wav file until the person has stopped speaking / the signal is no longer there. For example:
begin:
listen() -> nothing is being said
listen() -> nothing is being said
listen() -> VOICED - _BEGIN RECORDING_
listen() -> VOICED - _BEGIN RECORDING_
listen() -> UNVOICED - _END RECORDING_
end
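I also want to do this using "threading", so that one thread is created that constantly "listens", and another thread starts when there is voice data. But I can't for the life of me figure out how to go about it. Here is my code so far: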
import wave
import sys
import threading
from array import array
from sys import byteorder

try:
    import pyaudio
    CHECK_PYLIB = True
except ImportError:
    CHECK_PYLIB = False


class Audio:
    _chunk = 0.0
    _format = 0.0
    _channels = 0.0
    _rate = 0.0
    record_for = 0.0
    stream = None
    p = None
    sample_width = None
    THRESHOLD = 500

    # initial constructor to accept params
    def __init__(self, chunk, format, channels, rate):
        #### set data-types
        self._chunk = chunk
        self.format = pyaudio.paInt16  # no trailing comma, which would make this a tuple
        self.channels = channels
        self.rate = rate
        self.p = pyaudio.PyAudio()

    def open(self):
        # print "opened"
        self.stream = self.p.open(format=pyaudio.paInt16,
                                  channels=2,
                                  rate=44100,
                                  input=True,
                                  frames_per_buffer=1024)
        return True

    def record(self):
        # create a new instance/thread to record the sound
        threading.Thread(target=self.listen).start()

    def is_silence(self, snd_data):
        return max(snd_data) < self.THRESHOLD

    def listen(self):
        r = array('h')
        while True:
            snd_data = array('h', self.stream.read(self._chunk))
            if byteorder == 'big':
                snd_data.byteswap()
            r.extend(snd_data)
        return self.sample_width, r
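Now, after every 5 seconds, I need the "process" function to execute and process the data (time.delay(10)) while this is going on, and then start the recording back up.

Having spent some time on it, I've come up with the following code, which seems to do what you need, except for writing to a file: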
import threading
from array import array
from queue import Queue, Full, Empty  # the module is named "Queue" on Python 2

import pyaudio

CHUNK_SIZE = 1024
MIN_VOLUME = 500
# if the recording thread can't consume fast enough, the listener will start discarding
BUF_MAX_SIZE = CHUNK_SIZE * 10


def main():
    stopped = threading.Event()
    q = Queue(maxsize=BUF_MAX_SIZE // CHUNK_SIZE)

    listen_t = threading.Thread(target=listen, args=(stopped, q))
    listen_t.start()
    record_t = threading.Thread(target=record, args=(stopped, q))
    record_t.start()

    try:
        while True:
            listen_t.join(0.1)
            record_t.join(0.1)
    except KeyboardInterrupt:
        stopped.set()

    listen_t.join()
    record_t.join()


def record(stopped, q):
    while True:
        if stopped.wait(timeout=0):
            break
        try:
            # time out so the stop flag is re-checked even when no audio arrives
            chunk = q.get(timeout=0.1)
        except Empty:
            continue
        vol = max(chunk)
        if vol >= MIN_VOLUME:
            # TODO: write to file
            print("O", end=" ", flush=True)
        else:
            print("-", end=" ", flush=True)


def listen(stopped, q):
    stream = pyaudio.PyAudio().open(
        format=pyaudio.paInt16,
        channels=2,
        rate=44100,
        input=True,
        frames_per_buffer=1024,
    )

    while True:
        if stopped.wait(timeout=0):
            break
        try:
            # put_nowait raises Full instead of blocking when the queue is full
            q.put_nowait(array('h', stream.read(CHUNK_SIZE)))
        except Full:
            pass  # discard


if __name__ == '__main__':
    main()
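The "TODO: write to file" step above could be filled in along these lines. This is only a sketch using the standard-library wave module, not part of the original answer: the function name save_utterance and the MAX_SILENT_CHUNKS cutoff are made up for illustration, and the channel count, sample width and rate are assumed to match the stream settings used above (paInt16, 2 channels, 44100 Hz). It starts collecting at the first chunk whose peak reaches the threshold and stops after a run of quiet chunks.

```python
import sys
import wave
from array import array

CHANNELS = 2            # assumed to match the stream opened in listen()
SAMPLE_WIDTH = 2        # bytes per sample for 16-bit audio (paInt16)
RATE = 44100
MIN_VOLUME = 500
MAX_SILENT_CHUNKS = 20  # hypothetical cutoff: stop after this many quiet chunks in a row


def save_utterance(chunks, path, min_volume=MIN_VOLUME,
                   max_silent_chunks=MAX_SILENT_CHUNKS):
    """Write the first loud stretch found in `chunks` to a .wav file at `path`.

    `chunks` is any iterable of array('h') blocks. Returns the number of
    chunks written, or 0 if no chunk reached `min_volume`.
    """
    recorded = array('h')
    started = False
    written = 0
    silent_run = 0
    for chunk in chunks:
        voiced = max(chunk) >= min_volume
        if not started:
            if not voiced:
                continue            # still waiting for the first loud chunk
            started = True
        recorded.extend(chunk)
        written += 1
        silent_run = 0 if voiced else silent_run + 1
        if silent_run >= max_silent_chunks:
            break                   # the speaker appears to have stopped
    if not started:
        return 0
    if sys.byteorder == 'big':
        recorded.byteswap()         # WAV data is little-endian
    with wave.open(path, 'wb') as wf:
        wf.setnchannels(CHANNELS)
        wf.setsampwidth(SAMPLE_WIDTH)
        wf.setframerate(RATE)
        wf.writeframes(recorded.tobytes())
    return written
```

Inside record(), the chunks could come from a small generator that keeps calling q.get() until the stop event is set, so the same function works on live audio and on canned test data.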
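Look here:

It even converts the WAV to FLAC and sends it to the Google Speech API; if you don't need that, just remove the stt_google_Wav function ;)

Strongly recommended reading :)

Question: have you ever actually instantiated more than one Audio object? I ask because I don't quite see why you put the code in a class. I'm getting a Java-for-beginners vibe, where everything is always required to be object-oriented just for the sake of it.

@ErikAllik I have to admit, I'm new to Python :(

That's pretty obvious; that's why I referred you to PEP8.

@ErikAllik I'll take a look :) But as far as the question goes... any ideas?

Thanks for your reply. I have used the code you gave me; however, my circumstances have changed and I have tried to adapt it. (See the update above ^^) I can't seem to figure out how to execute "process()" after every 10 seconds, finish processing, and then start recording again. Any suggestions? Thank you!

First of all, I have answered your original question; secondly, I don't really understand what your changed circumstances are, because you haven't described them very clearly. Even the code you pasted is broken, because you didn't take the time to fix the indentation/formatting. In fact, I have no idea what you did to my original snippet; you've completely messed it up. I think you should read a beginner's Python/programming book or something.

This answer is awesome. I just learned multithreading in Python.

@JakeStewart glad to have been of use!

This is the right idea, but note that the detection method it uses is rather primitive. It just checks the microphone level and assumes someone is speaking if there is enough sound. In reality, speech occupies a certain frequency range, so in any environment with even a little noise the code will produce a lot of false positives. A better solution would be to use an FFT, start recording once a signal is detected, and measure the spectral flatness of the sound to tell whether it is voiced or noise.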
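The spectral-flatness idea from the last comment could look roughly like this. It is an illustrative sketch, not code from the thread: the function names and the 0.3 cutoff are assumptions, and it uses a naive O(n^2) DFT from the standard library only so that it has no dependencies (in practice an FFT routine such as numpy.fft.rfft would be the sensible choice). Spectral flatness is the geometric mean of the power spectrum divided by its arithmetic mean: it is close to 0 for a tonal signal and much higher for broadband noise.

```python
import cmath
import math


def power_spectrum(samples):
    """Naive DFT power for bins 1 .. n/2 - 1 (the DC bin is skipped)."""
    n = len(samples)
    spectrum = []
    for k in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_flatness(samples, eps=1e-12):
    """Geometric mean of the power spectrum divided by its arithmetic mean."""
    power = [p + eps for p in power_spectrum(samples)]  # eps avoids log(0)
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


def is_probably_tonal(samples, max_flatness=0.3):
    """Hypothetical decision rule; the 0.3 cutoff is an untuned assumption."""
    return spectral_flatness(samples) <= max_flatness
```

A chunk would then count as voice only if it is both loud enough (the MIN_VOLUME check) and tonal enough. Note that an all-zero chunk has a perfectly flat spectrum under this definition, so the volume check still has to come first; the cutoff would need tuning against real recordings.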