Recording microphone and speaker sound via loopback ("What you hear") in Python using PyAudio

python, loopback, pyaudio, wasapi

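I am trying to write a program that uses PyAudio to capture both my microphone and everything I can hear through my speakers. I first create one stream that records from the microphone, and a second stream in loopback mode that records the speaker output. As we know, the official PyAudio build cannot record output. Windows Vista and later, however, introduced a new API, WASAPI, which can open a stream to an output device in loopback mode. In that mode the stream behaves like an input stream and can record the outgoing audio, so that is the approach I took.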
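To make the input-versus-loopback decision described above concrete, here is a minimal sketch. The helper name `pick_stream_mode` is hypothetical; it only assumes the `maxInputChannels` and `maxOutputChannels` keys that PyAudio's device-info dictionaries expose, plus the host API name as a string. It does not open any stream itself.

```python
def pick_stream_mode(device_info, host_api_name):
    """Decide how a device could be recorded (hypothetical helper).

    Returns "input" for a normal capture device, "loopback" for a
    WASAPI output device, or None if neither applies.
    """
    if device_info.get("maxInputChannels", 0) > 0:
        # A device with input channels can be opened as an ordinary input stream.
        return "input"
    if "WASAPI" in host_api_name and device_info.get("maxOutputChannels", 0) > 0:
        # An output-only device is recordable only through WASAPI loopback.
        return "loopback"
    return None


mic = {"maxInputChannels": 1, "maxOutputChannels": 0}
speakers = {"maxInputChannels": 0, "maxOutputChannels": 2}
print(pick_stream_mode(mic, "Windows WASAPI"))
print(pick_stream_mode(speakers, "Windows WASAPI"))
print(pick_stream_mode(speakers, "MME"))
```

With a loopback-capable PyAudio build, a `"loopback"` result would correspond to passing `as_loopback=True` when opening the stream.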

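With this solution the audio file is created successfully, but the recording itself is the problem: the captured sound is scrambled. The audio is audible, but there is intermittent noise, so it cannot be heard clearly.
Any solution, please?
Thanks for your answers :)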

```python
# Global imports
import os
import sys
import time
import wave
from threading import Thread

import keyboard
import numpy as np
import pyaudio  # as_loopback needs a PyAudio build with WASAPI loopback support
import pydub
import requests
from pydub.utils import mediainfo

# Log and information to use the UBIQUS API
"""payload = {'apiKey': 'b5eb9744c06e4a8a8028242d740f8abf', 'language': 'french'}
headers = {}"""

defaultframes = 512
CHUNK = 1024
FORMAT = pyaudio.paInt16
RATE = 48000
RECORD_SECONDS = 10
WAVE_OUTPUT_FILENAME = "tmp.wav"


class hooks:
    begin = '['
    end = ']'


recorded_frames = []
device_info = {}
useloopback = False
recordtime = 1

# Use module
p = pyaudio.PyAudio()

# Set default to first in list or ask Windows
try:
    default_device_index = p.get_default_input_device_info()["index"]
except IOError:
    default_device_index = -1

# Select device
print(hooks.begin + "Available devices:" + hooks.end + "\n")
for i in range(0, p.get_device_count()):
    info = p.get_device_info_by_index(i)
    is_wasapi = (p.get_host_api_info_by_index(info["hostApi"])["name"]).find("WASAPI") != -1
    is_casque = str(info["name"]).find("Casque") != -1
    is_speakers = str(info["name"]).find("Speakers") != -1
    if is_wasapi and (is_casque or is_speakers):
        print(hooks.begin + str(info["index"]) + hooks.end + ": \t %s \n \t %s \n" % (info["name"], p.get_host_api_info_by_index(info["hostApi"])["name"]))
        print("maxOutputChannels = ", info["maxOutputChannels"])
        print("maxInputChannels = ", info["maxInputChannels"])
        if default_device_index == -1:
            default_device_index = info["index"]

# Handle no devices available
if default_device_index == -1:
    print(hooks.begin + "No device available. Quitting." + hooks.end)
    exit()

# Get input or default
device_id = int(input("Choose device number " + hooks.begin + hooks.end + ": ") or default_device_index)
print("")

# Get device info
try:
    device_info = p.get_device_info_by_index(device_id)
except IOError:
    device_info = p.get_device_info_by_index(default_device_index)
    print(hooks.begin + "Selection not available, using default." + hooks.end)

# Choose between loopback or standard mode
is_input = device_info["maxInputChannels"] > 0
is_wasapi = (p.get_host_api_info_by_index(device_info["hostApi"])["name"]).find("WASAPI") != -1
if is_input:
    print(hooks.begin + "Selection is input using standard mode." + hooks.end)
else:
    if is_wasapi:
        useloopback = True
        print(hooks.begin + "Selection is output. Using loopback mode." + hooks.end)
    else:
        print(hooks.begin + "Selection is input and does not support loopback mode. Quitting." + hooks.end)
        exit()

# Stream using as_loopback to get sound from the OS
# Open stream
channelcount = device_info["maxInputChannels"] if (device_info["maxOutputChannels"] < device_info["maxInputChannels"]) else device_info["maxOutputChannels"]
print("Configuration effectuée.\nSaisir Ctrl+s+t pour débuter l'enregistrement.\n")

# Wait for the user to press Ctrl+S+T to start recording
keyboard.wait('ctrl+s+t')

stream = p.open(
    format=pyaudio.paInt16,
    channels=channelcount,
    rate=int(device_info["defaultSampleRate"]),
    input=True,
    frames_per_buffer=defaultframes,
    input_device_index=device_info["index"],
    as_loopback=useloopback)

# Stream using my microphone's input device
stream2 = p.open(
    format=FORMAT,
    channels=1,
    rate=RATE,
    input=True,
    frames_per_buffer=defaultframes,
    input_device_index=1,
    as_loopback=False)

frames = []
frames2 = []

# Start recording
print("* recording")
print("Start : %s" % time.ctime())
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    data2 = stream2.read(CHUNK)
    frames.append(data)
    frames2.append(data2)

# frames = as_loopback sound data (speakers)
frames = b''.join(frames)
# frames2 = sound data of the microphone
frames2 = b''.join(frames2)

# Decode the speaker data
Sdecoded = np.frombuffer(frames, 'int16')
# Decode the microphone data
Mdecoded = np.frombuffer(frames2, 'int16')
print("Mdecoded = ", Mdecoded, np.size(Mdecoded))

# Convert the speaker data into a NumPy vector (makes picking out audio channels easier)
Sdecoded = np.array(Sdecoded, dtype='int16')
print("Sdecoded =", Sdecoded, np.size(Sdecoded))

# Get the data on the right side
rightData = Sdecoded[1::2]
print("rightData = ", rightData, np.size(rightData))
# Get the data on the left side
leftData = Sdecoded[::2]
print("leftData = ", leftData, np.size(leftData))

# Mix everything to mono = right side + left side + microphone data, which is already mono
mix = (rightData + leftData + Mdecoded)
print("mix = ", mix, np.size(mix))

# Ensure no value goes beyond the limits of short int
signal = np.clip(mix, -32767, 32766)

# Encode the data again
encodecoded = wave.struct.pack("%dh" % (len(signal)), *list(signal))

# Stop all streams and terminate PyAudio
stream.stop_stream()
stream.close()
stream2.stop_stream()
stream2.close()
p.terminate()

# Write the mixed audio as mono
wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(1)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(encodecoded)
wf.close()
```
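One thing that may be worth checking in the mixing step (a guess at a contributing cause, not a confirmed diagnosis): NumPy adds `int16` arrays in `int16`, so any sum outside that range wraps around before `np.clip` ever sees it, and that wrap-around is audible as harsh noise. The sketch below widens to `int32` before adding and clips afterwards. The function name `mix_to_mono` is hypothetical. It assumes interleaved 16-bit speaker samples and mono 16-bit microphone samples at the same sample rate, and it truncates to the shorter of the two signals.

```python
import numpy as np


def mix_to_mono(speaker_bytes, mic_bytes, speaker_channels=2):
    """Mix interleaved int16 speaker audio with mono int16 mic audio (sketch)."""
    speaker = np.frombuffer(speaker_bytes, dtype=np.int16)
    mic = np.frombuffer(mic_bytes, dtype=np.int16)

    # De-interleave: one row per frame, one column per channel.
    usable = len(speaker) - len(speaker) % speaker_channels
    frames = speaker[:usable].reshape(-1, speaker_channels)

    # Widen to int32 before summing so the sums cannot wrap around.
    speaker_mono = frames.astype(np.int32).sum(axis=1)

    # The two streams may not deliver the same number of samples.
    n = min(len(speaker_mono), len(mic))
    mixed = speaker_mono[:n] + mic[:n].astype(np.int32)

    # Clip to the int16 range only after the wide addition, then narrow again.
    return np.clip(mixed, -32768, 32767).astype(np.int16).tobytes()


speaker = np.array([30000, 30000, -30000, -30000, 100, 200], dtype=np.int16)
mic = np.array([30000, -30000, -50], dtype=np.int16)
mixed = np.frombuffer(mix_to_mono(speaker.tobytes(), mic.tobytes()), dtype=np.int16)
print(mixed.tolist())  # [32767, -32768, 250]
```

The returned bytes can be passed straight to `writeframes`. If the loopback device's default sample rate differs from the microphone's rate, the two signals would also need resampling to a common rate before mixing, which this sketch does not attempt.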