Audio: Is there software that can automatically scramble/obscure part of an audio file?
Tags: audio, speech-recognition, scramble

This is the second question I have posted here, so please let me know if I have done anything wrong.

I am facing an interesting problem today. I work at a call center, and one of my company's clients, which verifies information, wants to collect bank account numbers from customers and wants our customer service agents (CSRs) to enter those bank account numbers into the client's external website.

These bank account numbers will not be saved anywhere in our local database, but the audio of the CSR collecting the bank account number will be saved in our system. The plain text will not be available, but the sound file will be. My question is whether there is a way to use a program to automatically scramble a certain part of a recording. I know this is a serious shot in the dark. Thanks.
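Although your question does not ask about a specific programming problem, I will try to answer it because I am working on something similar.

Can we use a program to automatically scramble a certain part of a recording?

We certainly can. It depends on how complex you want to make it.

There are sophisticated approaches, but from a very basic conceptual point of view, we need to take the recorded audio file and process it in the following stages: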
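1. Split the audio file into words: this requires detecting the silence between words.
2. Pass each word through a speech recognition system.
3. Decide on a scrambling method. Do you want to silence the word, jumble it, fill it with white noise, or encode it?
4. Compare each recognized word with the words to be scrambled and, if there is a match, scramble it using the chosen method.
5. Combine (concatenate) all the words in the correct order and store the result.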
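Stage 1 can be sketched as follows. This is a minimal, pure-Python illustration of silence-based splitting that works on a plain list of sample values; the function name `find_nonsilent_runs` and its parameters are hypothetical and chosen for illustration. pydub ships its own helpers for this (`pydub.silence.split_on_silence` and `detect_nonsilent`), which would normally be used on a real `AudioSegment`.

```python
def find_nonsilent_runs(samples, threshold, min_silence_len):
    """Return (start, end) index pairs of non-silent runs (end is exclusive).

    A sample counts as loud when its absolute value exceeds `threshold`.
    A quiet gap shorter than `min_silence_len` samples does not split a run.
    """
    runs = []
    start = None      # index where the current run began, or None
    last_loud = None  # index of the most recent loud sample
    for i, sample in enumerate(samples):
        if abs(sample) > threshold:
            if start is None:
                start = i
            last_loud = i
        elif start is not None and i - last_loud >= min_silence_len:
            # The quiet gap is long enough: close the current run.
            runs.append((start, last_loud + 1))
            start = None
    if start is not None:
        runs.append((start, last_loud + 1))
    return runs


samples = [0, 0, 5, 7, 6, 0, 0, 0, 0, 9, 8, 0, 4, 0, 0, 0]
print(find_nonsilent_runs(samples, threshold=1, min_silence_len=3))
# → [(2, 5), (9, 13)]
```

Each returned pair marks one candidate word, so word boundaries no longer depend on every word fitting into a fixed time window.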
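I have built a basic prototype that does all of the above except (4).
The program makes heavy use of pydub, which provides easier methods for audio manipulation. A tutorial for it can be found online.
What the program basically does: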
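1) I downloaded open-source, pre-recorded wav files of the numbers 0 to 10 and concatenated them using pydub. The program slices the given audio file into one-second segments. I spaced the words out so that each one fits within a one-second window. In real life this will not be the case.
2) It then passes each word to the speech recognition engine and prints the recognized word. As you will see, the word six is not recognized properly. For this you will need a robust speech recognition engine.
3) The program offers three different scrambling methods:
- a) Reverse the word
- b) Replace the word with white noise of the same length
- c) Replace the word with silence
4) It then selects three words, 9, 4, and 2, applies the scramble methods above, and replaces the corresponding word files.
5) It then concatenates all the words, including the scrambled ones, in the proper order and creates the output file.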
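The three scrambling methods in step 3 can be illustrated on raw sample values. This is a simplified sketch under stated assumptions: the function `scramble_samples`, its `seed` and `noise_peak` parameters, and the use of a plain Python list instead of a pydub `AudioSegment` are all hypothetical choices for illustration.

```python
import random


def scramble_samples(samples, method, seed=None, noise_peak=1000):
    """Return a scrambled copy of `samples` with the same length."""
    if method == "reverse":
        # Play the word backwards.
        return samples[::-1]
    if method == "whiteNoise":
        # Replace every sample with a uniformly random value.
        rng = random.Random(seed)
        return [rng.randint(-noise_peak, noise_peak) for _ in samples]
    if method == "silence":
        # Replace every sample with zero.
        return [0] * len(samples)
    raise ValueError("Unknown scramble method: %s" % method)


word = [10, -20, 30, -40]
print(scramble_samples(word, "reverse"))   # → [-40, 30, -20, 10]
print(scramble_samples(word, "silence"))   # → [0, 0, 0, 0]
print(len(scramble_samples(word, "whiteNoise")))  # → 4
```

One design point worth noting: reversal keeps all of the original information, so playing the result backwards recovers the word. Replacing the word with noise or silence discards it, which is probably the safer choice for something like a bank account number.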
Note: I did not have enough time to add the comparison between the words to be scrambled and the recognized words.
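The missing comparison step could look roughly like the sketch below. It assumes a variant of the recognition function that returns the recognized text (or `None` on failure) instead of printing it; the names `SENSITIVE_WORDS` and `slices_to_scramble` are hypothetical. Whether the recognizer reports digits as "2" or "two" is not something the prototype establishes, so the sketch checks for both.

```python
# Spoken digits plus their numeral forms; an assumed word list for illustration.
SENSITIVE_WORDS = {
    "zero", "one", "two", "three", "four",
    "five", "six", "seven", "eight", "nine",
} | set("0123456789")


def slices_to_scramble(recognized, sensitive=SENSITIVE_WORDS):
    """Return the slice indices whose recognized text contains a sensitive word.

    `recognized` maps a slice index to the recognized text, or to None when
    the recognizer could not understand that slice.
    """
    hits = []
    for index, text in sorted(recognized.items()):
        if text is None:
            continue
        if any(word in sensitive for word in text.lower().split()):
            hits.append(index)
    return hits


recognized = {0: "zero", 1: "hello", 2: "2", 3: None, 4: "Four please"}
print(slices_to_scramble(recognized))  # → [0, 2, 4]
```

The returned indices could then be passed one by one to the scrambling function. Slices that the recognizer fails to understand are skipped here, which means a misrecognized digit would stay audible; scrambling unrecognized slices as well would be the more cautious policy.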
Let me know if you have any questions.
**Demo code:**

""" Declarations """
import math
import os

import speech_recognition as sr
from pydub import AudioSegment
from pydub.generators import WhiteNoise

export_path = "./splitAudio/"  # folder for the slices and the final output


""" Function for Speech Recognition """
def processAudio(WAV_FILE):
    r = sr.Recognizer()
    # AudioFile is the current name for the deprecated WavFile
    with sr.AudioFile(WAV_FILE) as source:
        audio = r.record(source)  # read the entire WAV file
    # recognize speech using Google Speech Recognition
    try:
        print("recognizedWord=" + r.recognize_google(audio))
    except sr.UnknownValueError:
        print("Could not understand audio")
    except sr.RequestError as e:
        print("Could not request results from GSR; {0}".format(e))


""" Function to scramble word based upon choice """
def scramble_audio(aWord, option):
    scramble_file = os.path.join(export_path, "slice" + str(aWord) + ".wav")
    scramble_audioseg = AudioSegment.from_wav(scramble_file)
    aWord_length = len(scramble_audioseg)  # length of the word segment in millisec
    if option == "reverse":  # reverse the word
        scrambled_word = scramble_audioseg.reverse()
    elif option == "whiteNoise":  # replace the word with white noise of equal length
        scrambled_word = WhiteNoise().to_audio_segment(duration=aWord_length)
    elif option == "silence":  # replace the word with silence of equal length
        scrambled_word = AudioSegment.silent(duration=aWord_length)
    else:
        raise ValueError("Unknown scramble option: %s" % option)
    print("Scrambling and Exporting %s" % scramble_file)
    scrambled_word.export(scramble_file, format="wav")  # overwrite the slice file


if __name__ == "__main__":
    in_audio_file = "0-10.wav"
    out_audio_file = os.path.join(export_path, "scrambledAudio.wav")
    os.makedirs(export_path, exist_ok=True)  # make sure the slice folder exists

    # Read main audio file to be processed. Assuming in the same folder as this script
    sound = AudioSegment.from_wav(in_audio_file)
    slice_ms = 1000  # slice length in millisec (one word per second)
    audio_length = len(sound)  # total audio length in millisec
    # Number of slices, rounded up so a trailing partial second is kept
    total_segments = math.ceil(audio_length / slice_ms)

    # Iterate through slices every one second and export
    print("")
    for n in range(total_segments):
        print("Making slice from %d to %d (sec)" % (n, n + 1))
        temp_object = sound[n * slice_ms:(n + 1) * slice_ms]  # slicing is done in millisec
        myaudio_file = os.path.join(export_path, "slice" + str(n) + ".wav")
        temp_object.export(myaudio_file, format="wav")
        print("Trying to recognize %d " % n)
        processAudio(myaudio_file)

    # Scramble desired audio slices
    print("")
    scramble_audio(9, "reverse")
    scramble_audio(4, "whiteNoise")
    scramble_audio(2, "silence")

    # Combine modified audio
    final_audio = AudioSegment.empty()  # create empty AudioSegment
    print("")
    for i in range(total_segments):
        temp_audio_file = os.path.join(export_path, "slice" + str(i) + ".wav")
        temp_audio_seg = AudioSegment.from_wav(temp_audio_file)
        print("Combining %s" % temp_audio_file)
        final_audio = final_audio.append(temp_audio_seg, crossfade=0)
    print("Exporting final audio %s" % out_audio_file)
    final_audio.export(out_audio_file, format="wav")
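
If the time range to hide is already known (for example, because the agent marks when the account number is being read out), the recording can be redacted without cutting it into slice files first. The following is a minimal sketch using only Python's standard `wave` module rather than pydub; the function name `silence_range` and the file names in the usage note are hypothetical, and it assumes an uncompressed PCM WAV file.

```python
import wave


def silence_range(in_path, out_path, start_ms, end_ms):
    """Copy a PCM WAV file, replacing [start_ms, end_ms) with silence."""
    with wave.open(in_path, "rb") as src:
        params = src.getparams()
        frames = bytearray(src.readframes(params.nframes))

    frame_size = params.sampwidth * params.nchannels  # bytes per frame
    # Convert milliseconds to frame indices and clamp them to the file length.
    first = min(params.nframes, max(0, start_ms * params.framerate // 1000))
    last = min(params.nframes, max(first, end_ms * params.framerate // 1000))

    # 8-bit WAV PCM is unsigned, so its silence level is 128 rather than 0.
    fill = b"\x80" if params.sampwidth == 1 else b"\x00"
    frames[first * frame_size:last * frame_size] = fill * ((last - first) * frame_size)

    with wave.open(out_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(bytes(frames))
```

A call such as `silence_range("call.wav", "call_redacted.wav", 12000, 19500)` would blank out the span from 12.0 to 19.5 seconds and leave the rest of the recording untouched.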
Hi Justin, this site I