Is there software that can automatically scramble/obfuscate part of an audio file?

audio speech-recognition

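This is only the second question I have posted here, so please let me know if I have done something wrong.

I am facing an interesting problem today. I work at a call center, and one of my company's clients, which verifies information, wants to collect bank account numbers from customers and have our customer service representatives (CSRs) enter those bank account numbers into the client's external website.

The bank account numbers will not be stored anywhere in our local database, but the audio of the CSR collecting them will be saved in our system. The plain text will not be available, but the sound files will be. My question is: is there a way to use a program to automatically scramble a certain portion of a recording? I know this is a serious shot in the dark. Thank you.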

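Although your question does not ask about a specific programming problem, I will try to answer it because I am working on something similar.

Can we use a program to automatically scramble a certain portion of a recording?
We certainly can. It depends on how complex you want to make it.

There are sophisticated approaches, but at a very basic conceptual level we need to take the recorded audio file and process it in the following stages:

1) Split the audio file into words. This requires detecting the silence between words.
2) Pass each word through a speech recognition system.
3) Decide on a scrambling method. Do you want to silence the word, jumble it, fill it with white noise, or encode it?
4) Compare each recognized word with the words to be scrambled and, if there is a match, scramble it using the chosen method.
5) Combine (concatenate) all the words in the correct order and store the result.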
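Step 1 depends on finding the pauses between words. As a rough illustration only, the sketch below works on a plain list of amplitude values rather than on a real audio file; the function name `find_word_ranges` and the `threshold` and `min_silence` parameters are made up for this example. For actual recordings, pydub provides `pydub.silence.split_on_silence` and `pydub.silence.detect_nonsilent`, which do the same kind of job on `AudioSegment` objects.

```python
def find_word_ranges(samples, threshold=500, min_silence=3):
    """Return (start, end) index pairs of non-silent runs in `samples`.

    A sample counts as silent when abs(value) < threshold. A pause only
    separates two words when it lasts at least `min_silence` samples;
    shorter dips are treated as part of the word. `end` is exclusive.
    """
    ranges = []
    start = None      # index where the current word began, if any
    silent_run = 0    # length of the current run of silent samples
    for i, value in enumerate(samples):
        if abs(value) >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence:
                # The word ended where this silent run began
                ranges.append((start, i - silent_run + 1))
                start = None
                silent_run = 0
    if start is not None:
        # Audio ended while a word was still open
        ranges.append((start, len(samples) - silent_run))
    return ranges


# Two loud bursts separated by a long pause; the single quiet sample
# inside the first burst is too short to split it.
signal = [0, 0, 900, 800, 10, 700, 0, 0, 0, 0, 950, 900, 0]
print(find_word_ranges(signal))  # [(2, 6), (10, 12)]
```

With real speech the threshold and minimum pause length would need tuning, since people rarely leave clean gaps between digits.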

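I have put together a basic prototype that does all of the above except (4). The program relies heavily on pydub, which offers simple methods for manipulating audio. A tutorial on it can be found online.

The program basically does the following:

1) I downloaded open-source pre-recorded WAV files of the digits 0 to 10 and concatenated them using pydub.
The program cuts the given audio file into one-second slices. I had separated the words beforehand so that each one fits within a one-second window. In real life that will not be the case.

2) It then passes each word to the speech recognition engine and displays the recognized word. As you will see, the word six is not recognized correctly. For this you will need a robust speech recognition engine.
3) The program offers three different scrambling methods:

a) Reverse the word
b) Replace the word with white noise of the same length
c) Replace the word with silence

4) It then selects three words, 9, 4 and 2, applies the scrambling methods above, and replaces the corresponding word files.
5) Finally, it concatenates all the words, including the scrambled ones, in the proper order and creates the output file.
Note: I did not have enough time to add the comparison between the words to be scrambled and the recognized words.
Let me know if you have any questions.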
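The comparison that the prototype leaves out (step 4 of the conceptual list) might look roughly like the sketch below. It assumes the recognizer's output for each slice has already been collected into a list, with `None` for slices it could not understand; `normalize`, `slices_to_scramble` and the `DIGIT_WORDS` table are names invented for this illustration. A recognizer may return a digit either as "9" or as "nine", so both forms are normalized before comparing.

```python
# Spelled-out digits mapped to numerals so "nine" and "9" compare equal
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "ten": "10",
}


def normalize(word):
    """Lower-case a recognized word and map spelled-out digits to numerals."""
    word = word.strip().lower()
    return DIGIT_WORDS.get(word, word)


def slices_to_scramble(recognized, sensitive_words):
    """Return the indices of slices whose recognized word is sensitive."""
    targets = {normalize(w) for w in sensitive_words}
    return [
        index
        for index, word in enumerate(recognized)
        if word is not None and normalize(word) in targets
    ]


recognized = ["zero", "1", "two", None, "Four", "5"]
print(slices_to_scramble(recognized, ["2", "four", "9"]))  # [2, 4]
```

Each returned index could then be passed to `scramble_audio` from the demo code. Note that a slice the recognizer fails on (like "six" in the run described above) is skipped here, so for something as sensitive as account numbers it may be safer to scramble unrecognized slices as well.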
****Demo code:****
""" Declarations """ 
import speech_recognition as sr
from pydub import AudioSegment
from pydub.generators import WhiteNoise



""" Function for Speech Recognition """ 
def processAudio(WAV_FILE):
    r = sr.Recognizer()
    # sr.AudioFile is the current name of the older sr.WavFile class
    with sr.AudioFile(WAV_FILE) as source:
        audio = r.record(source)  # read the entire WAV file

    # Recognize speech using Google Speech Recognition (needs network access)
    try:
        print("recognizedWord=" + r.recognize_google(audio))
    except sr.UnknownValueError:
        print("Could not understand audio")
    except sr.RequestError as e:
        print("Could not request results from GSR; {0}".format(e))

""" Function to scramble word based upon choice """ 
def scramble_audio(aWord, option):
    # export_path is the module-level variable set in the __main__ block
    scramble_file = export_path + "slice" + str(aWord) + ".wav"
    scramble_audioseg = AudioSegment.from_wav(scramble_file)
    aWord_length = len(scramble_audioseg)  # Length of the word segment in millisec

    if option == "reverse":  # Reverse the word (trivially undone by reversing again)
        scrambled_word = scramble_audioseg.reverse()

    elif option == "whiteNoise":  # Replace the word with white noise of equal length
        wn = WhiteNoise()
        scrambled_word = wn.to_audio_segment(duration=aWord_length)

    elif option == "silence":  # Replace the word with silence of equal length
        scrambled_word = AudioSegment.silent(duration=aWord_length)

    else:
        raise ValueError("Unknown scramble option: %s" % option)

    print("Scrambling and Exporting %s" % scramble_file)
    scrambled_word.export(scramble_file, format="wav")  # Overwrite the slice file


if __name__ == "__main__":

    export_path = "./splitAudio/"  # This folder must already exist
    in_audio_file = "0-10.wav"
    out_audio_file = export_path + "scrambledAudio.wav"

    # Read main audio file to be processed. Assuming in the same folder as this script
    sound = AudioSegment.from_wav(in_audio_file)

    slice_ms = 1000  # Slice length in millisec (one word per second)

    audio_length = len(sound)  # Total audio length in millisec

    # Number of slices, rounded up so that a trailing partial slice is kept
    total_segments = (audio_length + slice_ms - 1) // slice_ms

    # Iterate through slices every one second, export and recognize each
    print("")
    for n in range(total_segments):
        print("Making slice from %d to %d (sec)" % (n, n + 1))
        temp_object = sound[n * slice_ms:(n + 1) * slice_ms]  # Slicing is done in millisec
        myaudio_file = export_path + "slice" + str(n) + ".wav"
        temp_object.export(myaudio_file, format="wav")
        print("Trying to recognize %d" % n)
        processAudio(myaudio_file)

    # Scramble desired audio slices, one method each
    print("")
    scramble_audio(9, "reverse")
    scramble_audio(4, "whiteNoise")
    scramble_audio(2, "silence")

    # Combine modified audio
    final_audio = AudioSegment.empty()  # Create empty AudioSegment
    print("")
    for i in range(total_segments):
        temp_audio_file = export_path + "slice" + str(i) + ".wav"
        temp_audio_seg = AudioSegment.from_wav(temp_audio_file)
        print("Combining %s" % temp_audio_file)
        final_audio = final_audio.append(temp_audio_seg, crossfade=0)

    print("Exporting final audio %s" % out_audio_file)
    final_audio.export(out_audio_file, format="wav")
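
The demo writes one file per slice and then stitches them back together. When the start and end times of the sensitive part are already known, the same "replace with silence" idea can be applied directly to the sample data. The following is only a minimal sketch using Python's standard `wave` module instead of pydub; `silence_range` is a hypothetical helper, and it assumes an uncompressed PCM WAV with a sample width of at least 16 bits (in 8-bit WAV files silence is 128 rather than 0).

```python
import io
import wave


def silence_range(wav_bytes, start_ms, end_ms):
    """Return a copy of a PCM WAV (as bytes) with [start_ms, end_ms) zeroed."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
        params = reader.getparams()
        frames = bytearray(reader.readframes(params.nframes))

    frame_size = params.sampwidth * params.nchannels  # bytes per frame
    # Convert millisec to frame indices, clamped to the file length
    first = min(start_ms * params.framerate // 1000, params.nframes)
    last = min(end_ms * params.framerate // 1000, params.nframes)
    if last > first:
        # Overwrite the selected frames with zero bytes (digital silence)
        frames[first * frame_size:last * frame_size] = bytes((last - first) * frame_size)

    out = io.BytesIO()
    with wave.open(out, "wb") as writer:
        writer.setparams(params)
        writer.writeframes(bytes(frames))
    return out.getvalue()


# Build a one-second, 16-bit mono test file filled with non-zero samples
src = io.BytesIO()
with wave.open(src, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(8000)
    w.writeframes(b"\x01\x02" * 8000)

redacted = silence_range(src.getvalue(), 250, 500)
with wave.open(io.BytesIO(redacted), "rb") as r:
    data = r.readframes(r.getnframes())
print(len(data), data[4000:8000] == bytes(4000))  # 16000 True
```

Overwriting the samples, unlike reversing them, leaves nothing in the stored file from which the spoken digits could be recovered.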


Hi Justin, this site I