Python: audio data from string format to numpy array

python string numpy audio

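I am trying to convert the sample rate (from 44100 to 22050) of a numpy.array that holds 88200 samples, on which I have already done some processing (such as adding silence and converting to mono). I tried to convert this array with audioop.ratecv, but it returns a str instead of a numpy array, and when I wrote that data with scipy.io.wavfile.write, half of the data was lost and the audio played twice as fast (instead of slower, which at least would have made some sense).

audioop.ratecv works fine with str data such as the data read via wave.open, but I don't know how to process that, so I tried converting the numpy array to a str with numpy.array2string(data) to pass it to ratecv and get correct results, and then converting back to numpy with numpy.fromstring(data, dtype). Now the data is only 8 samples long. I think this is caused by a format mismatch, but I don't know how to control it. I also haven't figured out which format the str returned via wave.open uses, so that I could force the same format here.

Here is the relevant part of my code: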

def conv_sr(data, srold, fixSR, dType, chan = 1): 
    state = None
    width = 2 # numpy.int16
    print "data shape", data.shape, type(data[0]) # returns shape 88200, type int16
    fragments = numpy.array2string(data)
    print "new fragments len", len(fragments), "type", type(fragments) # return len 30 type str
    fragments_new, state = audioop.ratecv(fragments, width, chan, srold, fixSR, state)
    print "fragments", len(fragments_new), type(fragments_new[0]) # returns 16, type str
    data_to_return = numpy.fromstring(fragments_new, dtype=dType)
    return data_to_return
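The `len 30` reported above is consistent with numpy.array2string producing a printable summary of the array rather than its raw sample bytes. A minimal sketch of the difference, assuming a stand-in int16 array of 88200 samples (the values are made up for illustration):

```python
import numpy

# Stand-in for the 88200 processed samples (values are arbitrary).
samples = (numpy.arange(88200) % 1000).astype(numpy.int16)

# array2string returns human-readable text; large arrays are abbreviated with "...".
text = numpy.array2string(samples)

# tobytes returns the raw sample bytes: 2 bytes per int16 sample.
raw = samples.tobytes()

# frombuffer reads those bytes back, provided the same dtype is given.
restored = numpy.frombuffer(raw, dtype=numpy.int16)

print(len(text), len(raw), len(restored))
```

Only the raw bytes are the kind of input audioop.ratecv works on; the text form throws away almost all of the samples.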

This is how I call it:

data1 = numpy.array(data1, dtype=dType)
data_to_copy = numpy.append(data1, data2)
data_to_copy = data_to_copy.sum(axis = 1) / chan
data_to_copy = data_to_copy.flatten() # because its mono

data_to_copy = conv_sr(data_to_copy, sr, fixSR, dType) #sr = 44100, fixSR = 22050

scipy.io.wavfile.write(filename, fixSR, data_to_copy)

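After more research I found my mistake: 16-bit audio appears to be made up of two 8-bit "cells", so the dtype I was using was wrong, and that is why I had the audio speed problem. I found the correct dtype. So, in the conv_sr def, I pass in a numpy array, convert it to a data string, pass that to the sample rate conversion, convert the result back to a numpy array for scipy.io.wavfile.write, and finally convert the pairs of 8-bit values to 16-bit format.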
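To illustrate the point about 16-bit samples being two 8-bit units: the same byte string yields a different number of samples depending on the dtype used to read it. This is a small sketch with made-up values, not the exact data from the question:

```python
import numpy

# Four 16-bit samples occupy eight bytes.
samples = numpy.array([1, -2, 300, -400], dtype=numpy.int16)
raw = samples.tobytes()

# Reading the same bytes with different dtypes changes the sample count.
as_int8 = numpy.frombuffer(raw, dtype=numpy.int8)    # twice as many values
as_int16 = numpy.frombuffer(raw, dtype=numpy.int16)  # the original samples
as_int32 = numpy.frombuffer(raw, dtype=numpy.int32)  # half as many values

print(len(as_int8), len(as_int16), len(as_int32))
```

A dtype that is wider than the real sample width halves the number of samples, which would fit the symptom of lost data and audio playing too fast.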

def widthFinder(dType):
    # Sample width in bytes, e.g. 2 for numpy.int16.
    # numpy.dtype() accepts both type objects and names such as "int16",
    # so no parsing of the dtype's string form is needed.
    return numpy.dtype(dType).itemsize

def conv_sr(data, srold, fixSR, dType, chan=1):
    state = None
    # audioop needs the sample width in bytes as an int (1, 2 or 4)
    width = int(widthFinder(dType))
    if width not in (1, 2, 4):
        width = 2
    # raw sample bytes of the array
    fragments = data.tobytes()
    fragments_new, state = audioop.ratecv(fragments, width, chan, srold, fixSR, state)
    # each 16-bit sample is two 8-bit halves, exposed here as fields x and y
    fragments_dtype = numpy.dtype((numpy.int16, {'x':(numpy.int8,0), 'y':(numpy.int8,1)}))
    # frombuffer replaces fromstring, whose binary mode is deprecated in NumPy
    data_to_return = numpy.frombuffer(fragments_new, dtype=fragments_dtype)
    data_to_return = data_to_return.astype(dType)
    return data_to_return
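Since audioop was deprecated in Python 3.11 and removed in Python 3.13, an alternative that never leaves numpy may be useful. The sketch below swaps in scipy.signal.resample_poly, which is a different technique from audioop.ratecv; the helper name resample_int16 and the test tone are hypothetical and used only for illustration:

```python
from math import gcd

import numpy
from scipy.signal import resample_poly


def resample_int16(data, srold, fixSR):
    """Resample a mono int16 array from srold Hz to fixSR Hz (hypothetical helper)."""
    g = gcd(srold, fixSR)
    up, down = fixSR // g, srold // g
    # Work in float64 so the filter output is not truncated along the way.
    resampled = resample_poly(data.astype(numpy.float64), up, down)
    # Round and clip back into the int16 range before converting.
    info = numpy.iinfo(numpy.int16)
    return numpy.clip(numpy.round(resampled), info.min, info.max).astype(numpy.int16)


# Two seconds of a 440 Hz tone at 44100 Hz, as a stand-in for the real data.
t = numpy.arange(88200) / 44100.0
tone = (10000 * numpy.sin(2 * numpy.pi * 440 * t)).astype(numpy.int16)

half_rate = resample_int16(tone, 44100, 22050)
print(len(tone), len(half_rate), half_rate.dtype)
```

The result can be passed directly to scipy.io.wavfile.write(filename, fixSR, half_rate), with no conversion to or from a byte string.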

If you find any mistakes, feel free to correct me; I am still a learner.