Python ParameterError: Mono data must have shape (samples,). Received shape=(1, 87488721)

I am currently working on speaker diarization in Python, using pyannote for the embeddings. My embedding function looks like this:

import numpy as np
import torch
import librosa
from pyannote.core import Segment

def embeddings_(audio_path,resegmented,range):
  model_emb = torch.hub.load('pyannote/pyannote-audio', 'emb')
  
  embedding = model_emb({'audio': audio_path})
  for window, emb in embedding:
    assert isinstance(window, Segment)
    assert isinstance(emb, np.ndarray)

  y, sr = librosa.load(audio_path)
  myDict={}
  myDict['audio'] = audio_path
  myDict['duration'] = len(y)/sr

  data=[]
  for i in resegmented:
    excerpt = Segment(start=i[0], end=i[0]+range)
    emb = model_emb.crop(myDict,excerpt)
    data.append(emb.T)
  data= np.asarray(data)
  
  return data.reshape(len(data),512)

When I run

embeddings = embeddings_(audiofile,resegmented,2)

I get this error:

ParameterError: Mono data must have shape (samples,). Received shape=(1, 87488721)

I ran into the same error, but I found a workaround. In my case, the error was triggered in "pyannote/audio/features/utils.py" when it tried to resample the audio with this line:

y = librosa.core.resample(y.T, sample_rate, self.sample_rate).T
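
To see where the shape in the error message comes from, here is a minimal NumPy-only sketch. It assumes the waveform is stored as (samples, channels), which is what the `axis=1` mean in `get_features` suggests; the array size is made up for illustration.

```python
import numpy as np

# Hypothetical stereo waveform stored as (samples, channels).
y = np.zeros((1000, 2), dtype=np.float32)

# Mono conversion with keepdims=True keeps the channel axis.
mono = np.mean(y, axis=1, keepdims=True)
print(mono.shape)    # (1000, 1)

# Transposing, as the resample line does, gives (1, samples):
# a 2-D array rather than the 1-D (samples,) mono layout.
print(mono.T.shape)  # (1, 1000)
```

The (1, samples) shape has the same layout as the `Received shape=(1, 87488721)` in the error message.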

Here is my workaround:

    def get_features(self, y, sample_rate):

        # convert to mono
        if self.mono:
            y = np.mean(y, axis=1, keepdims=True)
            y = np.squeeze(y)    # Add this line
        
        # resample if sample rates mismatch
        if (self.sample_rate is not None) and (self.sample_rate != sample_rate):
            y = librosa.core.resample(y.T, sample_rate, self.sample_rate).T
            sample_rate = self.sample_rate

        # augment data
        if self.augmentation is not None:
            y = self.augmentation(y, sample_rate)

        # TODO: how time consuming is this thing (needs profiling...)
        if len(y.shape) == 1:     # Add this line
            y = y[:,np.newaxis]   # Add this line
            
        try:
            valid = valid_audio(y[:, 0], mono=True)
        except ParameterError as e:
            msg = f"Something went wrong when augmenting waveform."
            raise ValueError(msg)

        return y

Use np.squeeze on y so that it has shape (samples,) for librosa.core.resample, then use y[:, np.newaxis] to change its shape back to (samples, 1) for valid = valid_audio(y[:, 0], mono=True).
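
The two shape changes can be tried in isolation, without pyannote or librosa. The sketch below is only an illustration: the helper names `to_mono_1d` and `to_column` are hypothetical and not part of either library, and the input is assumed to be a (samples, channels) array.

```python
import numpy as np

def to_mono_1d(y):
    """Average the channels of a (samples, channels) array into shape (samples,)."""
    # axis=1 squeezes only the channel axis, so a one-sample clip keeps its length.
    return np.squeeze(np.mean(y, axis=1, keepdims=True), axis=1)

def to_column(y):
    """Return y with shape (samples, 1) if it is 1-D, otherwise unchanged."""
    return y[:, np.newaxis] if y.ndim == 1 else y

y = np.ones((8, 2))      # made-up stereo signal with 8 samples
flat = to_mono_1d(y)     # the shape the resampling step needs
col = to_column(flat)    # the shape the y[:, 0] indexing needs
print(flat.shape, col.shape)  # (8,) (8, 1)
```

Passing `axis=1` to `np.squeeze` is a small deviation from the workaround above, which calls `np.squeeze(y)` with no axis; for ordinary audio the two give the same result.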