Neural network: expected lstm_1 to have shape (20, 256) but got array with shape (1, 76)
I am building a neural network for speaker identification and I am having problems with the dimensions. I must have made a mistake in the batch generator, but I cannot see what. My steps are as follows. First, I prepare the labels:
labels = []
with open('filtered_files.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    for file in reader:
        label = file[0]
        if label not in labels:
            labels.append(label)
print(labels)
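As an aside, the label collection above can be done in O(1) per lookup with a helper set while keeping first-appearance order. A minimal sketch using an in-memory CSV (the two-column layout, label then file name, is assumed from the snippets in this question; the contents are made up):

```python
import csv
import io

# Stand-in for filtered_files.csv: column 0 = speaker label, column 1 = file name
# (layout assumed from the question; the rows here are invented for illustration).
csv_text = "alice,rec_001.mp3\nbob,rec_002.mp3\nalice,rec_003.mp3\n"

labels = []
seen = set()
for row in csv.reader(io.StringIO(csv_text)):
    label = row[0]
    if label not in seen:  # set membership test, unlike `label not in labels`
        seen.add(label)
        labels.append(label)

print(labels)  # first-appearance order is preserved: ['alice', 'bob']
```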
Then I declare the batch_generator:
n_features = 20
max_length = 1000
n_classes = len(labels)

def batch_generator(data, batch_size=16):
    while 1:
        random.shuffle(data)
        X, y = [], []
        for i in range(batch_size):
            print(i)
            wav = data[i]
            waves, sr = librosa.load(wav, mono=True)
            print(waves)
            filename = wav.split('\\')[1]
            filename = filename.split('.')[0] + ".mp3"
            filename = filename.split('_', 1)[1]
            print(filename)
            with open('filtered_files.csv', 'r') as csvfile:
                reader = csv.reader(csvfile)
                for file in reader:
                    if filename == file[1]:
                        print(file[0])
                        label = file[0]
                        break
                    else:
                        continue
            y.append(one_hot_encode(["'" + label + "'"]))
            mfcc = librosa.feature.mfcc(waves, sr)
            mfcc = np.pad(mfcc, ((0, 0), (0, max_length - len(mfcc[0]))), mode='constant', constant_values=0)
            X.append(np.array(mfcc))
        yield np.array(X), np.array(y)
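As a sanity check on the feature side: librosa's MFCC matrix has shape (n_mfcc, frames), and the np.pad call in the generator right-pads the frame axis with zeros up to max_length, so every element appended to X has shape (20, 1000). A small numpy sketch of just the padding step (no audio needed; the 431-frame count is an arbitrary example):

```python
import numpy as np

n_features, max_length = 20, 1000  # values from the question

# Stand-in for librosa.feature.mfcc output: 20 coefficients over 431 frames
# (431 is an invented frame count for illustration)
mfcc = np.random.randn(n_features, 431)

# Right-pad the time axis with zeros, exactly as in the generator
mfcc = np.pad(mfcc, ((0, 0), (0, max_length - mfcc.shape[1])),
              mode='constant', constant_values=0)

print(mfcc.shape)  # (20, 1000) -- matches input_shape = (n_features, max_length)
```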
Finally, I have the neural network declaration, and I start the training process:
learning_rate = 0.001
batch_size = 64
n_epochs = 50
dropout = 0.5
input_shape = (n_features, max_length)
steps_per_epoch = 50

model = Sequential()
model.add(LSTM(256, return_sequences=True, input_shape=input_shape,
               dropout=dropout))
# model.add(Flatten())
# model.add(Dense(128, activation='relu'))
# model.add(Dropout(dropout))
# model.add(Dense(n_classes, activation='softmax'))
opt = Adam(lr=learning_rate)
model.compile(loss='categorical_crossentropy', optimizer=opt,
              metrics=['accuracy'])
model.summary()

history = model.fit_generator(
    generator=batch_generator(X_train, batch_size),
    steps_per_epoch=steps_per_epoch,
    epochs=n_epochs,
    verbose=1,
    validation_data=batch_generator(X_val, 32),
    validation_steps=5,
    callbacks=callbacks
)
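For orientation before the errors: Keras reads input_shape=(n_features, max_length) as (timesteps, features), i.e. 20 timesteps of 1000 features, and with return_sequences=True the LSTM (the model's last layer, since the rest is commented out) outputs its 256 units at every timestep. The target it therefore expects has shape (20, 256), which is where the numbers in the first error come from. A tiny sketch of that shape arithmetic:

```python
import numpy as np

n_features, max_length, units = 20, 1000, 256  # values from the question

# Keras interprets input_shape=(20, 1000) as 20 timesteps of 1000 features each
x = np.zeros((1, n_features, max_length))  # one batch element

# With return_sequences=True the LSTM emits `units` values per timestep, so the
# output of the final layer -- and hence the expected target -- has this shape:
expected_target_shape = (x.shape[1], units)

print(expected_target_shape)  # (20, 256), exactly the shape in the error message
```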
I have posted a lot of code because I am not sure exactly which part causes the wrong dimensions. With only the first layer, I get the following error:

Error when checking target: expected lstm_1 to have shape (20, 256) but got array with shape (1, 76)

If I uncomment the second layer, I instead receive:

Error when checking target: expected flatten_1 to have 2 dimensions, but got array with shape (64, 1, 76)

There is a shape mismatch between the model's input shape and the dataset's shape. As the error indicates, the dataset has shape (1, 76) while the model expects (20, 256) (input_shape = (n_features, max_length)). To fix this, either change the model's input shape to match the dataset, or process the dataset to match the model's input shape:
input_shape = (20, 256)
model = Sequential()
model.add(LSTM(256, return_sequences=True, input_shape=input_shape,
               dropout=dropout))
model.add(Flatten())
# model.add(Dense(128, activation='relu'))
# model.add(Dropout(dropout))
model.add(Dense(2, activation='softmax'))
opt = Adam(lr=learning_rate)
model.compile(loss='categorical_crossentropy', optimizer=opt,
              metrics=['accuracy'])
model.summary()
model.fit(tf.ones([1, 20, 256]), tf.one_hot([0], 2))  # one training example
The problem is a mismatch between the dataset's dimensions and the model's input shape.
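Another reading of the errors, for what it is worth: the (64, 1, 76) shape suggests each one_hot_encode call in the generator returns a (1, 76) array, so stacking one per sample gives the targets an extra singleton axis. Squeezing that axis out in the generator's yield (and then either un-commenting the Flatten/Dense head or setting return_sequences=False) would give targets of shape (64, 76). A minimal numpy sketch of just that reshaping, with the 76-class count taken from the error message:

```python
import numpy as np

batch_size, n_classes = 64, 76  # numbers taken from the error messages

# Targets as the question's generator builds them: one (1, 76) array per sample
y = np.stack([np.eye(n_classes)[[i % n_classes]] for i in range(batch_size)])
print(y.shape)  # (64, 1, 76) -- the shape reported in the second error

# Drop the singleton axis so the targets fit a Dense(n_classes) softmax head
y = np.squeeze(y, axis=1)
print(y.shape)  # (64, 76)
```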