Keras: loss is NaN when using an LSTM with k-fold cross-validation?
I am trying to run k-fold cross-validation on my LSTM neural network. The code runs without raising an error, but as soon as the network starts training on the training data, the reported loss is nan. I did not have this problem before I implemented k-fold cross-validation. Here is my k-fold CV loop:
from sklearn.model_selection import KFold

# Define the k-fold cross validator
k_fold = KFold(n_splits = 5, shuffle = True)
cv_scores_accuracy = []
cv_scores_loss = []
# K-fold cross validation model evaluation
fold_no = 1
for train, test in k_fold.split(training_X, training_y_encoded):
    # From previous runs, we have found that dropout_rate = 0.1, l1 = 2**-6, and l2 = 2**-8 gave optimal results.
    # A fresh model is built for every fold so no weights carry over between folds.
    model = define_LSTM_model(training_X, training_y_encoded, 0.1, 2**-6, 2**-8)
    model.compile(loss = 'categorical_crossentropy', optimizer = 'adam',
                  metrics = ['accuracy'])
    print('--------------------------------------------------------------------------------------------------------')
    print(f'Training for fold {fold_no}...')
    # Fit the model on this fold's training indices
    history = model.fit(training_X[train], training_y_encoded[train], epochs = 15, batch_size = 128,
                        class_weight = label_weights, verbose = 1)
    # Evaluate the model on this fold's held-out indices, generate metrics
    scores = model.evaluate(training_X[test], training_y_encoded[test], batch_size = 128, verbose = 0)
    print(f'Score for fold {fold_no}: {model.metrics_names[0]} of {scores[0]}; {model.metrics_names[1]} of {scores[1]*100}%;')
    cv_scores_accuracy.append(scores[1] * 100)
    cv_scores_loss.append(scores[0])
    # Increment the fold number
    fold_no += 1
Here is the definition of my LSTM model:
from keras import regularizers
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

def define_LSTM_model(training_X, training_y, dropout_rate, l1_value, l2_value):
    # Infer the input and output dimensions from the data
    n_timesteps, n_features, n_outputs = training_X.shape[1], training_X.shape[2], training_y.shape[1]
    model = Sequential()
    model.add(LSTM(units = 75, kernel_regularizer = regularizers.l1_l2(l1 = l1_value, l2 = l2_value), input_shape = (n_timesteps, n_features)))
    model.add(Dropout(rate = dropout_rate))
    model.add(Dense(units = 75, activation = 'tanh'))
    model.add(Dense(units = n_outputs, activation = 'softmax'))
    return model
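I have looked at other suggestions, such as increasing the batch size from 64 to 128, and I believe the layers in my network are set up correctly. However, every time I run this code, I get the following: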
Training for fold 1...
Epoch 1/15
6400/267634 [..............................] - ETA: 4:52 - loss: nan - acc: 0.7489
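Since the loss is already nan within the first batches of fold 1, one thing that may be worth ruling out is the data and class weights passed to `model.fit`. The following is only a minimal diagnostic sketch, not a confirmed cause or fix. `check_fold_inputs` is a hypothetical helper, and it assumes that the labels are one-hot encoded (as `categorical_crossentropy` expects) and that `label_weights` is a dict mapping integer class indices to float weights, which the code above does not show.

```python
import numpy as np

def check_fold_inputs(X, y_onehot, class_weight):
    """Hypothetical helper: basic sanity checks on one fold's inputs.

    Assumes y_onehot is one-hot encoded and class_weight is a dict
    {class_index (int): weight (float)}; both are assumptions.
    """
    X = np.asarray(X, dtype=float)
    y_onehot = np.asarray(y_onehot, dtype=float)

    # Class indices that actually occur in this fold's labels.
    present = np.unique(np.argmax(y_onehot, axis=1))

    return {
        # Number of NaN or +/-inf entries in the features and the labels.
        "nonfinite_in_X": int(np.count_nonzero(~np.isfinite(X))),
        "nonfinite_in_y": int(np.count_nonzero(~np.isfinite(y_onehot))),
        # Classes present in the fold that have no entry in class_weight.
        "classes_without_weight": [int(c) for c in present
                                   if int(c) not in class_weight],
        # Classes whose weight is NaN or infinite.
        "nonfinite_weight_classes": sorted(int(k) for k, v in class_weight.items()
                                           if not np.isfinite(v)),
    }
```

Inside the loop it could be called as `print(check_fold_inputs(training_X[train], training_y_encoded[train], label_weights))` before `model.fit`. If every count is zero and both lists are empty, the inputs are probably not the source of the nan and the model or optimizer settings would be the next place to look.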