Keras: loss is NaN when using an LSTM with k-fold cross-validation?

keras neural-network

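I am trying to run k-fold cross-validation on my LSTM neural network. The code runs, but as soon as the network starts working through the training data, the loss I get is nan. I did not have this problem before I implemented k-fold cross-validation. Here is my k-fold CV loop: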

# Import assumed; the original snippet did not show it
from sklearn.model_selection import KFold

# Define the k-fold cross validator
k_fold = KFold(n_splits = 5, shuffle = True)
cv_scores_accuracy = []
cv_scores_loss = []
# K-fold cross validation  model evaluation
fold_no = 1
for train, test in k_fold.split(training_X, training_y_encoded):
    # From previous runs, we have found that dropout_rate = 0.1, l1 = 2**-6, and l2 = 2**-8 gave optimal results.
    model = define_LSTM_model(training_X, training_y_encoded, 0.1, 2**-6, 2**-8)
    model.compile(loss = 'categorical_crossentropy', optimizer = 'adam', 
                  metrics = ['accuracy'])
    print('--------------------------------------------------------------------------------------------------------')
    print(f'Training for fold {fold_no}...')
    # Fit the model
    history = model.fit(training_X[train], training_y_encoded[train], epochs = 15, batch_size = 128, 
                        class_weight = label_weights, verbose = 1)
    # Evaluate the model, generate metrics
    scores = model.evaluate(training_X[test], training_y_encoded[test], batch_size = 128, verbose = 0)
    print(f'Score for fold {fold_no}: {model.metrics_names[0]} of {scores[0]}; {model.metrics_names[1]} of {scores[1]*100}%;')
    cv_scores_accuracy.append(scores[1] * 100)
    cv_scores_loss.append(scores[0])
    # Increment the fold number
    fold_no += 1

Below is the definition of my LSTM model:

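# Imports assumed (standalone Keras; swap in tensorflow.keras if that is what you use)
from keras import regularizers
from keras.layers import LSTM, Dense, Dropout
from keras.models import Sequential

def define_LSTM_model(training_X, training_y, dropout_rate, l1_value, l2_value):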
    n_timesteps, n_features, n_outputs = training_X.shape[1], training_X.shape[2], training_y.shape[1]
    model = Sequential()
    model.add(LSTM(units = 75, kernel_regularizer = regularizers.l1_l2(l1 = l1_value, l2 = l2_value), input_shape = (n_timesteps, n_features)))
    model.add(Dropout(rate = dropout_rate))
    model.add(Dense(units = 75, activation = 'tanh'))
    model.add(Dense(units = n_outputs, activation = 'softmax'))
    return model

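I have looked at other suggestions, such as increasing the batch size from 64 to 128, and I believe the layers in my network are set up correctly. However, every time I run this code, I get the following: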

Training for fold 1...
Epoch 1/15
  6400/267634 [..............................] - ETA: 4:52 - loss: nan - acc: 0.7489
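
The post does not say what causes the nan, so the following is only a possible starting point for narrowing it down, not a confirmed diagnosis. One common source of a nan loss is non-finite values (NaN or inf) in the input array. Since the loop also passes class_weight, it may be worth confirming that every class in the labels has an entry in that dictionary. The sketch below uses only NumPy. The helper names count_non_finite and missing_class_weights are hypothetical, and the demo arrays are toy stand-ins for training_X, training_y_encoded and label_weights, whose real contents are not shown in the question.

```python
import numpy as np

def count_non_finite(array):
    """Return the number of NaN or +/-inf entries in a numeric array."""
    array = np.asarray(array, dtype=np.float64)
    return int(array.size - np.count_nonzero(np.isfinite(array)))

def missing_class_weights(y_one_hot, class_weight):
    """Return class indices that occur in one-hot labels but have no weight."""
    present = np.unique(np.argmax(np.asarray(y_one_hot), axis=1))
    return [int(c) for c in present if int(c) not in class_weight]

# Toy stand-ins (assumptions, not the asker's data).
demo_X = np.zeros((4, 3, 2))          # (samples, timesteps, features)
demo_X[1, 0, 0] = np.nan              # one NaN entry
demo_X[2, 1, 1] = np.inf              # one inf entry
demo_y = np.eye(3)[[0, 1, 2, 1]]      # one-hot labels for classes 0, 1, 2, 1
demo_weights = {0: 1.0, 1: 2.5}       # class 2 deliberately left out

print("non-finite inputs:", count_non_finite(demo_X))        # 2
print("classes without a weight:",
      missing_class_weights(demo_y, demo_weights))            # [2]
```

With the real data, the same calls would be count_non_finite(training_X) and missing_class_weights(training_y_encoded, label_weights). They can also be applied per fold via training_X[train] and training_X[test] to see whether a particular split is affected.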