Keras: Why doesn't the LSTM predict low and high values in regression? [keras, regression, lstm, loss-function, activation-function]

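To predict a continuous value between 0 and 2, I built a stack of bidirectional LSTM layers followed by four Dense/Dropout layers with the swish activation function. The model is compiled with the mean squared error loss and the Adam optimizer, and trained with a batch size of 32.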

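The problem is that the network does not predict the extreme values (low and high) well: somehow the smallest prediction is 0.16 and the largest is 1.85. See the plot below (the point at coordinates (2, 2) is not part of the LSTM predictions):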

I would like to improve the predictions in the tails, i.e. to get predictions below 0.16 and above 1.85.
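One way to quantify the compression described above is to compare the range and spread of the predictions with those of the labels. The following is a minimal sketch in plain Python; `range_summary` is a hypothetical helper, and the numbers are illustrative values chosen to match the 0.16 and 1.85 limits mentioned in the question, not the actual data.

```python
import statistics

def range_summary(y_true, y_pred):
    """Compare the spread of the predictions with the spread of the labels."""
    return {
        "label_min": min(y_true),
        "label_max": max(y_true),
        "pred_min": min(y_pred),
        "pred_max": max(y_pred),
        # A ratio below 1 means the predictions are squeezed toward the mean.
        "std_ratio": statistics.pstdev(y_pred) / statistics.pstdev(y_true),
    }

# Illustrative values only (not the asker's data): labels span 0..2,
# predictions stop at 0.16 and 1.85 as described in the question.
y_true = [0.0, 0.5, 1.0, 1.5, 2.0]
y_pred = [0.16, 0.6, 1.0, 1.4, 1.85]

summary = range_summary(y_true, y_pred)
print(summary)
```

A `std_ratio` clearly below 1 on the validation set would indicate that the model pulls its outputs toward the mean of the labels rather than missing only a handful of extreme points.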

Here is the network I built:

Each input sample consists of 6 values representing displacements and distances.

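# Assumption: tf.keras is used; with standalone Keras, import from `keras` instead.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, LSTM, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint

num_features = 6  # each input consists of 6 values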

MODEL = Sequential()
MODEL.add(Bidirectional(LSTM(2**6, return_sequences=True, dropout=0.1),
                        input_shape=(None, num_features), merge_mode='concat'))
MODEL.add(Bidirectional(LSTM(2**5, return_sequences=True, dropout=0.1),
                        input_shape=(None, num_features), merge_mode='concat'))
MODEL.add(Bidirectional(LSTM(2**4, return_sequences=False, dropout=0.1),
                        input_shape=(None, num_features), merge_mode='concat'))

MODEL.add(Dense(2**5, activation='swish'))
MODEL.add(Dropout(0.2))
MODEL.add(Dense(2**4, activation='swish'))
MODEL.add(Dropout(0.2))
MODEL.add(Dense(2**3, activation='swish'))
MODEL.add(Dropout(0.1))
MODEL.add(Dense(2**2, activation='swish'))
MODEL.add(Dropout(0.1))
MODEL.add(Dense(1, activation='swish'))

MODEL.compile(loss='mean_squared_error', optimizer='adam', metrics=['MAE', 'accuracy'])

CALLBACKS = [
    EarlyStopping(monitor='val_loss', patience=20, min_delta=1e-4, verbose=1),
    ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=4, min_lr=1e-9, verbose=1),
    ModelCheckpoint(filepath='stackLSTM_best_model.h5', monitor='val_loss', save_best_only=True, verbose=1)
]

HISTORY = MODEL.fit(x=TRAIN_SET['feature'],
                    y=TRAIN_SET['label'],
                    epochs=1000,
                    callbacks=CALLBACKS,
                    batch_size=2**5,
                    shuffle=True,
                    validation_data=(VAL_SET['feature'], VAL_SET['label']),
                    verbose=2)
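As a side check on the output layer above: swish, defined as x * sigmoid(x), is bounded below by roughly -0.278 and unbounded above, so the activation alone does not impose a hard limit at 0.16 or 1.85. The sketch below verifies this numerically in plain Python, assuming the standard swish definition with beta = 1; it only rules out a hard limit coming from the activation and does not explain the compression itself.

```python
import math

def swish(x):
    """swish(x) = x * sigmoid(x), written as x / (1 + exp(-x))."""
    return x / (1.0 + math.exp(-x))

# Scan a grid of pre-activation values in [-10, 10] to find the
# smallest output swish can produce.
grid = [i / 1000.0 for i in range(-10000, 10001)]
lowest = min(swish(x) for x in grid)

print(round(lowest, 3))   # lower bound of swish on the grid
print(swish(3.0) > 2.0)   # outputs above the label range are reachable
```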

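Has anyone here had a similar problem? What could be the cause? I tried scaling the labels to 0-1 instead of 0-2 and tried other activation functions (sigmoid, linear), but nothing helped. Is there perhaps a specific loss function or optimizer that should be used in this case?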
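For reference, the label scaling mentioned above can be sketched as a pair of inverse functions. The helper names `scale_labels` and `unscale_predictions` are hypothetical, and the sample labels are illustrative; the only value taken from the question is the 0 to 2 label range.

```python
LABEL_MIN, LABEL_MAX = 0.0, 2.0  # label range stated in the question

def scale_labels(labels, lo=LABEL_MIN, hi=LABEL_MAX):
    """Map labels from [lo, hi] to [0, 1] before training."""
    return [(y - lo) / (hi - lo) for y in labels]

def unscale_predictions(preds, lo=LABEL_MIN, hi=LABEL_MAX):
    """Map model outputs from [0, 1] back to the original [lo, hi] range."""
    return [p * (hi - lo) + lo for p in preds]

labels = [0.0, 0.5, 2.0]  # illustrative values, not the asker's data
scaled = scale_labels(labels)
restored = unscale_predictions(scaled)

print(scaled)
print(restored)
```

When labels are scaled this way, predictions should be mapped back with the inverse function before comparing their minimum and maximum with the 0.16 and 1.85 limits, so that both experiments are judged on the same scale.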

Thanks for your help.

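Update: I removed the second-to-last Dense layer and changed the activation function to ELU (with no activation on the output layer). I now get the extreme values, but the overall predictions are, unexpectedly, worse than before: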

MODEL = Sequential()
MODEL.add(Bidirectional(LSTM(2**6, return_sequences=True, dropout=0.1),
                        input_shape=(None, num_features), merge_mode='concat'))
MODEL.add(Bidirectional(LSTM(2**5, return_sequences=True, dropout=0.1),
                        input_shape=(None, num_features), merge_mode='concat'))
MODEL.add(Bidirectional(LSTM(2**4, return_sequences=False, dropout=0.1),
                        input_shape=(None, num_features), merge_mode='concat'))
MODEL.add(Dense(2**5, activation='elu'))
MODEL.add(Dropout(0.2))
MODEL.add(Dense(2**4, activation='elu'))
MODEL.add(Dropout(0.2))
MODEL.add(Dense(2**3, activation='elu'))
MODEL.add(Dropout(0.1))
# MODEL.add(Dense(2**2, activation='elu'))
# MODEL.add(Dropout(0.1))
MODEL.add(Dense(1, activation=None))

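Could you add the code excerpts for the model definition, compilation, and fitting? Also, please share an excerpt of the dataset inputs and outputs if possible. If you generated the dataset yourself, please provide the corresponding code. Thanks.

Hi @Chicodelarosa, I updated my post with the code and the latest results.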