Python LSTM multivariate time series forecasting: rescaling and inverse_transform do not return predicted and actual values in the original scale

Tags: python, tensorflow, keras, time-series, lstm

I am currently building an LSTM multivariate time series model that uses the 22 features from the previous timestep (t-1) as input to predict a single output at the current time (t). I have been following a tutorial, and everything seems to work, but I have a question about the final inverse_transform() step.
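For reference, the series_to_supervised(scaled, 1, 1) call further down is the common shift-and-concat framing helper; a sketch along those lines is shown here (it may differ from my exact version in minor details):

# Sketch of a shift-and-concat supervised framing helper (illustrative;
# my actual helper may differ slightly in the details).
import pandas as pd

def series_to_supervised(data, n_in=1, n_out=1, dropnan=True):
    df = pd.DataFrame(data)
    n_vars = df.shape[1]
    cols, names = [], []
    # lagged inputs: t-n_in, ..., t-1
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
        names += ['var%d(t-%d)' % (j + 1, i) for j in range(n_vars)]
    # outputs: t, ..., t+n_out-1
    for i in range(0, n_out):
        cols.append(df.shift(-i))
        names += ['var%d(t)' % (j + 1) if i == 0 else 'var%d(t+%d)' % (j + 1, i)
                  for j in range(n_vars)]
    agg = pd.concat(cols, axis=1)
    agg.columns = names
    if dropnan:
        agg.dropna(inplace=True)
    return agg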

Specifically, I expected that once I inverted the scaling on the predicted and actual values, the output would be in the same units as the original data, i.e., the units it had before I normalized and transformed the dataset. The variable I am trying to predict has a mean of roughly 29.186, but when I run inverse_transform() and plot the predictions against the actuals, the resulting values fall in the range 3.30-3.55.
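For what it's worth, my understanding of the mechanics (toy example below, made-up numbers) is that MinMaxScaler keeps a separate min/max per column, and inverse_transform maps each column back using that column's own range:

# Toy illustration (made-up data): MinMaxScaler stores per-column min/max,
# so inverse_transform rescales each column independently with its own range.
import numpy as np
from sklearn.preprocessing import MinMaxScaler

demo = np.array([[0.0, 100.0],
                 [5.0, 200.0],
                 [10.0, 300.0]], dtype='float32')
sc = MinMaxScaler(feature_range=(0, 1))
scaled_demo = sc.fit_transform(demo)           # both columns mapped into [0, 1]
restored = sc.inverse_transform(scaled_demo)   # column 0 back to 0..10, column 1 back to 100..300
print(restored)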

It may be a simple mistake in how I slice the arrays, or it may be something else entirely, but I cannot pin down why my predicted and actual values end up in different units than the data had before everything was transformed.
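To make the slicing concrete, here is a self-contained toy version (dummy arrays, but with the same widths as my data) of the concatenation I do before calling inverse_transform further down:

# Toy version of the pre-inverse_transform concatenation (dummy data, real widths):
# the prediction is placed in column 0 and the remaining 21 scaled features follow,
# so the combined array has the same 22-column width the scaler was fitted on.
import numpy as np

yhat_demo = np.random.rand(5, 1).astype('float32')     # stand-in for model.predict(test_X)
test_X_demo = np.random.rand(5, 22).astype('float32')  # stand-in for the reshaped test_X
inv_input = np.concatenate((yhat_demo, test_X_demo[:, 1:]), axis=1)
print(inv_input.shape)  # (5, 22)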

For additional context, here is the printed output of train_X.shape, train_y.shape, test_X.shape, test_y.shape:

(1804950, 1, 22) (1804950,) (849389, 1, 22) (849389,)
Here are the relevant parts of my code:

# imports used below (series_to_supervised is the framing helper sketched above;
# ips_data is my dataframe, loaded earlier in the script)
from math import sqrt
from numpy import concatenate
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, Activation
import matplotlib.pyplot as plt

# ensure all data is float
values = ips_data.values.astype('float32')

# normalize features
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(values)

# frame as supervised learning
ips_reframed = series_to_supervised(scaled, 1, 1)

# drop the time-t columns I don't want to predict: keep the 22 features at t-1
# (columns 0-21) and the single target at time t (column 41)
ips_reframed.drop(ips_reframed.columns[[22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,42,43]], axis=1, inplace=True)
# print(ips_reframed.head())

# split into train and test sets
values = ips_reframed.values
# n_train_hours = (365 * 24 * 60) * 3
n_train_hours = int(len(values) * 0.68)
train = values[:n_train_hours, :]
test = values[n_train_hours:, :]

# split into input and outputs
train_X, train_y = train[:, :-1], train[:, -1]
test_X, test_y = test[:, :-1], test[:, -1]

# reshape input to be 3D [samples, timesteps, features]
train_X = train_X.reshape((train_X.shape[0], 1, train_X.shape[1]))
test_X = test_X.reshape((test_X.shape[0], 1, test_X.shape[1]))
# print(train_X.shape, train_y.shape, test_X.shape, test_y.shape)

# design network
model = Sequential()
model.add(LSTM(50, input_shape=(train_X.shape[1], train_X.shape[2]), return_sequences=True))
model.add(Dropout(0.5))
model.add(LSTM(50))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation("linear"))
model.compile(loss='mae', optimizer='adam')

# fit network
history = model.fit(train_X, train_y, epochs=100, batch_size=1000,
                    validation_data=(test_X, test_y), verbose=2, shuffle=False)

# plot history
plt.plot(history.history['loss'], label='train')
plt.plot(history.history['val_loss'], label='test')
plt.legend()
plt.show()

# make a prediction
yhat = model.predict(test_X)

# invert scaling for forecast
test_X = test_X.reshape((test_X.shape[0], test_X.shape[2]))
inv_yhat = concatenate((yhat, test_X[:, 1:]), axis=1)
inv_yhat = scaler.inverse_transform(inv_yhat)
inv_yhat = inv_yhat[:,0]

# invert scaling for actual
test_y = test_y.reshape((len(test_y), 1))
inv_y = concatenate((test_y, test_X[:, 1:]), axis=1)
inv_y = scaler.inverse_transform(inv_y)
inv_y = inv_y[:,0]

# calculate RMSE
rmse = sqrt(mean_squared_error(inv_y, inv_yhat))
# print('Test RMSE: %.3f' % rmse)

# plotting the predictions
plt.plot(inv_yhat[-100:], label='predictions')
plt.plot(inv_y[-100:], label='actual')
plt.title("Prediction vs. Actual")
plt.legend()
plt.show()
Please let me know if you need any other information or context, and thanks for your help.