Python: Input of layer sequential is incompatible with the layer: shape error in LSTM

Tags: python, keras, neural-network, lstm, recurrent-neural-network

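I am new to neural networks and want to compare them with other machine learning methods. I have a multivariate time series covering roughly two years. I want to use an LSTM to predict "y" for the next few days based on the other variables. The last day in my data is 2020-07-31.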

df.tail()

                y  holidays  day_of_month  day_of_week  month  quarter
Date
2020-07-27  32500         0            27            0      7        3
2020-07-28  33280         0            28            1      7        3
2020-07-29  31110         0            29            2      7        3
2020-07-30  37720         0            30            3      7        3
2020-07-31  32240         0            31            4      7        3

To train the LSTM model, I also split the data into training and test sets:

from sklearn.model_selection import train_test_split
split_date = '2020-07-27' #to predict the next 4 days
df_train = df.loc[df.index <= split_date].copy()
df_test = df.loc[df.index > split_date].copy()
X1=df_train[['day_of_month','day_of_week','month','quarter','holidays']]
y1=df_train['y']
X2=df_test[['day_of_month','day_of_week','month','quarter','holidays']]
y2=df_test['y']

X_train, y_train =X1, y1
X_test, y_test = X2,y2

Now for the hardest part, the model:

num_units=50
activation_function = 'sigmoid'
optimizer = 'adam'
loss_function = 'mean_squared_error'
batch_size = 10
num_epochs = 100

 # Initialize the RNN
regressor = Sequential()

 # Adding the input layer and the LSTM layer
regressor.add(LSTM(units = num_units, return_sequences=True ,activation = activation_function, 
input_shape=(X_train.shape[1], 1)))

 # Adding the output layer
regressor.add(Dense(units = 1))

 # Compiling the RNN
regressor.compile(optimizer = optimizer, loss = loss_function)

# Using the training set to train the model
regressor.fit(X_train_scaled, y_train, batch_size = batch_size, epochs = num_epochs)

However, I get the following error:

ValueError: Input 0 of layer sequential_11 is incompatible with the layer: expected ndim=3, found 
ndim=2. Full shape received: [None, 5]

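I don't understand how to choose the parameters or the shape of the input. I have watched some videos and read some GitHub pages, and everyone seems to set up an LSTM differently, which makes it harder to implement. The error above probably comes from the shape, but apart from that, is everything else correct? How can I fix this? Thanks.

Edit: the similar question does not solve my problem. I have already tried the solution from there: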

x_train = X_train_scaled.reshape(-1, 1, 5)
x_test  = X_test_scaled.reshape(-1, 1, 5)

(My X_test and y_test have only one column.) This solution does not seem to work either. I now get this error:

ValueError: Input 0 is incompatible with layer sequential_22: expected shape= 
(None, None, 1), found shape=[None, 1, 5]

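INPUT:

The problem is that your model expects a 3D input of shape (batch, sequence, features), but your X_train is actually a slice of a DataFrame, so it is a 2D array:

X1=df_train[['day_of_month','day_of_week','month','quarter','holidays']]
X_train, y_train =X1, y1

I assume your columns are meant to be your features, so what you would normally do is "stack slices" of your df so that X_train becomes 3D. Take a dummy 2D dataset of shape (15, 5). You can reshape it to add an extra dimension, for example (15, 1, 5). The data is the same, it is just presented differently. In this example batch = 15 and sequence = 1. I don't know what the sequence length is in your case, but it can be anything.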
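The reshape described above can be sketched with NumPy. This is only an illustration: `data_2d` is a made-up stand-in for X_train, not the asker's data, and if X_train is a DataFrame it would first need converting (for example with `.to_numpy()`).

```python
import numpy as np

# Hypothetical stand-in for X_train: 15 rows (days) x 5 feature columns
data_2d = np.arange(15 * 5, dtype=float).reshape(15, 5)

# Two equivalent ways to insert a sequence axis of length 1
data_3d = data_2d[:, np.newaxis, :]
data_3d_alt = data_2d.reshape(-1, 1, 5)

print(data_2d.shape)  # (15, 5)
print(data_3d.shape)  # (15, 1, 5)

# Same values, different layout: sample 0, time step 0 is the original row 0
print(np.array_equal(data_3d, data_3d_alt))       # True
print(np.array_equal(data_3d[0, 0], data_2d[0]))  # True
```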
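MODEL:

Now, in your model, the Keras input_shape describes (sequence, features), and Keras adds the batch dimension in front. When you pass this:

input_shape=(X_train.shape[1], 1)

this is what your model sees: (None, sequence = X_train.shape[1], num_features = 1), where None stands for the batch dimension. I don't think that is what you are trying to do, so once you have reshaped the data you should also correct input_shape to match the new array.
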
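As a rough sketch of what a matching pair of data shape and input shape could look like, the snippet below builds a small model whose input is (1, 5), that is, one time step with five features. The random array is a placeholder for the scaled features, and dropping return_sequences=True (so that the model emits one value per sample) is an assumption about what the asker wants, not something stated in the question.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_features = 5  # day_of_month, day_of_week, month, quarter, holidays
timesteps = 1   # one day per sample, as in reshape(-1, 1, 5)

# Random placeholder standing in for the scaled training features
x = np.random.rand(15, n_features).astype("float32")
x = x.reshape(-1, timesteps, n_features)

model = keras.Sequential([
    keras.Input(shape=(timesteps, n_features)),  # batch axis is implicit
    layers.LSTM(50),   # returns only the last hidden state
    layers.Dense(1),   # one predicted value per sample
])
model.compile(optimizer="adam", loss="mean_squared_error")

preds = model.predict(x, verbose=0)
print(x.shape, preds.shape)  # (15, 1, 5) (15, 1)
```

With a single time step the LSTM has no history to work with, so this only shows that the shapes line up, not that it is a good forecasting setup.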
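This is a multivariate regression problem that you are solving with an LSTM. Before jumping into the code, let's look at what that means.

Problem statement:

You have 5 features for each day, for k days.

For any day n, given the features of the last m days, you want to predict y for day n.

Creating the windowed dataset:

We first need to decide how many days we want to feed into the model. This is called the sequence length (for this example, let's fix it at 3).
We have to split the days into chunks of sequence length to create the train and test datasets. This is done with a sliding window whose size is the sequence length.
Note that no prediction is available for the last p records, where p is the sequence length.
We will use the timeseries_dataset_from_array method to create the windowed dataset.
For more advanced material, see the official TensorFlow documentation.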
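To make the sliding window concrete, here is a minimal NumPy-only sketch. The helper `make_windows` and the toy arrays are hypothetical and are not the Keras utility itself. Each window holds sequence_length consecutive days and is paired with the y of its last day.

```python
import numpy as np

def make_windows(features, targets, sequence_length):
    """Stack consecutive rows into windows; pair each with the last day's target."""
    n_windows = len(features) - sequence_length + 1
    X = np.stack([features[i:i + sequence_length] for i in range(n_windows)])
    y = targets[sequence_length - 1:]
    return X, y

# Toy data: 10 days, 5 features per day
features = np.arange(10 * 5, dtype=float).reshape(10, 5)
targets = np.arange(10, dtype=float) * 100

X, y = make_windows(features, targets, sequence_length=3)
print(X.shape, y.shape)  # (8, 3, 5) (8,)
```

Ten days with a window of three give eight windows, so the resulting X already has the (batch, sequence, features) layout an LSTM layer expects.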

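LSTM model

What we want to achieve is the following: at each unrolled LSTM step we pass in that day's 5 features, and we unroll over m time steps, where m is the sequence length. We predict y for the last day.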
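The unrolling can be illustrated with a plain NumPy forward pass. This is a simplified sketch of a standard LSTM cell with random, untrained weights. All names (`lstm_last_hidden`, `W`, `U`, `w_out`) are made up for the illustration, and it is not how Keras implements the layer internally.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_last_hidden(x_seq, W, U, b):
    """Run one LSTM layer over x_seq of shape (m, n_features); return the final hidden state."""
    units = U.shape[0]
    h = np.zeros(units)  # hidden state
    c = np.zeros(units)  # cell state
    for x_t in x_seq:    # one iteration per day in the window
        z = x_t @ W + h @ U + b
        i, f, g, o = np.split(z, 4)  # input, forget, candidate, output
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)
        h = o * np.tanh(c)
    return h

rng = np.random.default_rng(0)
m, n_features, units = 3, 5, 4
W = rng.normal(size=(n_features, 4 * units))  # input weights
U = rng.normal(size=(units, 4 * units))       # recurrent weights
b = np.zeros(4 * units)
w_out = rng.normal(size=units)                # stands in for Dense(1)
b_out = 0.0

window = rng.normal(size=(m, n_features))     # m days x 5 features
h_last = lstm_last_hidden(window, W, U, b)
y_hat = h_last @ w_out + b_out                # prediction for the last day
print(h_last.shape, float(y_hat))
```

Only the hidden state after the final step feeds the output layer, which corresponds to using the LSTM without return_sequences=True.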

Code (the full listing appears further down this page):
Output:
(7500, 6) (2500, 6)
Epoch 1/3
1874/1874 [==============================] - 8s 3ms/step - loss: 9974697.3664 - val_loss: 8242597.5000
Epoch 2/3
1874/1874 [==============================] - 6s 3ms/step - loss: 8367530.7117 - val_loss: 8256667.0000
Epoch 3/3
1874/1874 [==============================] - 6s 3ms/step - loss: 8379048.3237 - val_loss: 8233981.5000
<tensorflow.python.keras.callbacks.History at 0x7f3e94bdd198>

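This error occurs because you defined a model architecture, but your input does not fit that architecture. As a rule of thumb for a programmer, you really need to pay attention to the error message. It says Full shape received: [None, 5]. That happens because your input is [None, 5], since X1=df_train[['day_of_month','day_of_week','month','quarter','holidays']]. Try defining an input layer before the sequential layers, e.g. layers.Input(shape=(len(X1.columns))).

Does this answer your question?

@Nicolas Gervais: Thanks, I have edited my question.

I don't know how much sense this makes, but instead of x_train = X_train_scaled.reshape(-1, 1, 5) and x_test = X_test_scaled.reshape(-1, 1, 5), you can use x_train = X_train_scaled.reshape(-1, 5, 1) and x_test = X_test_scaled.reshape(-1, 5, 1) to make it work.

Thanks for your answer, but it is a bit confusing. The point of including a script in the question is to make the answer simpler and tailored to that script. You just generated an array of zeros. Not very intuitive, tbh.

@Amateur hematologist: I don't have access to your data for a demonstration, so the best I can do is offer some suggestions to correct a few misconceptions in the code. Let me know which part needs more clarification.

Thanks, but I don't understand why I can't use regressor.predict(test_y); in your example it only works with test_dataset. I only want to predict y (since I will know the other variables for the future in advance). Also, how can I plot the results (predicted and real) following your code?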
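The difference between the two reshapes mentioned in the comments can be checked with NumPy alone. The array `X` below is a made-up placeholder for X_train_scaled, and `input_shape` mirrors the value passed to the LSTM layer in the question.

```python
import numpy as np

# Placeholder for X_train_scaled: 4 days x 5 features
X = np.arange(20, dtype=float).reshape(4, 5)

as_steps = X.reshape(-1, 5, 1)     # 5 time steps, 1 feature each
as_features = X.reshape(-1, 1, 5)  # 1 time step, 5 features

input_shape = (X.shape[1], 1)      # (5, 1), as in the question's LSTM layer

print(as_steps.shape[1:] == input_shape)     # True
print(as_features.shape[1:] == input_shape)  # False
```

reshape(-1, 5, 1) matches the declared input shape, but it makes the LSTM step through the five columns of a single day as if they were consecutive time steps. Whether that is meaningful depends on the problem.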
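# Dummy data from the first answer: a (15, 5) array of zeros,
# then a sequence axis of length 1 is inserted, giving (15, 1, 5)
import numpy as np
data = np.zeros((15, 5))
data = data[:, np.newaxis, :]
data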

array([[[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.]]])

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Model: the LSTM returns only its last hidden state, so the Dense layer
# emits one prediction per window (each window has exactly one target)
regressor = models.Sequential()
regressor.add(layers.LSTM(5, return_sequences=False))
regressor.add(layers.Dense(1))
regressor.compile(optimizer='sgd', loss='mse')

# Dummy data
n = 10000
df = pd.DataFrame(
    {
      'y': np.arange(n),
      'holidays': np.random.randn(n),
      'day_of_month': np.random.randn(n),
      'day_of_week': np.random.randn(n),
      'month': np.random.randn(n),
      'quarter': np.random.randn(n),
    }
)

# Train test split (shuffle=False keeps the rows in time order)
train_df, test_df = train_test_split(df, shuffle=False)
print(train_df.shape, test_df.shape)

# Create y to be predicted:
# given the last sequence_length days, predict today's y

# Train data: shift y so that the first row of each window carries
# the y of the window's last day, then drop the rows left without a target
sequence_length = 3
y_pred = train_df['y'][sequence_length-1:].values
train_df = train_df[:-(sequence_length-1)].copy()
train_df['y_pred'] = y_pred

# Validation data
y_pred = test_df['y'][sequence_length-1:].values
test_df = test_df[:-(sequence_length-1)].copy()
test_df['y_pred'] = y_pred

# Create window data generators
feature_cols = ['holidays', 'day_of_month', 'day_of_week', 'month', 'quarter']

# Train data generator
train_X = train_df[feature_cols].to_numpy()
train_y = train_df['y_pred'].to_numpy()
train_dataset = tf.keras.preprocessing.timeseries_dataset_from_array(
    train_X, train_y, sequence_length=sequence_length, shuffle=True, batch_size=4)

# Validation data generator
test_X = test_df[feature_cols].to_numpy()
test_y = test_df['y_pred'].to_numpy()
test_dataset = tf.keras.preprocessing.timeseries_dataset_from_array(
    test_X, test_y, sequence_length=sequence_length, shuffle=True, batch_size=4)

# Finally fit the model
regressor.fit(train_dataset, validation_data=test_dataset, epochs=3)
