Keras: for multiclass classification on non-text sequential data, what shape should y_train have for an LSTM?

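I have a dataset (features = 175, n_time_steps = 954, number of sequences = 737). Columns 1-174 are features, and the last column is the target, which contains 3 different classes. I want to use an LSTM for multiclass classification that predicts only the last time step, i.e. it uses 953 steps and the features to predict the class at step 954. I am struggling with how to structure the y_train input, and I would appreciate any help on how to reshape y_train correctly for this problem.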

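Data: I have 737 products, each with 954 days of sales. The target classes are 0 (the product does not exist), 1 (class A product), and 2 (class B product). I need to use 953 days and 174 features to predict the class of each product on the last day of the sequence (day 954). The test set has 100 products and the training set has 637 products.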

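The reshaped X_train has shape (637, 953, 175). y_train has shape (637, 1). When I run to_categorical, the shape is (637, 2). Both y_train shapes produce errors when fitting the LSTM model.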
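
For reference, the shapes described above can be reproduced with a small NumPy sketch. The array `data` and its random contents are placeholders rather than the real sales data, and the sequence length is reduced from 954 to 12 only to keep the example light. It assumes, as the reported X_train shape suggests, that all 175 columns of the earlier days are used as input.

```python
import numpy as np

# Scaled-down stand-in for the real data: the post has 737 sequences of
# 954 steps; a shorter sequence length keeps this sketch light.
n_sequences, n_steps, n_columns = 737, 12, 175
rng = np.random.default_rng(0)
data = rng.random((n_sequences, n_steps, n_columns))
# The last column holds the class (0, 1 or 2) at every time step.
data[:, :, -1] = rng.integers(0, 3, size=(n_sequences, n_steps))

# Inputs: every step except the last one, all 175 columns.
X = data[:, :-1, :]
# Targets: the class column at the final step, one integer per sequence.
y = data[:, -1, -1].astype(int).reshape(-1, 1)

# 637 sequences for training and 100 for testing, as in the post.
X_train, X_test = X[:637], X[637:]
y_train, y_test = y[:637], y[637:]

print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)
```

With the real 954-step data, the same slicing would give X_train the shape (637, 953, 175) and y_train the shape (637, 1).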

When I fit y_train of shape (637, 1), the error is

ValueError: You are passing a target array of shape (637, 1) while using as loss `categorical_crossentropy`. `categorical_crossentropy` expects targets to be binary matrices (1s and 0s) of shape (samples, classes). If your targets are integer classes, you can convert them to the expected format via:

from keras.utils import to_categorical
y_binary = to_categorical(y_int)


Alternatively, you can use the loss function `sparse_categorical_crossentropy` instead, which does expect integer targets.

When I fit to_categorical(y_train) of shape (637, 2), the error is

ValueError: Error when checking target: expected dense_45 to have shape (1,) but got array with shape (2,)

When I switch to `sparse_categorical_crossentropy` and fit y_train of shape (637, 1), the error is

InvalidArgumentError: Received a label value of 1 which is outside the valid range of [0, 1).  Label values: 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 1 1 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 1 0 1 1 0 1 0 1 0 0 1 0 0 1 0 0 0 0 1 1 1 1 0 0 0 1 0 0 1 0 0 0 1 1 0 0 1 1 0 0 1 1 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0
     [[{{node loss_13/dense_48_loss/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]]

Here is my model

model = Sequential([
    LSTM(units=1024,
         input_shape=(periods_to_train, features), kernel_initializer='he_uniform',
         activation='linear', kernel_constraint=maxnorm(3), return_sequences=False),
    Dropout(rate=0.5),
    Dense(units=1024, kernel_initializer='he_uniform',
          activation='linear', kernel_constraint=maxnorm(3)),
    Dropout(rate=0.5),
    Dense(units=1024, kernel_initializer='he_uniform',
          activation='linear', kernel_constraint=maxnorm(3)),
    Dropout(rate=0.5),
    Dense(units=periods_to_predict, kernel_initializer='he_uniform', activation='softmax')])

# Compile model
optimizer = Adamax(lr=0.001, decay=0.1)

model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])

configure(gpu_ind=True)
model.fit(X_train, y_train, validation_split=0.1, batch_size=100, epochs=8, shuffle=True)

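It seems your understanding of the network is correct. I therefore recreated a minimal working example that generates data the same way you do and trains on it. When I set the number of time steps (periods_to_train) to 953, I also got some strange errors. However, several studies suggest that LSTMs do not capture time dependencies beyond roughly 200 to 500 steps, because the model output starts to "forget" earlier information.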
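
One way to act on this is to keep only the most recent steps of each sequence before training. The sketch below is a minimal illustration, not part of the original answer: `keep_last_steps` is a hypothetical helper, and the choice of 100 steps is an assumption that the latest days carry most of the signal.

```python
import numpy as np

def keep_last_steps(X, n_steps):
    """Return the final `n_steps` time steps of a (samples, steps, features) array."""
    if n_steps <= 0 or n_steps > X.shape[1]:
        raise ValueError("n_steps must be between 1 and the sequence length")
    return X[:, -n_steps:, :]

# Placeholder with the step count from the question but few samples and features.
X_full = np.zeros((4, 953, 5), dtype=np.float32)
X_short = keep_last_steps(X_full, 100)
print(X_short.shape)  # (4, 100, 5)
```

The result would then be passed to the model with `input_shape=(100, features)` in place of the full 953 steps.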

Below is a simple working example that uses only 100 time steps. It runs without errors in my case (tensorflow version 1.14.0):

import tensorflow as tf
import tensorflow.keras.backend as K
import numpy as np

data_size = 637
periods_to_train = 100
features = 175
periods_to_predict = 3  # number of output units, one per class

# Random stand-in data with the same layout as in the question
X_train = np.random.rand(data_size, periods_to_train, features)
y_train = np.random.randint(0, 3, data_size).reshape(-1, 1)

K.clear_session()
model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(
        units=1024, input_shape=(periods_to_train, features), kernel_initializer='he_uniform',
        activation='linear', kernel_constraint=tf.keras.constraints.max_norm(3.), return_sequences=False),
    tf.keras.layers.Dropout(rate=0.5),
    tf.keras.layers.Dense(
        units=1024, kernel_initializer='he_uniform',
        activation='linear', kernel_constraint=tf.keras.constraints.max_norm(3)),
    tf.keras.layers.Dropout(rate=0.5),
    tf.keras.layers.Dense(
        units=1024, kernel_initializer='he_uniform',
        activation='linear', kernel_constraint=tf.keras.constraints.max_norm(3)),
    tf.keras.layers.Dropout(rate=0.5),
    tf.keras.layers.Dense(
        units=periods_to_predict, kernel_initializer='he_uniform',
        activation='softmax')])

optimizer = tf.keras.optimizers.Adamax(lr=0.001, decay=0.1)
model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
model.fit(X_train, y_train, validation_split=0.1, batch_size=64, epochs=1, shuffle=True)
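
A note on why this version trains: the `InvalidArgumentError` in the question reports a valid label range of [0, 1), which is what a softmax output with a single unit allows, whereas the example above uses 3 output units for the 3 classes. The NumPy sketch below only mirrors that range check and the one-hot layout; `labels_fit_output` and `one_hot` are hypothetical helpers written for illustration, not Keras functions.

```python
import numpy as np

def labels_fit_output(y, n_output_units):
    """True if every integer label lies in [0, n_output_units)."""
    y = np.asarray(y)
    return bool(np.all((y >= 0) & (y < n_output_units)))

def one_hot(y, n_classes):
    """One-hot encode integer labels of shape (samples, 1) or (samples,)."""
    return np.eye(n_classes, dtype=np.float32)[np.asarray(y).reshape(-1)]

# Deterministic stand-in labels covering classes 0, 1 and 2.
y_train = (np.arange(637) % 3).reshape(-1, 1)

print(labels_fit_output(y_train, 1))  # False: a 1-unit output only admits label 0
print(labels_fit_output(y_train, 3))  # True: 3 units admit labels 0, 1 and 2
print(one_hot(y_train, 3).shape)      # (637, 3)
```

In short, integer labels of shape (637, 1) pair with `sparse_categorical_crossentropy`, one-hot labels of shape (637, 3) pair with `categorical_crossentropy`, and in both cases the last Dense layer needs one unit per class.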

Thank you, KrisR89. After reading your post I found this overview. For future reference: [link]