Keras: for multiclass classification on non-text sequential data, what shape should y_train have for an LSTM?

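I have a dataset (features=175, n_time_steps=954, number of sequences=737). Columns 1-174 are features, and the last column, the target, contains 3 different classes. I want to use an LSTM for multiclass classification that predicts only the last time step, i.e. uses 953 steps and the features to predict the class at step 954. I am struggling with how to structure the y_train input. I would appreciate any help on how to reshape y_train correctly for this problem.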

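Data: I have 737 products, each with 954 days of sales. The target classes are 0 (the product does not exist), 1 (class A product) and 2 (class B product). I need to use 953 days and 174 features to predict each product's class on the last day (954) of the sequence. The test set has 100 products and the training set has 637 products.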

The reshaped X_train has shape (637, 953, 175). y_train has shape (637, 1). When I run to_categorical, the shape is (637, 2). Both y_train shapes produce errors when I fit the LSTM model.
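To make the shapes concrete, here is a small NumPy sketch of the split described above. It runs on scaled-down random stand-in data, and the array name `data`, the helper `split_last_step`, the (products, days, columns) layout and the choice of the last column as the target are assumptions made for illustration, not the actual preprocessing.

```python
import numpy as np


def split_last_step(data, n_train):
    """Use every step except the last as input and the last step's class as target."""
    X = data[:, :-1, :]              # all days but the last, all columns
    y = data[:, -1, -1].astype(int)  # class on the last day, one integer per product
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])


rng = np.random.default_rng(0)

# Scaled-down stand-in; the real sizes would be 737 products, 954 days, 175 columns.
n_products, n_days, n_cols = 20, 30, 5
data = rng.random((n_products, n_days, n_cols))
data[:, :, -1] = rng.integers(0, 3, size=(n_products, n_days))  # classes 0, 1, 2

(X_train, y_train), (X_test, y_test) = split_last_step(data, n_train=14)

print(X_train.shape, y_train.shape)  # (14, 29, 5) (14,)
print(X_test.shape, y_test.shape)    # (6, 29, 5) (6,)
```

With the real sizes and n_train=637, the same slicing would give X_train of shape (637, 953, 175) and one integer label per product.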

When I fit y_train with shape (637, 1), the error is

ValueError: You are passing a target array of shape (637, 1) while using as loss `categorical_crossentropy`. `categorical_crossentropy` expects targets to be binary matrices (1s and 0s) of shape (samples, classes). If your targets are integer classes, you can convert them to the expected format via:

from keras.utils import to_categorical
y_binary = to_categorical(y_int)

Alternatively, you can use the loss function `sparse_categorical_crossentropy` instead, which does expect integer targets.

When I fit to_categorical(y_train) with shape (637, 2), the error is

ValueError: Error when checking target: expected dense_45 to have shape (1,) but got array with shape (2,)

When I switch to 'sparse_categorical_crossentropy' and fit y_train with shape (637, 1), the error is

InvalidArgumentError: Received a label value of 1 which is outside the valid range of [0, 1).  Label values: 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 1 1 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 1 0 1 1 0 1 0 1 0 0 1 0 0 1 0 0 0 0 1 1 1 1 0 0 0 1 0 0 1 0 0 0 1 1 0 0 1 1 0 0 1 1 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0
     [[{{node loss_13/dense_48_loss/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]]

Here is my model

# Imports assumed (standalone Keras 2.x); periods_to_train, features and
# periods_to_predict are defined elsewhere in the original script.
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout
from keras.constraints import maxnorm
from keras.optimizers import Adamax

model = Sequential([
    LSTM(units=1024, input_shape=(periods_to_train, features),
         kernel_initializer='he_uniform', activation='linear',
         kernel_constraint=maxnorm(3), return_sequences=False),
    Dropout(rate=0.5),
    Dense(units=1024, kernel_initializer='he_uniform',
          activation='linear', kernel_constraint=maxnorm(3)),
    Dropout(rate=0.5),
    Dense(units=1024, kernel_initializer='he_uniform',
          activation='linear', kernel_constraint=maxnorm(3)),
    Dropout(rate=0.5),
    Dense(units=periods_to_predict, kernel_initializer='he_uniform',
          activation='softmax')])

# Compile model
optimizer = Adamax(lr=0.001, decay=0.1)

model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])

configure(gpu_ind=True)  # the asker's own helper, not shown
model.fit(X_train, y_train, validation_split=0.1, batch_size=100, epochs=8, shuffle=True)

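It looks like your understanding of the network is correct. So I recreated a minimal working example that generates data the way you describe and trains on it. When I set the number of time steps (periods_to_train) to 953, I also got some strange errors. However, several studies suggest that LSTMs do not capture dependencies beyond roughly 200 to 500 time steps, because the model output starts to "forget" earlier information.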
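One possible way to act on this with real data is to keep only the most recent steps of each sequence before training. The sketch below is an illustration under that assumption; the helper name `keep_last_steps` and the demo array are hypothetical, and whether the older steps can safely be dropped depends on the data.

```python
import numpy as np


def keep_last_steps(X, n_steps):
    """Keep only the most recent n_steps time steps of each sequence.

    X is expected to have shape (samples, time_steps, features).
    """
    if not 0 < n_steps <= X.shape[1]:
        raise ValueError("n_steps must be between 1 and the sequence length")
    return X[:, -n_steps:, :]


# Small stand-in array: 8 sequences, 953 steps, 4 features.
X_demo = np.zeros((8, 953, 4), dtype=np.float32)
X_short = keep_last_steps(X_demo, 100)

print(X_short.shape)  # (8, 100, 4)
```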

Below is a simple working example that uses only 100 time steps. It runs without errors for me (tensorflow version 1.14.0):

import tensorflow as tf
import tensorflow.keras.backend as K
import numpy as np

data_size = 637
periods_to_train = 100
features = 175
periods_to_predict = 3  # number of units in the output layer, one per class

# Random stand-in data with the same layout as in the question
X_train = np.random.rand(data_size, periods_to_train, features)
y_train = np.random.randint(0, 3, data_size).reshape(-1, 1)  # integer labels, shape (637, 1)

K.clear_session()
model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(
        units=1024, input_shape=(periods_to_train, features), kernel_initializer='he_uniform',
        activation='linear', kernel_constraint=tf.keras.constraints.max_norm(3.), return_sequences=False),
    tf.keras.layers.Dropout(rate=0.5),
    tf.keras.layers.Dense(
        units=1024, kernel_initializer='he_uniform',
        activation='linear', kernel_constraint=tf.keras.constraints.max_norm(3)),
    tf.keras.layers.Dropout(rate=0.5),
    tf.keras.layers.Dense(
        units=1024, kernel_initializer='he_uniform',
        activation='linear', kernel_constraint=tf.keras.constraints.max_norm(3)),
    tf.keras.layers.Dropout(rate=0.5),
    tf.keras.layers.Dense(
        units=periods_to_predict, kernel_initializer='he_uniform',
        activation='softmax')])

# lr and decay are the TF 1.x argument names
optimizer = tf.keras.optimizers.Adamax(lr=0.001, decay=0.1)
model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
model.fit(X_train, y_train, validation_split=0.1, batch_size=64, epochs=1, shuffle=True)
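Note that in the example above the final Dense layer has 3 units, one per class. The "valid range of [0, 1)" message in the question suggests that the output layer there had a single unit, although that is an inference from the error text. The two label formats that pair with the two losses can be sketched in plain NumPy as follows. The variable names are illustrative, and `np.eye` indexing stands in for `to_categorical` here.

```python
import numpy as np

num_classes = 3  # classes 0, 1, 2 from the question

# Integer labels, one per sequence; class 2 happens to be absent in this sample.
y_int = np.array([0, 1, 0, 1, 1, 0])

# Format for sparse_categorical_crossentropy: integers, shape (N,) or (N, 1).
y_sparse = y_int.reshape(-1, 1)

# Format for categorical_crossentropy: one-hot rows, shape (N, num_classes).
# Fixing the width keeps a column for classes missing from this sample.
y_onehot = np.eye(num_classes, dtype=np.float32)[y_int]

# Inferring the width from the largest label present gives only 2 columns here.
y_inferred = np.eye(y_int.max() + 1, dtype=np.float32)[y_int]

print(y_sparse.shape)    # (6, 1)
print(y_onehot.shape)    # (6, 3)
print(y_inferred.shape)  # (6, 2)
```

If the training labels contain only classes 0 and 1, inferring the width would be one possible explanation for the (637, 2) shape reported in the question. Keras' `to_categorical` accepts a `num_classes` argument that fixes the width explicitly.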

Thank you, KrisR89. After reading your post I found this overview. For future reference: [link]