Keras: LSTM with categorical data

keras deep-learning

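Is it possible to use an LSTM with an array of words that I have already converted to categories?

For example, I have an array of 1000 words:

"green" "blue" "red" "yellow"

I have encoded these words as green=0, blue=1, red=2, yellow=3.

I want to predict the fourth word. The words can appear in any order. For example, the first sequence could be input = green, blue, red, target = yellow; the next sequence would be input = blue, red, yellow, target = green; and so on.

Maybe I shouldn't use an LSTM, but I think I should, since I want to look at the 3 previous inputs and predict the 4th.

This is what I have so far. I'm more or less stuck on reshaping my word list, and I don't really understand what kind of input I should have. My guess is timesteps=3, features=4.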

# define documents
words = [0,1,2,3,2,3,1,0,0,1,2,3,2,0,3,1,1,2,3,0]

words_cat = to_categorical(words,4)

X_train = ?
y_train = ?

# define the model
model = Sequential()
model.add(LSTM(32, input_shape=(3,4)))
model.add(Dense(4, activation='softmax'))

# compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# summarize the model
print(model.summary())

# fit the model
model.fit(X_train, y_train, epochs=50, verbose=0)

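As already mentioned in the first comment, an LSTM network is probably overkill in this case. But I assume you are doing this for educational purposes.

Here is a working example: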

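import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# define documents
words = [0,1,2,3,2,3,1,0,0,1,2,3,2,0,3,1,1,2,3,0]
# create labels: the word that follows each 3-word window
# NOTE: np.roll wraps around, so the last three labels are words[0:3] instead of
# words[17:20]; np.array(words[3:]) would give the true next-word targets
labels = np.roll(words[:-3], -3)

# 17 sliding windows of 3 words, reshaped to (samples, timesteps=1, features=3)
X_train = np.array([words[i:i+3] for i in range(len(words)-3)]).reshape(-1,1,3)
y_train = labels

# define the model
model = Sequential()
model.add(LSTM(32, input_shape=(None,3)))
model.add(Dense(4, activation='softmax'))

# compile the model (sparse loss, because the labels are integer class ids)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# summarize the model
model.summary()

# fit the model
model.fit(X_train, y_train, epochs=5, batch_size=1, verbose=1)
# predicted class = index of the largest softmax probability
preds = model.predict(X_train).argmax(1)
print(preds)
print(y_train)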

Output:

Epoch 1/5
17/17 [==============================] - 2s 88ms/step - loss: 1.3771 - accuracy: 0.1765
Epoch 2/5
17/17 [==============================] - 0s 9ms/step - loss: 1.3647 - accuracy: 0.3529
Epoch 3/5
17/17 [==============================] - 0s 6ms/step - loss: 1.3568 - accuracy: 0.2353
Epoch 4/5
17/17 [==============================] - 0s 8ms/step - loss: 1.3496 - accuracy: 0.2353
Epoch 5/5
17/17 [==============================] - 0s 7ms/step - loss: 1.3420 - accuracy: 0.4118
[1 2 1 2 0 0 0 1 1 2 1 0 2 1 1 1 2]
[3 2 3 1 0 0 1 2 3 2 0 3 1 1 0 1 2]

So I took the words you provided and reshaped them. The first three entries are the series to train on, and the fourth entry is the label.
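For the layout guessed in the question (timesteps=3, features=4), the same windows could also be one-hot encoded. The following is only a rough NumPy sketch, not part of the code above; `window`, `num_classes` and `X_windows` are names made up for illustration. It takes the labels as `words[3:]`, so every window is paired with the word that actually follows it, whereas `np.roll` on the truncated list wraps the last three labels around to the start.

```python
import numpy as np

words = [0, 1, 2, 3, 2, 3, 1, 0, 0, 1, 2, 3, 2, 0, 3, 1, 1, 2, 3, 0]
window = 3       # number of earlier words used as input (illustrative name)
num_classes = 4  # green, blue, red, yellow (illustrative name)

# Each row of X_windows holds 3 consecutive words; y_train is the word after them.
X_windows = np.array([words[i:i + window] for i in range(len(words) - window)])
y_train = np.array(words[window:])

# One-hot encode every word by indexing an identity matrix:
# (samples, 3) -> (samples, 3, 4)
X_train = np.eye(num_classes, dtype="float32")[X_windows]

print(X_train.shape, y_train.shape)  # (17, 3, 4) (17,)
```

With this layout the question's `LSTM(32, input_shape=(3, 4))` should apply unchanged, and `sparse_categorical_crossentropy` can take the integer labels in `y_train` directly.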

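If the sequence is random, the model will have a hard time predicting the next value. Otherwise, you may need to train for longer or provide more examples (although in this case the number of combinations is quite limited).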
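To get a feel for how limited the combinations are, one could count the distinct 3-word contexts and see how well a plain lookup table would do. This is just an illustrative sketch in standard-library Python; the `followers` table and the other variable names are invented here, and the "ceiling" only describes the training data, not how well anything generalizes.

```python
from collections import Counter, defaultdict

words = [0, 1, 2, 3, 2, 3, 1, 0, 0, 1, 2, 3, 2, 0, 3, 1, 1, 2, 3, 0]
window = 3

# Map each 3-word context to a count of the words that followed it.
followers = defaultdict(Counter)
for i in range(len(words) - window):
    context = tuple(words[i:i + window])
    followers[context][words[i + window]] += 1

possible_contexts = 4 ** window  # 4 words, 3 positions
seen_contexts = len(followers)

# Always predicting the most frequent follower of a context is the best that
# any model seeing only those 3 words can do on this training data.
best_correct = sum(max(counts.values()) for counts in followers.values())
total = len(words) - window

print(f"{seen_contexts} of {possible_contexts} possible contexts observed")
print(f"lookup-table ceiling: {best_correct}/{total} training windows")
```

On the 20-word example this reports 13 of 64 contexts and a ceiling of 15/17, because two contexts are followed by different words on different occasions.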

I would suggest repacking the sequence into sets of 4 items (3 inputs + 1 target) and running a multinomial logistic regression, using a NN (with one-hot encoding) or any other simpler tool. An LSTM is overkill here.
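A minimal sketch of that idea, assuming plain NumPy instead of Keras: the three input words are one-hot encoded, flattened to 12 features, and fed to a softmax (multinomial logistic) regression trained by gradient descent. This amounts to a single `Dense(4, activation='softmax')` layer on the flattened input. The learning rate, step count and all variable names are arbitrary choices made for illustration.

```python
import numpy as np

words = [0, 1, 2, 3, 2, 3, 1, 0, 0, 1, 2, 3, 2, 0, 3, 1, 1, 2, 3, 0]
window, num_classes = 3, 4

# Repack into (3 inputs, 1 target) items.
X_windows = np.array([words[i:i + window] for i in range(len(words) - window)])
y = np.array(words[window:])

# One-hot encode and flatten the inputs: (17, 3) -> (17, 3, 4) -> (17, 12)
X = np.eye(num_classes)[X_windows].reshape(len(X_windows), -1)
Y = np.eye(num_classes)[y]  # one-hot targets, shape (17, 4)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, targets):
    return -np.mean(np.sum(targets * np.log(probs + 1e-12), axis=1))

W = np.zeros((X.shape[1], num_classes))
b = np.zeros(num_classes)
learning_rate = 0.5  # arbitrary illustrative value

initial_loss = cross_entropy(softmax(X @ W + b), Y)
for _ in range(500):  # arbitrary number of gradient steps
    probs = softmax(X @ W + b)
    grad = (probs - Y) / len(X)  # gradient of the mean cross-entropy w.r.t. the logits
    W -= learning_rate * (X.T @ grad)
    b -= learning_rate * grad.sum(axis=0)

probs = softmax(X @ W + b)
final_loss = cross_entropy(probs, Y)
preds = probs.argmax(axis=1)

print(f"loss {initial_loss:.4f} -> {final_loss:.4f}")
print("train accuracy:", (preds == y).mean())
```

With only 17 training items this mostly shows the mechanics; on the full 1000-word array the same code would apply with `words` swapped out.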