Python: error when using categorical cross-entropy


I am learning deep learning with TensorFlow. I made a simple NLP model that predicts the next word in a given sentence:

model = tf.keras.Sequential()
model.add(Embedding(num,64,input_length = max_len-1))   # we subtract 1 because we cropped the last word from X in our data
model.add(Bidirectional(LSTM(32)))
model.add(Dense(num,activation = 'softmax'))


model.compile(optimizer = 'adam',loss = 'categorical_crossentropy',metrics = ['accuracy'])

history = model.fit(X,Y,epochs = 500)
However, using categorical cross-entropy gives me the following error:

ValueError: You are passing a target array of shape (453, 1) while using as loss `categorical_crossentropy`. `categorical_crossentropy` expects targets to be binary matrices (1s and 0s) of shape (samples, classes). If your targets are integer classes, you can convert them to the expected format via:
```
from keras.utils import to_categorical
y_binary = to_categorical(y_int)
```

Alternatively, you can use the loss function `sparse_categorical_crossentropy` instead, which does expect integer targets.
Can someone explain what this means and why I can't use the categorical cross-entropy loss function? Thank you very much!
Any help would be appreciated.

Categorical cross-entropy is used for multi-class classification problems. When you use 'softmax' as the activation, the output layer has one node per class. For each sample, the node corresponding to that sample's class should be close to 1 and the remaining nodes close to 0. Therefore, the true class labels Y need to be one-hot encoded vectors.
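
To make the expected format concrete, here is a minimal sketch (the labels 0, 2, 1 are chosen purely for illustration) of what one-hot encoding with to_categorical produces:

from keras.utils import to_categorical

y_int = [0, 2, 1]                                 # integer class labels
y_one_hot = to_categorical(y_int, num_classes=3)  # one row per sample, one column per class
print(y_one_hot)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]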

Assuming the class labels in Y are integers such as 0, 1, 2, ..., try the code below:

from keras.utils import to_categorical

model = tf.keras.Sequential()
model.add(Embedding(num,64,input_length = max_len-1))   # we subtract 1 because we cropped the last word from X in our data
model.add(Bidirectional(LSTM(32)))
model.add(Dense(num,activation = 'softmax'))


model.compile(optimizer = 'adam',loss = 'categorical_crossentropy',metrics = ['accuracy'])

Y_one_hot = to_categorical(Y)                   # convert Y into a one-hot matrix
history = model.fit(X, Y_one_hot, epochs = 500) # fit on the one-hot targets instead of Y
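
Alternatively, as the error message itself suggests, you can keep Y as integer class indices and switch the loss to sparse_categorical_crossentropy. A minimal sketch of that variant (same model as above):

model = tf.keras.Sequential()
model.add(Embedding(num, 64, input_length = max_len-1))
model.add(Bidirectional(LSTM(32)))
model.add(Dense(num, activation = 'softmax'))

# sparse_categorical_crossentropy accepts integer class indices directly,
# so no one-hot conversion of Y is needed
model.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])

history = model.fit(X, Y, epochs = 500)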