Python TensorFlow: creating y indices from class labels


python numpy tensorflow keras deep-learning


I have the following class labels:

y = ["class1", "class2", "class3"]

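To use them in a model, I would like to convert these classes into y indices such as 1, 2 using methods from Keras and/or TensorFlow 2.0.

What I am currently doing is: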

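import tensorflow as tf

# Fit a text tokenizer on the labels, then read back the integer it assigned to each one.
tokenizer = tf.keras.preprocessing.text.Tokenizer()
tokenizer.fit_on_texts(y)
y_train = tokenizer.texts_to_sequences(y)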

I know the tokenizer is being misused here. Is there a better, more compact way to convert class labels to indices?

Thanks.

You can't use the tokenizer for this, because tokenizer indices start at 1, not 0. You can use tf.where instead:

import tensorflow as tf

y = ['class3', 'class1', 'class1', 'class2', 'class3', 'class1', 'class2']

names = ["class1", "class2", "class3"]

labeler = lambda x: tf.where(tf.equal(x, names))

dataset = tf.data.Dataset.from_tensor_slices(y).map(labeler)

next(iter(dataset))
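For a small label list that already sits in memory, the same 0-based mapping can be sketched in plain Python. This is only an illustration, not part of the answer's tf.data pipeline; `name_to_index` and `y_indices` are names made up for this sketch.

```python
# Class names in the order that defines the indices (copied from the answer above).
names = ["class1", "class2", "class3"]
y = ['class3', 'class1', 'class1', 'class2', 'class3', 'class1', 'class2']

# Map each class name to its 0-based position in `names`.
name_to_index = {name: index for index, name in enumerate(names)}

# Look up every label; an unknown label raises KeyError instead of passing silently.
y_indices = [name_to_index[label] for label in y]

print(y_indices)  # [2, 0, 0, 1, 2, 0, 1]
```

The resulting list can then be handed to tf.data.Dataset.from_tensor_slices in the same way as the string labels above.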

As mentioned, your implementation produces indices that start at 1:

[[2], [1], [1], [3], [2], [1], [3]]

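This breaks Keras when it computes the loss and metrics: it returns nan, because the final layer has three neurons (indices 0 to 2) while the targets run from the second index to the fourth (1 to 3). tl;dr: don't use 1-based indices in Keras.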
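If the tokenizer output has to be kept, one possible workaround is to shift it by one and check the range before training. The sketch below is an assumption-laden illustration in plain Python: `sequences` is the 1-based output shown above, and `num_classes` is a hypothetical name for the size of the final layer.

```python
# Tokenizer-style output copied from the answer above: 1-based, one list per label.
sequences = [[2], [1], [1], [3], [2], [1], [3]]
num_classes = 3  # assumed size of the final layer

# Flatten each single-element list and shift so labels cover 0 .. num_classes - 1.
y_train = [seq[0] - 1 for seq in sequences]

# Every target must be a valid index into the output layer.
assert all(0 <= label < num_classes for label in y_train)

print(y_train)  # [1, 0, 0, 2, 1, 0, 2]
```

Note that these indices follow whatever order the tokenizer assigned, which need not match the order of `names`.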

In my small test case I used 4 output neurons. Still, this is very useful information. Thanks for your answer!

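from sklearn.preprocessing import LabelEncoder

y = ['class3', 'class1', 'class1', 'class2', 'class3', 'class1', 'class2']

le = LabelEncoder()

# Classes are sorted and numbered from 0.
le.fit_transform(y)

array([2, 0, 0, 1, 2, 0, 1], dtype=int64)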
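If scikit-learn is not available, a similar result can be sketched with NumPy alone. This is a swapped-in technique rather than the answer's method; `classes` and `y_indices` are illustrative names.

```python
import numpy as np

y = ['class3', 'class1', 'class1', 'class2', 'class3', 'class1', 'class2']

# np.unique returns the sorted distinct labels and, with return_inverse=True,
# the position of each original element within that sorted array.
classes, y_indices = np.unique(y, return_inverse=True)

print(classes.tolist())    # ['class1', 'class2', 'class3']
print(y_indices.tolist())  # [2, 0, 0, 1, 2, 0, 1]

# Indexing the class array with the indices recovers the original labels.
assert classes[y_indices].tolist() == y
```

Keeping `classes` around makes it possible to map model predictions back to label names later.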
