TensorFlow 2.0 (Keras) classification with restricted classes


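I have a basic classification problem: assign each row to one of 20 classes.

However, there is a twist. For each row, only some of these 20 classes are valid, and this is known in advance.

In TensorFlow 1.0, I have been nullifying the logits of the impossible classes. The only modification is to the loss function: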

def getLoss(logits, y, restrictions):
    logits = tf.where(restrictions, -1000.0 * tf.ones_like(y), logits)
    return tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=y)

loss = getLoss(logits, y, restrictions)
trainer = tf.train.RMSPropOptimizer(learnRate).minimize(loss)

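Question: I have a working solution in TensorFlow 1.0, which is a simple modification of the loss function. However, I want to rewrite it in TensorFlow 2.0 and Keras.

I assume the class-restriction matrix needs to be passed to model.fit() together with the inputs. How do I do that?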

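A suboptimal idea: a simple solution (also proposed by Frederik) is to concatenate the input and the class-restriction matrix and let the neural network learn the concept of class restrictions from scratch.

However, this is unreliable and makes the neural network unnecessarily larger. Is there a better, simpler way to do this with Keras?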

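But how does this work at inference time? Do you know the class restrictions of a new row at inference time?

If the answer is "yes":

I think you should not pass the whole class-restriction matrix as a separate input, but supply the class-restriction vector of each row by concatenation. So instead of feeding a row with shape (n,), you feed the row plus its class restrictions with shape (n+20,). This way you also do not need to cancel anything out, and the model learns what it should output from the classification loss.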
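The concatenation described here can be sketched in NumPy as follows. The sizes and array names (`rows`, `class_restrictions`, `model_input`) are assumptions chosen for illustration, with n = 30 features and 20 classes.

```python
import numpy as np

n_rows, n_features, n_classes = 4, 30, 20
rng = np.random.default_rng(0)

# Hypothetical data: feature rows and a 0/1 restriction vector per row.
rows = rng.uniform(0, 1, size=(n_rows, n_features))
class_restrictions = rng.integers(0, 2, size=(n_rows, n_classes))

# Append each row's restriction vector to its features: (n,) -> (n + 20,).
model_input = np.concatenate(
    [rows, class_restrictions.astype(rows.dtype)], axis=1
)
print(model_input.shape)  # (4, 50)
```

The network then sees the restrictions only as extra features. Nothing forces it to respect them, which is the reliability concern raised in the question.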
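If the answer is "no":

Then your model does not make much sense. The training data is a set of (row, class restrictions, class it should be) tuples with dimension (nb_row_features + 20 + 20), right? What do you want to train, what is the actual application, and what kind of data is in your rows? If the answer is no, I do not understand what you want.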
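Assuming you have to pass the class-restriction matrix at inference time as well:

You can build the restriction operation on the logits manually in a simple Lambda layer. Then apply softmax to the restricted logits and use the standard cross-entropy loss.

Here is a dummy example in which the class masks/restrictions are given in binary format: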
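import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Activation, Dense, Input, Lambda
from tensorflow.keras.models import Model

n_class = 8
n_sample = 10
X = np.random.uniform(0, 1, (n_sample, 30))
y = np.random.randint(0, n_class, (n_sample,))
# Binary mask: a value of 1 marks a class that is NOT allowed for that row.
mask = np.random.randint(0, 2, (n_sample, n_class))

def mask_logits(logits, mask):
    # Push the logits of restricted classes to a large negative value.
    restrictions = (mask > 0)
    return tf.where(restrictions, -1000.0 * tf.ones_like(logits), logits)

inp_x = Input((X.shape[-1],))
inp_mask = Input((n_class,))
logits = Dense(n_class)(inp_x)
# A Lambda layer receives several tensors as a single list.
out = Lambda(lambda t: mask_logits(t[0], t[1]))([logits, inp_mask])
out = Activation('softmax')(out)
model = Model([inp_x, inp_mask], out)
model.compile('adam', 'sparse_categorical_crossentropy')

model.fit([X, mask], y, epochs=3)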

At inference time, the predictions can be retrieved with:
pred = model.predict([X, mask])

Finally, a few simple sanity checks:
>>> pred.sum(1)
array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], dtype=float32)

The predicted probabilities sum to 1 in each row.
>>> pred == 0
array([[ True,  True, False,  True, False, False,  True, False],
       [ True,  True,  True,  True, False, False,  True,  True],
       [False, False, False,  True, False, False,  True,  True],
       [False, False,  True, False, False,  True,  True, False],
       [False, False,  True, False, False,  True,  True, False],
       [False, False,  True,  True, False, False, False, False],
       [False,  True, False,  True, False, False,  True,  True],
       [ True, False, False, False, False,  True, False,  True],
       [False,  True,  True, False, False, False,  True, False],
       [False,  True,  True, False,  True, False, False, False]])

Some predicted probabilities are exactly 0, as specified by our binary mask.
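The two checks above can also be written as a direct comparison between the zero pattern of the predictions and the mask. The NumPy sketch below reproduces the masking and softmax outside Keras; `masked_softmax` and the toy inputs are hypothetical. It also shows an edge case that is an observation of this sketch, not something stated in the answer: a row whose mask excludes every class.

```python
import numpy as np

def masked_softmax(logits, mask, fill=-1000.0):
    """Softmax after replacing logits where mask > 0 by `fill` (illustrative)."""
    z = np.where(mask > 0, fill, logits)
    z = z - z.max(axis=1, keepdims=True)   # stabilise before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

logits = np.zeros((2, 4))
mask = np.array([[1, 0, 0, 1],    # two classes excluded
                 [1, 1, 1, 1]])   # every class excluded

pred = masked_softmax(logits, mask)

# Zero probabilities coincide with the mask when at least one class is allowed.
print(np.array_equal(pred[0] == 0, mask[0] > 0))  # True
# With all classes excluded the filled logits are all equal, so the
# output is uniform and the mask has no effect.
print(pred[1])
```

With a randomly generated mask such fully excluded rows can occur, so it may be worth filtering them out before comparing `pred == 0` against the mask.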
Your loss function can be implemented in exactly the same way:
def getLoss(logits, y, restrictions):
    logits = tf.where(restrictions, -1000.0 * tf.ones_like(y, dtype=tf.float32), logits)
    return tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y)

The model can then be defined as follows:
# The labels and restrictions are model inputs so the loss can use them.
x_input = Input(shape=(100,))
y_true = Input(shape=(20,))
restrictions = Input(shape=(20,), dtype=tf.bool)
# ... model definition here
y_pred = Dense(20)(x_input)
model = Model([x_input, restrictions, y_true], y_pred)

To compile the model, add the loss as follows:
model.add_loss(getLoss(y_pred, y_true, restrictions))
model.compile(optimizer='rmsprop')

Finally, the model can be trained with its fit method. For example:
x = np.random.random((1000, 100))
restrictions = np.random.binomial(1, p=0.5, size=(1000, 20))
y = np.random.randint(20, size=1000)
y_onehot = np.eye(20)[y]
model.fit((x, restrictions, y_onehot), epochs=10, batch_size=10)
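Because this model outputs raw logits (Dense(20) with no activation) and the mask enters only through the loss, the restrictions still have to be applied to the output at inference time. A minimal NumPy sketch of one way to do that is shown below. It assumes `logits` is the array returned by the model's predict call and that `restrictions` is True for disallowed classes; `restrict_predictions` is a hypothetical helper, not part of Keras.

```python
import numpy as np

def restrict_predictions(logits, restrictions):
    """Return the highest-scoring allowed class per row (hypothetical helper)."""
    # Disallowed classes can never win the argmax.
    masked = np.where(restrictions, -np.inf, logits)
    return masked.argmax(axis=1)

# Stand-in for the model output: 2 rows, 3 classes.
logits = np.array([[3.0, 1.0, 2.0],
                   [0.5, 4.0, 1.5]])
restrictions = np.array([[True, False, False],
                         [False, True, False]])

pred_classes = restrict_predictions(logits, restrictions)
print(pred_classes)  # [2 2]
```

In both rows the unrestricted argmax (class 0 and class 1 respectively) is disallowed, so the prediction falls back to the best remaining class. A row with every class restricted would return index 0, so such rows should be handled separately.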

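Comments:

Could you elaborate on the "class restrictions"? @RokKralj, if you want to rewrite this in TF2, could you show how the code is implemented in TF1?

Added a TF1 code example.

I think this solution will not produce accurate results at inference time. The mask is only involved in the loss computation. Nothing constrains the output probabilities to be 0 rather than greater than 0 where the input mask requires it.

The OP specifically asked about the loss function. Applying the restrictions at inference time is trivial.

Can you elaborate on how to apply the restrictions at inference time?

@RokKralj, can you provide any feedback?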