Python: How to modify the seq2seq cost function for padded vectors?


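TensorFlow supports dynamic-length sequences through the "sequence_length" argument used when building the RNN layer: once a sequence passes its "sequence_length", the model stops learning from it and returns zero vectors.

However, how can the cost function be modified to account for the masked sequences, so that the cost and perplexity are computed only over the actual sequence rather than the whole padded sequence?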

def sequence_loss_by_example(logits, targets, weights, average_across_timesteps=True,
                             softmax_loss_function=None, name=None):
    if len(targets) != len(logits) or len(weights) != len(logits):
        raise ValueError("Lengths of logits, weights, and targets must be the same "
                         "%d, %d, %d." % (len(logits), len(weights), len(targets)))
    with ops.op_scope(logits + targets + weights, name,
                      "sequence_loss_by_example"):
        log_perp_list = []
        for logit, target, weight in zip(logits, targets, weights):
            if softmax_loss_function is None:
                # TODO(irving,ebrevdo): This reshape is needed because
                # sequence_loss_by_example is called with scalars sometimes, which
                # violates our general scalar strictness policy.
                target = array_ops.reshape(target, [-1])
                crossent = nn_ops.sparse_softmax_cross_entropy_with_logits(
                    logit, target)
            else:
                crossent = softmax_loss_function(logit, target)
            log_perp_list.append(crossent * weight)
        log_perps = math_ops.add_n(log_perp_list)
        if average_across_timesteps:
            total_size = math_ops.add_n(weights)
            total_size += 1e-12  # Just to avoid division by 0 for all-0 weights.
            log_perps /= total_size
    return log_perps

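This function already supports computing the cost for dynamic sequence lengths through the weights. As long as the weights of the padded targets are 0, the cross-entropy for those steps is forced to 0:

log_perp_list.append(crossent * weight)

The total size will also reflect only the non-padded steps:

total_size = math_ops.add_n(weights)

If you pad with zeros, one way to derive the weights is as follows:

weights = tf.sign(tf.abs(model.targets))

This assumes that 0 is never the id of a real token. (Note that you may need to cast the result: the weights are multiplied by the float cross-entropy values, so they need a compatible float dtype, whereas the targets are integer ids.)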
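To make the masking concrete, here is a minimal pure-Python sketch of the same per-example computation. It is not the TensorFlow implementation: the helper names weights_from_targets and masked_sequence_loss, the PAD_ID constant, and the toy probabilities are all assumptions made for illustration.

```python
import math

PAD_ID = 0  # assumption: id 0 is reserved for padding and never a real token


def weights_from_targets(targets):
    """Return 1.0 for real tokens and 0.0 for padding, like sign(abs(targets))."""
    return [[0.0 if t == PAD_ID else 1.0 for t in row] for row in targets]


def masked_sequence_loss(token_probs, weights, average_across_timesteps=True):
    """Per-example loss from the probability assigned to each correct token.

    token_probs[b][t] is the model probability of the correct token at step t
    of example b; weights[b][t] is 0.0 on padded steps.
    """
    losses = []
    for probs, w_row in zip(token_probs, weights):
        # Cross-entropy per step, zeroed on padded steps by the weight.
        total = sum(-math.log(p) * w for p, w in zip(probs, w_row))
        if average_across_timesteps:
            # Divide by the number of real steps; epsilon guards all-zero rows.
            total /= sum(w_row) + 1e-12
        losses.append(total)
    return losses


targets = [[4, 7, 2, 0, 0],   # 3 real tokens, 2 padded steps
           [5, 3, 9, 6, 1]]   # 5 real tokens
token_probs = [[0.5, 0.5, 0.5, 0.01, 0.01],  # padded-step values are ignored
               [0.5, 0.5, 0.5, 0.5, 0.5]]

weights = weights_from_targets(targets)
losses = masked_sequence_loss(token_probs, weights)
```

Both examples end up with the same average loss even though the first one is padded, because the padded steps contribute neither to the numerator nor to the step count.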

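I did exactly that. However, when I use the PTB model for sequence learning at the sentence level, I get a different cost and perplexity, which is why I was wondering whether anything needs to be customized. Until now, the model learned from all the training data as one continuous sequence. I only changed it to learn sequences at the sentence level, so every sentence is padded with 0s up to the maximum length.

OK, got it! I had missed updating some parameters, such as iter and epoch, that are used in the final calculation. It works fine now. Thanks.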
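The bookkeeping fix mentioned in the last comment might look roughly like the sketch below. It assumes that perplexity is computed as exp(total cost / number of steps counted), so that with padding the count should include only real tokens. The function name masked_perplexity and the batch layout are hypothetical, and this is not the commenter's actual code.

```python
import math


def masked_perplexity(batches):
    """Compute perplexity over real (non-padded) tokens only.

    batches is an iterable of (step_losses, weights) pairs; both are lists of
    rows with one entry per time step, and weights are 0.0 on padded steps.
    """
    total_cost = 0.0
    total_tokens = 0.0
    for step_losses, weights in batches:
        for loss_row, w_row in zip(step_losses, weights):
            # Padded steps are multiplied by 0.0 and add nothing to the cost.
            total_cost += sum(l * w for l, w in zip(loss_row, w_row))
            # Count real tokens instead of adding the full padded length.
            total_tokens += sum(w_row)
    return math.exp(total_cost / max(total_tokens, 1e-12))


ln2 = math.log(2.0)
batch = ([[ln2, ln2, 9.9], [ln2, ln2, ln2]],   # 9.9 sits on a padded step
         [[1.0, 1.0, 0.0], [1.0, 1.0, 1.0]])
ppl = masked_perplexity([batch])
```

Dividing by the full padded length instead would understate the perplexity, since the zeroed steps would be counted as if they had been predicted perfectly.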