
How to accumulate gradients in TensorFlow?


I have a similar question.

Because my resources are limited and I am using a deep model (VGG-16) to train a triplet network, I want to accumulate the gradients over a batch of 128 training examples and only then propagate the error and update the weights.


I am not clear on how to do this. I am using TensorFlow, but any implementation/pseudocode is welcome.

Let's walk through the code proposed in one of the answers you linked:

## Optimizer definition - nothing different from any classical example
opt = tf.train.AdamOptimizer()

## Retrieve all trainable variables you defined in your graph
tvs = tf.trainable_variables()
## Creation of a list of variables with the same shape as the trainable ones
# initialized with 0s
accum_vars = [tf.Variable(tf.zeros_like(tv.initialized_value()), trainable=False) for tv in tvs]
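## Op to reset the accumulators to zero before starting a new accumulation cycle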
zero_ops = [tv.assign(tf.zeros_like(tv)) for tv in accum_vars]

## Calls the compute_gradients function of the optimizer to obtain... the list of gradients
gvs = opt.compute_gradients(rmse, tvs)

## Adds to each element from the list you initialized earlier with zeros its gradient (works because accum_vars and gvs are in the same order)
accum_ops = [accum_vars[i].assign_add(gv[0]) for i, gv in enumerate(gvs)]

## Define the training step (part with variable value update)
train_step = opt.apply_gradients([(accum_vars[i], gv[1]) for i, gv in enumerate(gvs)])
The first part basically adds new variables and ops to your graph, which will allow you to:

  • accumulate the gradients with the ops accum_ops in the (list of) variables accum_vars
  • update the model weights with the op train_step

Then, to use it while training, you have to follow these steps (still from the answer you linked); a sketch of that loop is given below.
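This is only a sketch, assuming a TF1 tf.Session named sess, feed placeholders X and y, per-mini-batch data Xs[i] / ys[i], and illustrative counts n_minibatches and num_update_steps (these names are not taken from the code above):

## Training loop (sketch)
for step in range(num_update_steps):
    # Reset the gradient accumulators before each "large" batch
    sess.run(zero_ops)
    # Accumulate the gradients of n_minibatches mini-batches in accum_vars via accum_ops
    for i in range(n_minibatches):
        sess.run(accum_ops, feed_dict={X: Xs[i], y: ys[i]})
    # Apply the accumulated gradients once to update the weights
    sess.run(train_step)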


    Tensorflow 2.0 Compatible Answer: In line with Pop's answer mentioned above and the explanation provided in it, below is the code for accumulating gradients in Tensorflow Version 2.0:

    def train(epochs):
      for epoch in range(epochs):
        for (batch, (images, labels)) in enumerate(dataset):
          with tf.GradientTape() as tape:
            logits = mnist_model(images, training=True)
            tvs = mnist_model.trainable_variables
            accum_vars = [tf.Variable(tf.zeros_like(tv.initialized_value()), trainable=False) for tv in tvs]
            zero_ops = [tv.assign(tf.zeros_like(tv)) for tv in accum_vars]
            loss_value = loss_object(labels, logits)

          loss_history.append(loss_value.numpy().mean())
          grads = tape.gradient(loss_value, tvs)
          accum_ops = [accum_vars[i].assign_add(grad) for i, grad in enumerate(grads)]

        optimizer.apply_gradients(zip(grads, mnist_model.trainable_variables))
        print('Epoch {} finished'.format(epoch))

    # call the above function
    train(epochs = 3)
    

    The complete code can be found here.

    Why not use the answers to the question you linked? @Pop, because I don't understand them. I am looking for something more detailed (beginner level).

    So you put sess.run(train_step) outside the loop. Does that mean the weight update will take place after computing the gradients of the last batch? And if we put it inside the loop, it will happen after each epoch, right?

    Shouldn't it be optimizer.apply_gradients(zip(accum_ops, mnist_model.trainable_variables))? Also, I cannot create a tf.Variable inside a tf.function, any suggestions? I also had problems following this code; I posted a working version in the linked question.