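Does Caffe also compute gradients during the backward pass for layers whose learning rate is zero (lr_mult=0)?

I recently trained a Faster R-CNN model using D-X-Y's C++ implementation. To save training time, I froze the lower (shared) convolutional layers by setting lr_mult=0. I compared the per-iteration time with and without the frozen layers and found no significant difference. In Caffe, are gradients still computed for layers with lr_mult=0?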

performance machine-learning computer-vision

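I'm not 100% sure, but AFAIK Caffe computes gradients even with lr_mult: 0, because the gradient may be needed elsewhere.
Have you tried setting propagate_down to false to stop the gradient from propagating? From caffe.proto: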

  // Specifies whether to backpropagate to each bottom. If unspecified,
  // Caffe will automatically infer whether each input needs backpropagation
  // to compute parameter gradients. If set to true for some inputs,
  // backpropagation to those inputs is forced; if set false for some inputs,
  // backpropagation to those inputs is skipped.
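
As a rough illustration of that suggestion, a layer definition in the train prototxt might look like the sketch below. The layer and blob names (conv3_1, pool2) and the convolution settings are hypothetical placeholders, not taken from the D-X-Y model. The idea is to put propagate_down: false on the first layer you still want to train, so that no gradient is sent into the frozen layers beneath it. Whether this actually shortens the iteration time would need to be measured.

```
layer {
  name: "conv3_1"            # hypothetical: lowest layer that should still be trained
  type: "Convolution"
  bottom: "pool2"            # hypothetical: output of the frozen lower layers
  top: "conv3_1"
  propagate_down: false      # one entry per bottom; skip backprop into "pool2"
  param { lr_mult: 1 }       # weights are still updated
  param { lr_mult: 2 }       # bias is still updated
  convolution_param {
    num_output: 256          # placeholder values
    kernel_size: 3
    pad: 1
  }
}
```

A layer with several bottoms takes one propagate_down entry per bottom, in the same order as the bottom fields.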