PyTorch model training without using forward


python deep-learning pytorch


I'm working on training a CLIP model. Here is the source code of the model.

The CLIP object is basically constructed like this:

import torch.nn as nn

class CLIP(nn.Module):  # note the capital M: nn.module does not exist
    ...
    def encode_image(self, image):
        return self.visual(image.type(self.dtype))

    def encode_text(self, text):
        x = ...
        ...
        return x

    def forward(self, image, text):
        image_features = self.encode_image(image)
        text_features = self.encode_text(text)
        ...
        return logits_per_image, logits_per_text

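The forward method only handles image-text pairs. Since I want to repurpose CLIP for another task (text-text pairs), I don't use CLIP's forward; instead I call the other methods defined in CLIP. My training code looks like this: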

for k in range(epoch):
    for batch in dataloader:
        x, y = batch
        y1 = model.encode_text(x[first_text_part])
        y2 = model.encode_text(x[second_text_part])
        # <calculate loss, backward, step, etc.>


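The problem is that after one epoch all the gradients become nan, even though the loss is not nan.
I suspect that PyTorch can only propagate gradients through the forward method.
Some sources say that forward is not that special, while others say that code written with torch must use forward.

The question is: can we train a PyTorch network without using the forward method?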

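forward() in PyTorch is nothing special. When called, it simply adds the operations it performs to the network's computation graph. Backpropagation does not really depend on forward(), because gradients are propagated through that graph, and any method that runs tensor operations on the module's parameters builds the graph in the same way.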
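As a rough illustration of the point above, here is a minimal sketch that runs one training step through encode_text only, never calling forward(). TwoTower is a hypothetical toy model invented for this example, not CLIP, and the loss is an arbitrary placeholder.

```python
import torch
import torch.nn as nn


class TwoTower(nn.Module):
    """Hypothetical toy stand-in for a CLIP-like model (not the real CLIP)."""

    def __init__(self):
        super().__init__()
        self.text_proj = nn.Linear(8, 4)
        self.image_proj = nn.Linear(16, 4)

    def encode_text(self, text):
        return self.text_proj(text)

    def encode_image(self, image):
        return self.image_proj(image)

    def forward(self, image, text):
        return self.encode_image(image) @ self.encode_text(text).t()


torch.manual_seed(0)
model = TwoTower()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Two batches of fake "text" features (assumed shapes, for illustration only).
first_text = torch.randn(5, 8)
second_text = torch.randn(5, 8)

# forward() is never called; only encode_text is used.
y1 = model.encode_text(first_text)
y2 = model.encode_text(second_text)
loss = (y1 - y2).pow(2).mean()  # placeholder loss

weights_before = model.text_proj.weight.detach().clone()
optimizer.zero_grad()
loss.backward()

# Parameters used by encode_text receive gradients through the graph.
text_grad = model.text_proj.weight.grad
# Parameters that never took part in the computation have no gradient.
image_grad = model.image_proj.weight.grad

optimizer.step()
weights_changed = not torch.equal(weights_before, model.text_proj.weight.detach())
```

In this sketch the text projection gets a finite gradient and is updated, while the unused image projection keeps a gradient of None. That suggests nan gradients in a setup like the one in the question come from something other than bypassing forward().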

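The only difference is that, in the PyTorch source, forward() is invoked through the module's __call__() method, which also runs all the hooks registered on the nn.Module. Calling another method such as encode_text directly skips the hooks registered on that top-level module.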
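The hook behaviour can be checked with a small sketch. Encoder is a hypothetical module made up for this example. It registers one forward hook on the top-level module and one on its inner nn.Linear, then compares a direct encode_text call with a regular model(x) call.

```python
import torch
import torch.nn as nn


class Encoder(nn.Module):
    """Hypothetical module used only to observe when forward hooks fire."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(3, 2)

    def encode_text(self, x):
        # self.linear(x) goes through nn.Linear.__call__, so its hooks still run.
        return self.linear(x)

    def forward(self, x):
        return self.encode_text(x)


calls = []
model = Encoder()
# Hooks return None, so they only record the call and leave outputs unchanged.
model.register_forward_hook(lambda module, inputs, output: calls.append("model"))
model.linear.register_forward_hook(lambda module, inputs, output: calls.append("linear"))

x = torch.randn(4, 3)

model.encode_text(x)          # bypasses Encoder.__call__
after_encode = list(calls)    # only the inner hook has fired

calls.clear()
model(x)                      # goes through Encoder.__call__ and then forward()
after_call = list(calls)      # inner hook first, then the top-level hook
```

So if nothing relies on hooks attached to the outer module, calling encode_text directly should behave the same as going through forward() as far as autograd is concerned.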