Deep learning - RuntimeError: cudnn RNN backward can only be called in training mode

Tags: deep-learning, pytorch, recurrent-neural-network, cudnn

This is the first time I am seeing this problem; I never ran into an error like this in my previous Python projects. Here is my training code:

def train(net, opt, criterion,ucf_train, batchsize,i):
    opt.zero_grad()
    total_loss = 0
    net=net.eval()
    net=net.train()
    for vid in range(i*batchsize,i*batchsize+batchsize,1):
    
        output=infer(net,ucf_train[vid])
        m=get_label_no(ucf_train[vid])
        m=m.cuda( )
        loss = criterion(output,m)
        loss.backward(retain_graph=True)
        total_loss += loss 
        opt.step()       # updates weights and biases

    return total_loss/n_points

The inference code infer(net, input):

def infer(net, name):
    net.eval()
    hidden_0 = net.init_hidden()
    hidden_1 = net.init_hidden()
    hidden_2 = net.init_hidden()
    video_path = fetch_ucf_video(name)
    cap = cv2.VideoCapture(video_path)
    resize=(224,224)
    T=FrameCapture(video_path)
    print(T)
    lim=T-(T%20)-2
    i=0
    while(1):
      ret, frame2 = cap.read()
      frame2= cv2.resize(frame2, resize)
    #  print(type(frame2))
      if (i%20==0 and i<lim):
          input=normalize(frame2)
          input=input.cuda()
          output,hidden_0,hidden_1, hidden_2  = net(input, hidden_0, hidden_1, hidden_2)
      elif (i>=lim):
          break
      i=i+1
    op=output
    torch.cuda.empty_cache()
    op=op.cuda()
    return op

And this is the error I get:

RuntimeError                              Traceback (most recent call last)
<ipython-input-62-42238f3f6877> in <module>()
----> 1 train(net1,opt,criterion,ucf_train,1,0)

2 frames
/usr/local/lib/python3.6/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
    125     Variable._execution_engine.run_backward(
    126         tensors, grad_tensors, retain_graph, create_graph,
--> 127         allow_unreachable=True)  # allow_unreachable flag
    128 
    129 

RuntimeError: cudnn RNN backward can only be called in training mode

You should remove the net.eval() call from def infer(net, name):. It has to go because you call this infer function inside your training code, and your model needs to stay in training mode throughout the whole training run.

You also never set the model back to training mode after calling eval(), so that is the root of the exception you are getting. If you still want to use this infer code for your test cases, you can cover that case with an if.

Also, the net.eval() right after the total_loss = 0 assignment is useless, since you call net.train() on the very next line; it is neutralized immediately, so you can remove it as well.

Updated code:
def train(net, opt, criterion,ucf_train, batchsize,i):
    opt.zero_grad()
    total_loss = 0
    net=net.train()
    for vid in range(i*batchsize,i*batchsize+batchsize,1):
        output=infer(net,ucf_train[vid])
        m=get_label_no(ucf_train[vid])
        m=m.cuda( )
        loss = criterion(output,m)
        loss.backward(retain_graph=True)
        total_loss += loss 
        opt.step()       # updates weights and biases

    return total_loss/n_points
def infer(net, name, is_train=True):
    if not is_train:
        net.eval()
    hidden_0 = net.init_hidden()
    hidden_1 = net.init_hidden()
    hidden_2 = net.init_hidden()
    video_path = fetch_ucf_video(name)
    cap = cv2.VideoCapture(video_path)
    resize=(224,224)
    T=FrameCapture(video_path)
    print(T)
    lim=T-(T%20)-2
    i=0
    while(1):
      ret, frame2 = cap.read()
      frame2= cv2.resize(frame2, resize)
      #  print(type(frame2))
      if (i%20==0 and i<lim):
          input=normalize(frame2)     
          input=input.cuda()       
          output,hidden_0,hidden_1, hidden_2  = net(input, hidden_0, hidden_1, hidden_2)
      elif (i>=lim):
          break
      i=i+1 
    op=output  
    torch.cuda.empty_cache() 
    op=op.cuda() 
    return op
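
For completeness, here is a minimal sketch of the failure mode outside this project (the nn.LSTM, the tensor sizes and the variable names are my own, not from the post, and it assumes a CUDA GPU with cuDNN): a cuDNN-backed RNN whose forward pass was recorded in eval mode cannot run backward, and redoing the forward pass in training mode makes backward work again.

    # Minimal sketch (not from the original post): reproduce and fix
    # "cudnn RNN backward can only be called in training mode".
    import torch
    import torch.nn as nn

    rnn = nn.LSTM(input_size=8, hidden_size=16).cuda()   # cuDNN path needs a GPU
    x = torch.randn(5, 3, 8, device="cuda")              # (seq_len, batch, input_size)

    rnn.eval()                      # forward pass recorded while the module is in eval mode
    out, _ = rnn(x)
    # out.sum().backward()          # would raise: cudnn RNN backward can only be called in training mode

    rnn.train()                     # put the module back in training mode ...
    out, _ = rnn(x)                 # ... and redo the forward pass in that mode
    out.sum().backward()            # backward now succeeds

If the goal is pure evaluation rather than training, the usual pattern, as far as I know, is to call net.eval() together with torch.no_grad() and never call backward on those outputs; that is essentially the case the is_train flag in the updated infer above is meant to cover.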