Python: How to write a contextmanager that throws and catches errors

python pytorch

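I want to catch the runtime error `CUDA out of memory` in several places in my code, so that I can rerun the whole training workflow with a smaller batch size. What is the best way to do this?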

This is what I am currently doing:

try:
    result = model(input)
except RuntimeError as e:
    # If the GPU runs out of memory, signal the caller to start the
    # experiment again with a smaller batch size.
    if str(e).startswith('CUDA out of memory.') and batch_size > 10:
        raise CudaOutOfMemory(e) from e
    else:
        raise  # re-raise any other RuntimeError unchanged

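I then catch the `CudaOutOfMemory` error outside my main function.

However, this is a fairly long piece of code, and I need to repeat it many times. Is there a way to write a context manager for it?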

That way I could simply run:

with catch_cuda_out_of_mem_error:
  result = model(input)

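Edit:

I want a context manager rather than a function because the function I need to wrap in the try/except is not always the same. My workflow has many functions that use a lot of GPU memory, and I want to catch this error in any of them.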

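Context managers are about acquiring and releasing resources correctly. Here you are not actually acquiring or releasing any resource, so I don't think a context manager is the right fit. How about just using a function?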

def try_compute_model(input):
    try:
        return model(input)
    except RuntimeError as e:
        # If the GPU runs out of memory, signal the caller to start the
        # experiment again with a smaller batch size.
        if str(e).startswith('CUDA out of memory.') and batch_size > 10:
            raise CudaOutOfMemory(e) from e
        else:
            raise  # re-raise any other RuntimeError unchanged

Then use it like this:

result = try_compute_model(input)
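Since the question's edit notes that the wrapped function is not always the same, the function approach above can be generalized with a decorator. The following is only a minimal sketch: `raises_cuda_out_of_memory` and `fake_forward` are hypothetical names that do not appear in the original posts, and `fake_forward` merely simulates the error message so the example runs without a GPU.

```python
import functools


class CudaOutOfMemory(Exception):
    """Raised in place of a RuntimeError that reports a CUDA OOM condition."""


def raises_cuda_out_of_memory(func):
    """Wrap any function so CUDA OOM RuntimeErrors become CudaOutOfMemory."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except RuntimeError as e:
            if str(e).startswith('CUDA out of memory.'):
                raise CudaOutOfMemory(str(e)) from e
            raise  # any other RuntimeError propagates unchanged
    return wrapper


# Hypothetical stand-in for a GPU-heavy function; it only simulates the error.
@raises_cuda_out_of_memory
def fake_forward(batch_size):
    if batch_size > 10:
        raise RuntimeError('CUDA out of memory. (simulated)')
    return batch_size * 2
```

The decorator has to be applied to each function once, whereas a context manager can wrap an arbitrary block of statements, which is the trade-off behind the question.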

Inspired by this post, I found an answer to my question:

import torch
from contextlib import contextmanager


class CudaOutOfMemory(Exception):
    pass


@contextmanager
def catching_cuda_out_of_memory():
    """
    Context that throws CudaOutOfMemory error if GPU is out of memory.
    """
    try:
        yield
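    except RuntimeError as e:
        # Translate only the CUDA OOM case; keep the original as the cause.
        if str(e).startswith('CUDA out of memory.'):
            raise CudaOutOfMemory(e) from e
        else:
            raise  # re-raise any other RuntimeError unchanged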


def oom():
    x = torch.randn(100, 10000, device=1)
    for _ in range(100):
        l = torch.nn.Linear(10000, 10000)
        l.to(1)
        x = l(x)


try:
    with catching_cuda_out_of_memory():
        oom()
except CudaOutOfMemory:
    print('GOTCHA!')
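To connect this back to the stated goal of rerunning the workflow with a smaller batch size, here is a rough sketch of a retry loop built on the same context manager. `run_with_shrinking_batch`, `min_batch_size`, and `fake_workflow` are assumptions made for illustration, and the out-of-memory condition is simulated so the example runs on a machine without a GPU.

```python
from contextlib import contextmanager


class CudaOutOfMemory(Exception):
    pass


@contextmanager
def catching_cuda_out_of_memory():
    """Translate CUDA OOM RuntimeErrors into CudaOutOfMemory."""
    try:
        yield
    except RuntimeError as e:
        if str(e).startswith('CUDA out of memory.'):
            raise CudaOutOfMemory(str(e)) from e
        raise


def run_with_shrinking_batch(workflow, batch_size, min_batch_size=1):
    """Call workflow(batch_size), halving batch_size after each CUDA OOM."""
    while True:
        try:
            with catching_cuda_out_of_memory():
                return workflow(batch_size), batch_size
        except CudaOutOfMemory:
            if batch_size // 2 < min_batch_size:
                raise  # cannot shrink any further, so give up
            batch_size //= 2


def fake_workflow(batch_size):
    # Simulated: pretend anything above 16 samples does not fit on the GPU.
    if batch_size > 16:
        raise RuntimeError('CUDA out of memory. (simulated)')
    return f'trained with batch_size={batch_size}'


result, used_batch_size = run_with_shrinking_batch(fake_workflow, batch_size=128)
print(result)
```

Halving is just one possible schedule. Recent PyTorch releases also expose a dedicated `torch.cuda.OutOfMemoryError`, which may be a more robust check than matching the message prefix; verify this against your installed version before relying on it.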

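Python developers are not that strict about this. For example, the `contextlib.suppress` manager does not acquire or release any resource.

Hi @mCoding, thanks for your reply! I want to create a contextmanager rather than a function because the function I need to wrap in the try/except is not always the same. My workflow has many functions that use a lot of GPU memory, and I want to catch this error in any of them. Sorry if my question was misleading in that respect.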
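As the remark about `contextlib.suppress` suggests, a context manager can exist purely to handle exceptions. The sketch below shows `suppress` next to a class-based variant of the translator. The class name `_CatchCudaOutOfMemory` is an assumption made for illustration. Because a single stateless instance is created up front, the parenthesis-free `with catch_cuda_out_of_mem_error:` spelling from the question works as written.

```python
from contextlib import suppress


class CudaOutOfMemory(Exception):
    pass


class _CatchCudaOutOfMemory:
    """Context manager that only translates exceptions; it holds no resource."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        is_oom = isinstance(exc, RuntimeError) and str(exc).startswith(
            'CUDA out of memory.')
        if is_oom:
            raise CudaOutOfMemory(str(exc)) from exc
        return False  # do not swallow any other exception


# One stateless instance, so the `with` line needs no parentheses.
catch_cuda_out_of_mem_error = _CatchCudaOutOfMemory()

# contextlib.suppress is a standard-library manager that also owns no resource.
with suppress(ZeroDivisionError):
    1 / 0  # silently ignored

try:
    with catch_cuda_out_of_mem_error:
        # Simulated error so the example runs without a GPU.
        raise RuntimeError('CUDA out of memory. (simulated)')
except CudaOutOfMemory as e:
    caught_message = str(e)
```

Sharing one instance is safe here only because the class keeps no per-use state.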