How to run TensorFlow on the GPU

I have the problem that my Jupyter notebook is not running on the GPU. I updated my drivers (Nvidia GTX 1660 Ti), installed CUDA 11, copied the cuDNN files into the folders, and added the correct paths to the environment variables. After that, I added a new environment with a GPU kernel to Anaconda and installed tensorflow-gpu (version 2.4, since CUDA 11 requires version >= 2.4.0), as described in this article.
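As a side check of the version mix (just a sketch, assuming a TensorFlow build that provides tf.sysconfig.get_build_info(), i.e. >= 2.3), the CUDA and cuDNN versions the installed package was compiled against can be printed directly:

import tensorflow as tf

# Build metadata of the installed TensorFlow package.
build = tf.sysconfig.get_build_info()
print(build.get('cuda_version'))   # CUDA version the wheel was built against
print(build.get('cudnn_version'))  # cuDNN version the wheel was built against
print(build.get('is_cuda_build'))  # True for a GPU-enabled build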

After that, I opened the Jupyter notebook with the new kernel. My code runs up to a certain step, but the GPU utilization shown in Task Manager stays below 1% while RAM is at 60%-99%. So I assume my code is not running on the GPU. I did some tests:

import tensorflow.keras
import tensorflow as tf

print(tf.__version__)
print(tensorflow.keras.__version__)

print(tf.test.is_built_with_cuda())
print(tf.config.list_physical_devices('GPU'))
print(tf.test.is_gpu_available())
which gives (and I think this looks correct):

The next test was:

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
which gives:

[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 9334837591848971536
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 4837251481
locality {
  bus_id: 1
  links {
  }
}
incarnation: 2660164806064353779
physical_device_desc: "device: 0, name: GeForce GTX 1660 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5"
]
So both the CPU and the GPU are available in the kernel, aren't they?

What can I do to make my neural network run on the GPU instead of the CPU?
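One way I know of to double-check where the operations actually land (a minimal sketch, not something from the article above) is to turn on device placement logging and run a small op that is explicitly pinned to the GPU:

import tensorflow as tf

# Print the device every operation is placed on.
tf.debugging.set_log_device_placement(True)

# A small matrix multiplication explicitly pinned to the first GPU.
with tf.device('/GPU:0'):
    a = tf.random.uniform((1000, 1000))
    b = tf.random.uniform((1000, 1000))
    c = tf.matmul(a, b)

print(c.device)  # should end with 'GPU:0' if the GPU is really used

If this already fails or falls back to the CPU, the problem would be the setup rather than the model.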

My code works until I try to train my neural network. This is the code and the error that occurs:

model.fit([np.asarray(X_train).astype(np.float32), np.asarray(X_train_zusatz).astype(np.float32)], 
          y_train, epochs=10, batch_size=10)
It is a concatenated neural network with two inputs, in case you are wondering, and it works fine with the normal tensorflow package (not tensorflow-gpu). But training takes a very long time there.

Epoch 1/10
---------------------------------------------------------------------------
ResourceExhaustedError                    Traceback (most recent call last)
<ipython-input-27-10813edc74c8> in <module>
      3 
      4 model.fit([np.asarray(X_train).astype(np.float32), np.asarray(X_train_zusatz).astype(np.float32)], 
----> 5           y_train, epochs=10, batch_size=10)#, 
      6           #validation_data=[[X_test, X_test_zusatz], y_test], class_weight=class_weight)

~\.conda\envs\tf-gpu\lib\site-packages\tensorflow\python\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
       1098                 _r=1):
       1099               callbacks.on_train_batch_begin(step)
    -> 1100               tmp_logs = self.train_function(iterator)
       1101               if data_handler.should_sync:
       1102                 context.async_wait()
    
    ~\.conda\envs\tf-gpu\lib\site-packages\tensorflow\python\eager\def_function.py in __call__(self, *args, **kwds)
        826     tracing_count = self.experimental_get_tracing_count()
        827     with trace.Trace(self._name) as tm:
    --> 828       result = self._call(*args, **kwds)
        829       compiler = "xla" if self._experimental_compile else "nonXla"
        830       new_tracing_count = self.experimental_get_tracing_count()
    
    ~\.conda\envs\tf-gpu\lib\site-packages\tensorflow\python\eager\def_function.py in _call(self, *args, **kwds)
        886         # Lifting succeeded, so variables are initialized and we can run the
        887         # stateless function.
    --> 888         return self._stateless_fn(*args, **kwds)
        889     else:
        890       _, _, _, filtered_flat_args = \
    
    ~\.conda\envs\tf-gpu\lib\site-packages\tensorflow\python\eager\function.py in __call__(self, *args, **kwargs)
       2941        filtered_flat_args) = self._maybe_define_function(args, kwargs)
       2942     return graph_function._call_flat(
    -> 2943         filtered_flat_args, captured_inputs=graph_function.captured_inputs)  # pylint: disable=protected-access
       2944 
       2945   @property
    
    ~\.conda\envs\tf-gpu\lib\site-packages\tensorflow\python\eager\function.py in _call_flat(self, args, captured_inputs, cancellation_manager)
       1917       # No tape is watching; skip to running the function.
       1918       return self._build_call_outputs(self._inference_function.call(
    -> 1919           ctx, args, cancellation_manager=cancellation_manager))
       1920     forward_backward = self._select_forward_and_backward_functions(
       1921         args,
    
    ~\.conda\envs\tf-gpu\lib\site-packages\tensorflow\python\eager\function.py in call(self, ctx, args, cancellation_manager)
        558               inputs=args,
        559               attrs=attrs,
    --> 560               ctx=ctx)
        561         else:
        562           outputs = execute.execute_with_cancellation(
    
    ~\.conda\envs\tf-gpu\lib\site-packages\tensorflow\python\eager\execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
         58     ctx.ensure_initialized()
         59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
    ---> 60                                         inputs, attrs, num_outputs)
         61   except core._NotOkStatusException as e:
         62     if name is not None:
    
    ResourceExhaustedError: 2 root error(s) found.
      (0) Resource exhausted:  OOM when allocating tensor with shape[300,300] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[{{node model/lstm/while/body/_1/model/lstm/while/lstm_cell/split}}]]
    Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
    
         [[gradient_tape/model/embedding/embedding_lookup/Reshape/_74]]
    Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
    
      (1) Resource exhausted:  OOM when allocating tensor with shape[300,300] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[{{node model/lstm/while/body/_1/model/lstm/while/lstm_cell/split}}]]
    Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
    
    0 successful operations.
    0 derived errors ignored. [Op:__inference_train_function_4691]
    
    Function call stack:
    train_function -> train_function
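Judging from the ResourceExhaustedError above, the GPU apparently is being used, but its roughly 4.8 GB of memory is not enough for this batch. Two things that could be tried here (a sketch only, I have not verified that either of them fixes it): enabling memory growth so TensorFlow does not pre-allocate all GPU memory, and using a smaller batch size.

import tensorflow as tf

# Allocate GPU memory on demand instead of grabbing it all at start-up.
# This has to run before the GPU is first used in the process.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)

# A smaller batch size also lowers the per-step memory footprint
# (the value 4 here is just a guess, not a recommendation):
# model.fit([np.asarray(X_train).astype(np.float32),
#            np.asarray(X_train_zusatz).astype(np.float32)],
#           y_train, epochs=10, batch_size=4)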