Python: OutOfRangeError when training a StyleGAN2 network

Tags: python, tensorflow, machine-learning, stylegan

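I have run into a lot of problems while training my first network with the repo. Because I only have 11 GB of VRAM available, I changed to a smaller per-GPU batch size of 2. After that change, training completes 1-4 ticks successfully before failing with these OutOfRange errors.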

Ryzen 3950x
RTX 2080ti
32GB DDR4 RAM
Windows 10
Tensorflow gpu 1.4

Building TensorFlow graph...
Initializing logs...
Training for 25000 kimg...

tick 0     kimg 10065.1  lod 0.00  minibatch 32   time 1m 17s       sec/tick 77.4    sec/kimg 605.07  maintenance 0.0    gpumem 8.6
Traceback (most recent call last):
  File "C:\Users\TE 1\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1356, in _do_call
    return fn(*args)
  File "C:\Users\TE 1\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "C:\Users\TE 1\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.
  (0) Out of range: End of sequence
     [[{{node GPU0/DataFetch/IteratorGetNext}}]]
     [[GPU0/DataFetch/UpscaleLOD/Cast/_5109]]
  (1) Out of range: End of sequence
     [[{{node GPU0/DataFetch/IteratorGetNext}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_training.py", line 202, in <module>
    main()
  File "run_training.py", line 197, in main
    run(**vars(args))
  File "run_training.py", line 128, in run
    dnnlib.submit_run(**kwargs)
  File "C:\ML\stylegan2dv\dnnlib\submission\submit.py", line 343, in submit_run
    return farm.submit(submit_config, host_run_dir)
  File "C:\ML\stylegan2dv\dnnlib\submission\internal\local.py", line 22, in submit
    return run_wrapper(submit_config)
  File "C:\ML\stylegan2dv\dnnlib\submission\submit.py", line 280, in run_wrapper
    run_func_obj(**submit_config.run_func_kwargs)
  File "C:\ML\stylegan2dv\training\training_loop.py", line 308, in training_loop
    tflib.run(data_fetch_op, feed_dict)
  File "C:\ML\stylegan2dv\dnnlib\tflib\tfutil.py", line 31, in run
    return tf.get_default_session().run(*args, **kwargs)
  File "C:\Users\TE 1\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 950, in run
    run_metadata_ptr)
  File "C:\Users\TE 1\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\TE 1\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _do_run
    run_metadata)
  File "C:\Users\TE 1\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.
  (0) Out of range: End of sequence
     [[node GPU0/DataFetch/IteratorGetNext (defined at C:\ML\stylegan2dv\training\dataset.py:136) ]]
     [[GPU0/DataFetch/UpscaleLOD/Cast/_5109]]
  (1) Out of range: End of sequence
     [[node GPU0/DataFetch/IteratorGetNext (defined at C:\ML\stylegan2dv\training\dataset.py:136) ]]
0 successful operations.
0 derived errors ignored.

Errors may have originated from an input operation.
Input Source operations connected to node GPU0/DataFetch/IteratorGetNext:
 Dataset/IteratorV2 (defined at C:\ML\stylegan2dv\training\dataset.py:119)

Input Source operations connected to node GPU0/DataFetch/IteratorGetNext:
 Dataset/IteratorV2 (defined at C:\ML\stylegan2dv\training\dataset.py:119)

Original stack trace for 'GPU0/DataFetch/IteratorGetNext':
  File "run_training.py", line 202, in <module>
    main()
  File "run_training.py", line 197, in main
    run(**vars(args))
  File "run_training.py", line 128, in run
    dnnlib.submit_run(**kwargs)
  File "C:\ML\stylegan2dv\dnnlib\submission\submit.py", line 343, in submit_run
    return farm.submit(submit_config, host_run_dir)
  File "C:\ML\stylegan2dv\dnnlib\submission\internal\local.py", line 22, in submit
    return run_wrapper(submit_config)
  File "C:\ML\stylegan2dv\dnnlib\submission\submit.py", line 280, in run_wrapper
    run_func_obj(**submit_config.run_func_kwargs)
  File "C:\ML\stylegan2dv\training\training_loop.py", line 208, in training_loop
    reals_write, labels_write = training_set.get_minibatch_tf()
  File "C:\ML\stylegan2dv\training\dataset.py", line 136, in get_minibatch_tf
    return self._tf_iterator.get_next()
  File "C:\Users\TE 1\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\data\ops\iterator_ops.py", line 426, in get_next
    output_shapes=self._structure._flat_shapes, name=name)
  File "C:\Users\TE 1\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\gen_dataset_ops.py", line 1974, in iterator_get_next
    output_shapes=output_shapes, name=name)
  File "C:\Users\TE 1\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\TE 1\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\TE 1\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
    op_def=op_def)
  File "C:\Users\TE 1\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()
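
For context, "End of sequence" at `IteratorGetNext` means the dataset iterator had no more elements to return. The sketch below is a plain-Python analogy, not the repo's `dataset.py` and not TensorFlow code: the `batches` and `count_fetches` helpers and the 10-record list are hypothetical stand-ins. It only illustrates why a finite, non-repeating source fails after a few fetches while a repeating one does not.

```python
import itertools

def batches(records, batch_size, repeat):
    """Yield lists of batch_size records; cycle over the data forever if repeat is True."""
    source = itertools.cycle(records) if repeat else iter(records)
    while True:
        batch = list(itertools.islice(source, batch_size))
        if len(batch) < batch_size:
            return  # data exhausted: the plain-Python analogue of "End of sequence"
        yield batch

def count_fetches(records, batch_size, repeat, limit):
    """Count how many full batches can be fetched, stopping at limit."""
    return sum(1 for _ in itertools.islice(batches(records, batch_size, repeat), limit))

records = list(range(10))  # hypothetical stand-in for a 10-image dataset

# Without repetition the source runs dry after two full batches of 4.
finite_fetches = count_fetches(records, batch_size=4, repeat=False, limit=100)

# With repetition every requested batch is served, up to the limit.
repeated_fetches = count_fetches(records, batch_size=4, repeat=True, limit=100)

print(finite_fetches, repeated_fetches)
```

If the analogy holds here, it may be worth checking that the dataset files contain as many records as expected and were not truncated when they were created. That is an assumption about the cause, not something the log confirms.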