Python ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape [3,3,256,512] and type float


GPU: GTX 980M (8 GB), CPU: 2.7 GHz, 8 cores
RAM: 16 GB

import gc

import numpy as np
import pandas as pd

import keras
import keras.optimizers
from keras.models import Sequential
from keras.layers import Dense, Flatten, Dropout
from keras.layers.convolutional import Conv2D, MaxPooling2D
#from sklearn.model_selection import train_test_split

label = pd.read_csv('trainLabels.csv')
label = label.sort_values(by=['image'])

model = Sequential()
model.add(Conv2D(64,(3,3),strides=(1,1),input_shape=(512,512,3),padding='same',activation='relu',kernel_initializer='uniform'))
model.add(Conv2D(64,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(128,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
model.add(Conv2D(128,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(256,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
model.add(Conv2D(256,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
model.add(Conv2D(256,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(512,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
model.add(Conv2D(512,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
model.add(Conv2D(512,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(512,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
model.add(Conv2D(512,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
model.add(Conv2D(512,(3,3),strides=(1,1),padding='same',activation='relu',kernel_initializer='uniform'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(4096,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(4096,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1,activation='relu'))
opt = keras.optimizers.Adagrad(lr=0.01)
model.compile(loss='mean_squared_error', optimizer=opt)
#ID=train_test_split(list(range(1000)),test_size=0.2,stratify=label['level'].iloc[0:1000])
df=np.array(df)
model.fit(df, label['level'].iloc[0:1000],epochs=100, batch_size=1)
gc.collect()
Error message:

df.shape
    Out[78]: (1000, 512, 512, 3)

ResourceExhaustedError (see above for traceback): OOM when allocating tensor of shape [3,3,256,512] and type float
     [[Node: training_9/Adagrad/zeros_14 = Const[dtype=DT_FLOAT, value=Tensor<type: float shape: [3,3,256,512] values: [[[0 0 0]]]...>, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
The images are 512 × 512 and are passed through a standard VGG19-style network.
I just want to ask: is this caused by a hardware limitation that cannot be worked around any other way, i.e. that I simply cannot push 512 × 512 images through VGG19 on an 8 GB GPU? I have some doubts, because an 11 GB GPU would surely hit the same error once the batch size is increased. Or is there a bug, or some other solution?
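For scale (my own arithmetic, not from the thread), the training array alone, held in host RAM and assumed to be stored as float32, is already about 3 GiB before the model allocates anything:

```python
from math import prod

# df.shape reported in the question, assumed float32 (4 bytes per element)
shape = (1000, 512, 512, 3)
gib = prod(shape) * 4 / 2**30
print(round(gib, 2))  # ≈ 2.93 GiB of host RAM for the raw images alone
```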

I don't think the model is too large to fit on a GPU with 8 GB of memory.

In most cases, some other process is using most of the GPU's memory resources.

You can check which process is occupying most of the GPU memory with

nvidia-smi

If you see something like python under the process name, just kill it with

kill -9 [processid]

That process is probably an earlier run of your model that was never terminated and is still holding most of the GPU memory.
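If the leftover allocation comes from the same Python session, TensorFlow 1.x (the API generation used in the question) can also be told not to grab all GPU memory up front. A minimal sketch, assuming the old `tf.ConfigProto`/`tf.Session` interface and the Keras backend from the question's era:

```python
import tensorflow as tf
import keras.backend as K

# Allocate GPU memory on demand rather than reserving it all at start-up.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))
```

This only changes how memory is reserved; if the model genuinely needs more than 8 GB, the OOM will still occur.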

The problem seems to be that the network is too large for that input size, so it cannot allocate the memory needed for the given size/network. It may be that this network fits on an 11 GB GPU but not on an 8 GB one. Try removing a few Conv2D layers from your network, or reducing the input image size (to 128 × 128 or 64 × 64), and see whether it works.

I tried 256 × 256 and it does work, though presumably with a trade-off in prediction accuracy. I just wonder what kind of machine can handle this task.
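The input size matters so much here largely because of the first Dense(4096) layer: its weight matrix scales with the flattened feature map. A back-of-envelope sketch (my own arithmetic, not from the thread; `first_dense_bytes` is a hypothetical helper, and the ×3 factor assumes float32 weights plus gradients plus Adagrad accumulators):

```python
def first_dense_bytes(side, copies=3):
    """Rough bytes held for the first Dense(4096) layer of the model above."""
    pooled = side // 2**5                   # five MaxPooling2D((2,2)) stages halve each dim
    weights = pooled * pooled * 512 * 4096  # Flatten() features x Dense(4096) units
    return weights * 4 * copies             # float32, x (weights + gradients + Adagrad slots)

print(first_dense_bytes(512) / 2**30)  # ≈ 6.0 GiB at 512 x 512
print(first_dense_bytes(256) / 2**30)  # ≈ 1.5 GiB at 256 x 256
```

Halving the input side quarters this layer's footprint, which is consistent with 256 × 256 fitting on the 8 GB card while 512 × 512 does not.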