TensorFlow Keras uses too much memory


I have a Keras model (with the TensorFlow backend) defined as follows:

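import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

num_classes = 4  # matches the dense_2 output shape in model.summary()
INPUT_SHAPE = [4740, 3540, 1]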

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=INPUT_SHAPE))
model.add(Conv2D(2, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(4, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(8, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(16, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(32, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
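
The model has only 37,506 trainable parameters. Yet with any batch size greater than 1, it runs out of the K80's 12 GB of VRAM during model.fit().

Why does this model need so much memory? How do I calculate the memory requirement correctly?

The function from a linked answer gives me 2.15 GB per element in a batch, so I should be able to run a batch of at least 5.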

Edit: output of model.summary() (shown at the end of this page)


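The output shape of the first layer is B*4738*3538*32 (B is the batch size). That is about 536 million values per sample, or roughly 2 GB * B of memory at 4 bytes per float32 value. The gradients and the other activations need some memory as well. Increasing the stride of the first layer might help.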
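
The first-layer figure can be checked with a few lines of arithmetic. The sketch below assumes float32 activations (4 bytes per value) and an unpadded ('valid') convolution, which is the Conv2D default. The helper names `conv_output_size` and `activation_bytes` are made up for illustration. It counts only the forward output of one layer, not gradients or any workspace the backend allocates.

```python
def conv_output_size(size, kernel, stride=1):
    """Output length along one axis for a 'valid' (unpadded) convolution."""
    return (size - kernel) // stride + 1


def activation_bytes(height, width, filters, kernel, stride=1, bytes_per_value=4):
    """Bytes needed for one sample's Conv2D output (float32 by default)."""
    out_h = conv_output_size(height, kernel, stride)
    out_w = conv_output_size(width, kernel, stride)
    return out_h * out_w * filters * bytes_per_value


# First layer of the model in the question: 4740 x 3540 input, 32 filters, 3x3 kernel.
stride_1 = activation_bytes(4740, 3540, filters=32, kernel=3, stride=1)
stride_2 = activation_bytes(4740, 3540, filters=32, kernel=3, stride=2)

GIB = 1024 ** 3
print(f"stride 1: {stride_1 / GIB:.2f} GiB per sample")
print(f"stride 2: {stride_2 / GIB:.2f} GiB per sample")
```

With a stride of 2 the output has half as many rows and half as many columns, so the first layer's activation shrinks to a quarter. Reducing the number of filters in that layer scales the figure down linearly in the same way.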

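How are you running your experiments? Are you sure the memory is freed after compiling each model?

@MarcinMożejko Yes, I'm sure the memory is freed. I checked nvidia-smi before running model.fit(). What do you mean?

For example, if you run your training in a Jupyter notebook, you can run into problems with garbage collection and with freeing the memory of old models.

@MarcinMożejko I did try restarting the kernel, and also restarting the notebook, before running fit.

Can you share the output of model.summary()?

Thanks, I really shouldn't have made the first layer so deep. Reducing that helped a lot. What is the stride, though?
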
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 4738, 3538, 32)    320       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 4735, 3535, 2)     1026      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 1183, 883, 2)      0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 1180, 880, 4)      132       
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 295, 220, 4)       0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 292, 217, 8)       520       
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 73, 54, 8)         0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 70, 51, 16)        2064      
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 17, 12, 16)        0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 14, 9, 32)         8224      
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 3, 2, 32)          0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 3, 2, 32)          0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 192)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               24704     
_________________________________________________________________
dropout_2 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 4)                 516       
=================================================================
Total params: 37,506
Trainable params: 37,506
Non-trainable params: 0
_________________________________________________________________
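
As a rough way to turn the table above into a memory estimate, the sketch below sums the output sizes of all layers for a single sample. It assumes float32 values (4 bytes each) and Python 3.8+ for `math.prod`. The shapes are copied from the summary, and the names `LAYER_SHAPES` and `BYTES_PER_VALUE` are illustrative only. The result is a lower bound for training, since it covers forward activations and leaves out gradients, optimizer state, and backend workspace.

```python
from math import prod

# Output shapes copied from the model.summary() table (batch dimension dropped).
LAYER_SHAPES = {
    "conv2d_1": (4738, 3538, 32),
    "conv2d_2": (4735, 3535, 2),
    "max_pooling2d_1": (1183, 883, 2),
    "conv2d_3": (1180, 880, 4),
    "max_pooling2d_2": (295, 220, 4),
    "conv2d_4": (292, 217, 8),
    "max_pooling2d_3": (73, 54, 8),
    "conv2d_5": (70, 51, 16),
    "max_pooling2d_4": (17, 12, 16),
    "conv2d_6": (14, 9, 32),
    "max_pooling2d_5": (3, 2, 32),
    "dropout_1": (3, 2, 32),
    "flatten_1": (192,),
    "dense_1": (128,),
    "dropout_2": (128,),
    "dense_2": (4,),
}
BYTES_PER_VALUE = 4  # assumption: float32 activations

values_per_sample = sum(prod(shape) for shape in LAYER_SHAPES.values())
bytes_per_sample = values_per_sample * BYTES_PER_VALUE
largest = max(LAYER_SHAPES, key=lambda name: prod(LAYER_SHAPES[name]))
share = prod(LAYER_SHAPES[largest]) / values_per_sample

print(f"activations per sample: {bytes_per_sample / 1024 ** 3:.2f} GiB")
print(f"largest layer: {largest} ({share:.0%} of all activation values)")
```

The total comes to about 2.15 GiB per sample, which agrees with the per-element figure quoted in the question. Nearly all of it comes from conv2d_1. Because training also has to hold gradients for these activations, the real footprint per sample is noticeably higher than this forward-only number, which is consistent with a 12 GB card failing at a batch size of 2.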