
Python: out of memory when using a TensorFlow GradientTape, but only when appending to a list


I have been working with a CNN on a dataset (10003253). I am computing gradients with a GradientTape, but it keeps running out of memory. However, if I remove the line that appends the gradient computation to a list, the script runs through all of the epochs. I am not entirely sure why this happens, but I am new to TensorFlow and to using a GradientTape. Any advice or input would be appreciated.

    # batch loop
    for x, y_true in train_dataset:
        # record operations on a tape; with watch_accessed_variables=False,
        # only explicitly watched tensors are tracked
        with tf.GradientTape(watch_accessed_variables=False) as tape:
            x_var = tf.Variable(x)
            tape.watch([model.trainable_variables, x_var])
            y_pred = model(x_var, training=True)
            # note: stop_recording() returns a context manager; called bare
            # like this it has no effect
            tape.stop_recording()
            loss = los_func(y_true, y_pred)
        epoch_loss_avg.update_state(loss)
        epoch_accuracy.update_state(y_true, y_pred)

        # gradients w.r.t. the weights and w.r.t. the input batch
        gradients, something = tape.gradient(loss, (model.trainable_variables, x_var))
        del tape

        # appending this tensor to the list is what triggers the OOM
        sa_input.append(something)

        # apply gradients
        opti_func.apply_gradients(zip(gradients, model.trainable_variables))
    train_loss_results.append(epoch_loss_avg.result())
    train_accuracy_results.append(epoch_accuracy.result())
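
One way to keep a loop like this from accumulating device memory, assuming the same `model`, `los_func`, `opti_func`, and `train_dataset` names as above, is to append a host-side NumPy copy of the gradient rather than the live tensor. A minimal sketch (the `.numpy()` conversion is a suggestion, not part of the original snippet):

    import tensorflow as tf

    sa_input = []
    for x, y_true in train_dataset:
        # wrap the batch in a Variable so its gradient can be taken
        x_var = tf.Variable(x)
        with tf.GradientTape() as tape:
            y_pred = model(x_var, training=True)
            loss = los_func(y_true, y_pred)
        # gradients w.r.t. the weights and w.r.t. the input batch
        gradients, x_grad = tape.gradient(loss, (model.trainable_variables, x_var))
        opti_func.apply_gradients(zip(gradients, model.trainable_variables))
        # .numpy() copies the gradient to host memory, so the device
        # tensor can be released instead of staying alive in the list
        sa_input.append(x_grad.numpy())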

Since you are new to TF2, I would suggest going through this guide. It covers training, evaluation, and prediction (inference) with models in TensorFlow 2.0 in two broad situations:

  • When using built-in APIs for training and validation (such as model.fit(), model.evaluate(), model.predict()). This is covered in the section "Using built-in training & evaluation loops".
  • When writing custom loops from scratch using eager execution and the GradientTape object. This is covered in the section "Writing your own training & evaluation loops from scratch".
  • Below is a program in which I compute the gradients after each epoch and append them to a list. At the end of the program I convert the list into an array for simplicity.

    Code - the program below throws an OOM error if I use a deep network with multiple layers and a bigger filter size

    # Importing dependency
    %tensorflow_version 2.x
    from tensorflow import keras
    from tensorflow.keras import backend as K
    from tensorflow.keras import datasets
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Activation, Dropout, Flatten, Conv2D, MaxPooling2D
    from tensorflow.keras.layers import BatchNormalization
    import numpy as np
    import tensorflow as tf
    
    # Import Data
    (train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
    
    # Build Model
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(32,32, 3)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(Flatten())
    model.add(Dense(64, activation='relu'))
    model.add(Dense(10))
    
    # Model Summary
    model.summary()
    
    # Model Compile 
    model.compile(optimizer='adam',
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    
    # Gradient storage and loss function
    epoch_gradient = []
    loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    
    # Define the Gradient Function
    @tf.function
    def get_gradient_func(model):
        with tf.GradientTape() as tape:
            logits = model(train_images, training=True)
            loss = loss_fn(train_labels, logits)
        grad = tape.gradient(loss, model.trainable_weights)
        model.optimizer.apply_gradients(zip(grad, model.trainable_variables))
        return grad
    
    # Define the Required Callback Function
    class GradientCalcCallback(tf.keras.callbacks.Callback):
      def on_epoch_end(self, epoch, logs={}):
        grad = get_gradient_func(model)
        epoch_gradient.append(grad)
    
    epoch = 4
    
    print(train_images.shape, train_labels.shape)
    
    model.fit(train_images, train_labels, epochs=epoch, validation_data=(test_images, test_labels), callbacks=[GradientCalcCallback()])
    
    # Convert to a 2-dimensional array of (epoch, gradients) type
    gradient = np.asarray(epoch_gradient)
    print("Total number of epochs run:", epoch)
    
    Output -

    Model: "sequential_5"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    conv2d_12 (Conv2D)           (None, 30, 30, 32)        896       
    _________________________________________________________________
    max_pooling2d_8 (MaxPooling2 (None, 15, 15, 32)        0         
    _________________________________________________________________
    conv2d_13 (Conv2D)           (None, 13, 13, 64)        18496     
    _________________________________________________________________
    max_pooling2d_9 (MaxPooling2 (None, 6, 6, 64)          0         
    _________________________________________________________________
    conv2d_14 (Conv2D)           (None, 4, 4, 64)          36928     
    _________________________________________________________________
    flatten_4 (Flatten)          (None, 1024)              0         
    _________________________________________________________________
    dense_11 (Dense)             (None, 64)                65600     
    _________________________________________________________________
    dense_12 (Dense)             (None, 10)                650       
    =================================================================
    Total params: 122,570
    Trainable params: 122,570
    Non-trainable params: 0
    _________________________________________________________________
    (50000, 32, 32, 3) (50000, 1)
    Epoch 1/4
    1563/1563 [==============================] - 109s 70ms/step - loss: 1.7026 - accuracy: 0.4081 - val_loss: 1.4490 - val_accuracy: 0.4861
    Epoch 2/4
    1563/1563 [==============================] - 145s 93ms/step - loss: 1.2657 - accuracy: 0.5506 - val_loss: 1.2076 - val_accuracy: 0.5752
    Epoch 3/4
    1563/1563 [==============================] - 151s 96ms/step - loss: 1.1103 - accuracy: 0.6097 - val_loss: 1.1122 - val_accuracy: 0.6127
    Epoch 4/4
    1563/1563 [==============================] - 152s 97ms/step - loss: 1.0075 - accuracy: 0.6475 - val_loss: 1.0508 - val_accuracy: 0.6371
    Total number of epochs run: 4
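
A side note on the final conversion: because per-layer gradients have different shapes, np.asarray(epoch_gradient) produces an object array (and recent NumPy versions may refuse such ragged input outright). A minimal post-processing sketch, not from the original answer, that flattens each epoch's gradients into one numeric vector first:

    import numpy as np

    # Flatten every per-layer gradient into a single 1-D vector per epoch
    # so the result stacks into a regular (num_epochs, num_parameters) array.
    flat_per_epoch = [
        np.concatenate([g.numpy().ravel() for g in grads])
        for grads in epoch_gradient
    ]
    gradient = np.stack(flat_per_epoch)
    print(gradient.shape)  # (4, 122570): 4 epochs x 122,570 trainable parameters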
    

    Hope this answers your question. Happy learning.

    Can you share the model? Have you tried reducing the number of trainable parameters to see whether the problem goes away? You can reduce the trainable parameter count by adding max-pooling layers and shrinking the dense layers. Could you also share the complete, reproducible code as a Google Colab link?
    It seems TensorFlow really wants you to use its built-in functions. I found that using TensorFlow's concat function instead of appending to a list worked (see the sketch below).
    Appending to a list also works; we have tested it. I am not sure the OOM is related to the list.
    So did TensorFlow's concat function solve your problem? Can you share the code?
    @nauge - hope we have answered your question. If you are satisfied with the answer, please accept and upvote it.
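
For reference, a minimal sketch of the tf.concat-style accumulation mentioned above; the names `per_batch_grads` and `accumulated` are illustrative, not from the original discussion:

    import tensorflow as tf

    # Stand-in data: a few per-batch gradient tensors of shape (batch, features)
    per_batch_grads = [tf.random.normal((8, 4)) for _ in range(5)]

    # Accumulate into a single tensor along the batch axis instead of
    # keeping a growing Python list of separate tensors.
    accumulated = None
    for x_grad in per_batch_grads:
        if accumulated is None:
            accumulated = x_grad
        else:
            # Note: this still holds every gradient in memory, so it is
            # not by itself a guaranteed fix for OOM.
            accumulated = tf.concat([accumulated, x_grad], axis=0)

    print(accumulated.shape)  # (40, 4)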