Python TensorFlow InvalidArgumentError: incompatible shapes after the custom loss function, before the end of the epoch
I have been debugging this since this morning without making any progress. I am trying to train a model with a batch size larger than 1. The code works with batch_size=1 but fails for any larger value. Any help would be appreciated.

The code for the custom loss function is below (all the print statements were added while debugging):

This is how I initialize and train the model:
model = Model(inputs=inputs, outputs=mobilenet_output, name="main_model")
model.compile(optimizer='adam',
              loss=mobilenet_loss,
              metrics=['accuracy'])
model.fit(x=train_generator, steps_per_epoch=1, epochs=2, batch_size=batch_size, shuffle=False)
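One way to narrow a failure like this down is to call the loss directly on dummy tensors shaped like a real batch, so shape problems surface eagerly with a plain Python traceback instead of from inside the compiled train_function. A minimal sketch, assuming the [2, 1] labels and [2, 31, 31, 3] predictions reported by the debug prints below; the dummy tensors are hypothetical stand-ins, not the real generator output:

import tensorflow as tf

# Dummy batch shaped like the debug prints below: y_true [2, 1], y_pred [2, 31, 31, 3].
y_true_dummy = tf.cast(tf.fill([2, 1], 2), tf.float32)
y_pred_dummy = tf.random.uniform([2, 31, 31, 3])

# mobilenet_loss is the custom loss from the post (not reproduced here).
loss_value = mobilenet_loss(y_true_dummy, y_pred_dummy)
tf.print("eager loss:", loss_value, "shape:", tf.shape(loss_value))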
Error message:
Epoch 1/2
ytrue = [[2]
[2]] [2 1] y_pred shape = [2 31 31 3]
one_class_shape = [2 31 31 1]
y_true_shape - [2 1]
y_true_compare_0 = [2 1]
y_true_compare_1 = [2 1]
y_true_compare_0 [[0]
[0]]
div_equals_0_check shape = 0
div_equals_1_check shape = 1
after y_true_array = [2 31 31 3] TensorShape([2, 31, 31, 3])
error shape = [2 31 31 3]
loss shape = [] -21462.2773
ytrue = [[2]
[2]] [2 1] y_pred shape = [2 31 31 3]
one_class_shape = [2 31 31 1]
y_true_shape - [2 1]
y_true_compare_0 = [2 1]
y_true_compare_1 = [2 1]
y_true_compare_0 [[0]
[0]]
div_equals_0_check shape = 0
div_equals_1_check shape = 1
after y_true_array = [2 31 31 3] TensorShape([2, 31, 31, 3])
error shape = [2 31 31 3]
loss shape = [] -8.79764557e-05
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
<ipython-input-39-739652b805f0> in <module>
----> 1 model.fit(x = train_generator, steps_per_epoch = 1, epochs=2, batch_size=batch_size, shuffle=False)
~/opt/anaconda3/envs/bigthinx/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
1098 _r=1):
1099 callbacks.on_train_batch_begin(step)
-> 1100 tmp_logs = self.train_function(iterator)
1101 if data_handler.should_sync:
1102 context.async_wait()
~/opt/anaconda3/envs/bigthinx/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py in __call__(self, *args, **kwds)
826 tracing_count = self.experimental_get_tracing_count()
827 with trace.Trace(self._name) as tm:
--> 828 result = self._call(*args, **kwds)
829 compiler = "xla" if self._experimental_compile else "nonXla"
830 new_tracing_count = self.experimental_get_tracing_count()
~/opt/anaconda3/envs/bigthinx/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py in _call(self, *args, **kwds)
886 # Lifting succeeded, so variables are initialized and we can run the
887 # stateless function.
--> 888 return self._stateless_fn(*args, **kwds)
889 else:
890 _, _, _, filtered_flat_args = \
~/opt/anaconda3/envs/bigthinx/lib/python3.7/site-packages/tensorflow/python/eager/function.py in __call__(self, *args, **kwargs)
2941 filtered_flat_args) = self._maybe_define_function(args, kwargs)
2942 return graph_function._call_flat(
-> 2943 filtered_flat_args, captured_inputs=graph_function.captured_inputs) # pylint: disable=protected-access
2944
2945 @property
~/opt/anaconda3/envs/bigthinx/lib/python3.7/site-packages/tensorflow/python/eager/function.py in _call_flat(self, args, captured_inputs, cancellation_manager)
1917 # No tape is watching; skip to running the function.
1918 return self._build_call_outputs(self._inference_function.call(
-> 1919 ctx, args, cancellation_manager=cancellation_manager))
1920 forward_backward = self._select_forward_and_backward_functions(
1921 args,
~/opt/anaconda3/envs/bigthinx/lib/python3.7/site-packages/tensorflow/python/eager/function.py in call(self, ctx, args, cancellation_manager)
558 inputs=args,
559 attrs=attrs,
--> 560 ctx=ctx)
561 else:
562 outputs = execute.execute_with_cancellation(
~/opt/anaconda3/envs/bigthinx/lib/python3.7/site-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
58 ctx.ensure_initialized()
59 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60 inputs, attrs, num_outputs)
61 except core._NotOkStatusException as e:
62 if name is not None:
InvalidArgumentError: Incompatible shapes: [2,1] vs. [2,31,31]
[[node Equal (defined at <ipython-input-39-739652b805f0>:1) ]] [Op:__inference_train_function_26226]
Function call stack:
train_function
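The shape pair in the error message is the important clue. Under standard broadcasting rules a [1, 1] tensor is compatible with [1, 31, 31] (every size-1 axis stretches), but [2, 1] is not compatible with [2, 31, 31], because after right-alignment the 2 would have to broadcast against 31. That matches the symptom exactly: batch_size=1 trains, anything larger fails at the Equal node. Since the loss prints above do run to completion, the failing Equal may come from the 'accuracy' metric comparing the [2, 1] labels against per-pixel predictions rather than from the loss itself, though the trace does not say so explicitly. A minimal sketch reproducing just that comparison with hypothetical stand-in tensors:

import tensorflow as tf

# Batch size 1: [1, 1] broadcasts against [1, 31, 31], so this works.
ok = tf.equal(tf.zeros([1, 1]), tf.zeros([1, 31, 31]))

# Batch size 2: [2, 1] cannot broadcast against [2, 31, 31] (2 vs 31), so this
# raises the same "Incompatible shapes: [2,1] vs. [2,31,31]" error as the trace.
try:
    tf.equal(tf.zeros([2, 1]), tf.zeros([2, 31, 31]))
except tf.errors.InvalidArgumentError as e:
    print(e)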
I found the solution to the problem I posted here.
Epoch 1/2
ytrue = [[2]] [1 1] y_pred shape = [1 31 31 3]
one_class_shape = [1 31 31 1]
y_true_shape - [1 1]
y_true_compare_0 = [1 1]
y_true_compare_1 = [1 1]
y_true_compare_0 [[0]]
div_equals_0_check shape = 0
div_equals_1_check shape = 1
after y_true_array = [1 31 31 3] TensorShape([1, 31, 31, 3])
error shape = [1 31 31 3]
loss shape = [] -1222.78308
ytrue = [[2]] [1 1] y_pred shape = [1 31 31 3]
one_class_shape = [1 31 31 1]
y_true_shape - [1 1]
y_true_compare_0 = [1 1]
y_true_compare_1 = [1 1]
y_true_compare_0 [[0]]
div_equals_0_check shape = 0
div_equals_1_check shape = 1
after y_true_array = [1 31 31 3] TensorShape([1, 31, 31, 3])
error shape = [1 31 31 3]
loss shape = [] -8.76188278e-06
1/1 [==============================] - 15s 15s/step - loss: -1222.7831 - mobnet_features_loss: -1222.7831 - Predictions_loss: -8.7619e-06 - mobnet_features_accuracy: 0.3777 - Predictions_accuracy: 0.3777
Epoch 2/2
ytrue = [[2]] [1 1] y_pred shape = [1 31 31 3]
one_class_shape = [1 31 31 1]
y_true_shape - [1 1]
y_true_compare_0 = [1 1]
y_true_compare_1 = [1 1]
y_true_compare_0 [[0]]
div_equals_0_check shape = 0
div_equals_1_check shape = 1
after y_true_array = [1 31 31 3] TensorShape([1, 31, 31, 3])
error shape = [1 31 31 3]
loss shape = [] -4929.66
ytrue = [[2]] [1 1] y_pred shape = [1 31 31 3]
one_class_shape = [1 31 31 1]
y_true_shape - [1 1]
y_true_compare_0 = [1 1]
y_true_compare_1 = [1 1]
y_true_compare_0 [[0]]
div_equals_0_check shape = 0
div_equals_1_check shape = 1
after y_true_array = [1 31 31 3] TensorShape([1, 31, 31, 3])
error shape = [1 31 31 3]
loss shape = [] 4.91845421e-05
1/1 [==============================] - 12s 12s/step - loss: -4929.6602 - mobnet_features_loss: -4929.6602 - Predictions_loss: 4.9185e-05 - mobnet_features_accuracy: 0.3049 - Predictions_accuracy: 0.3049
@tf.function
def grad(model, inputs, targets):
    with tf.GradientTape() as tape:
        # Forward pass is recorded on the tape so gradients can be taken below.
        y_pred = model(inputs)
        tf.print("in grad function y_pred shape = ", tf.type_spec_from_value(y_pred))
        loss_value = mobilenet_loss(targets, y_pred)
    return loss_value, tape.gradient(loss_value, model.trainable_variables)

num_epochs = 2
optimizer = tf.keras.optimizers.Adam()
avg_loss = tf.keras.metrics.Mean()

for epoch in range(num_epochs):
    x, y = next(train_generator)
    # Optimize the model: compute loss and gradients, then apply the update.
    loss_value, gradients = grad(model, x, y)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    avg_loss(loss_value)
    print(" epoch %d/%d, loss=%.4f " % (epoch + 1, num_epochs, avg_loss.result()))