Python InvalidArgumentError: In[0].dim(0) and In[1].dim(0) must be the same: [1,125,150] vs [32,150,125]


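I'm trying to create a custom layer that merges 2 sources. I'm getting the error InvalidArgumentError: In[0].dim(0) and In[1].dim(0) must be the same: [1,125,150] vs [32,150,125]. If I set batch_size to 1 the code runs, because the shapes are then [1,125,150] vs [1,150,125]; however, the loss doesn't change, so that still isn't the root cause. I think I need to use the batch size instead of just expanding dims.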

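# Imports assumed from the aliases used below (not shown in the original post)
import tensorflow as tf
from keras import backend as K
from keras import initializers as INIT
from keras import layers as L
from keras import models as M
from keras import optimizers as O

class mergeLayer(L.Layer):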
    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(mergeLayer,self).__init__()
        self.kernel_initializer = INIT.get('uniform')

    def build(self, input_shape):
        # Create a trainable weight variable for this layer.
        self.kernel = self.add_weight(name='kernel',shape=input_shape[1:],initializer=self.kernel_initializer,trainable=True)
        super(mergeLayer,self).build(input_shape) # Be sure to call this somewhere!

    def call(self, x):
        temp = K.batch_dot(tf.expand_dims(self.kernel,0),tf.transpose(x,perm=[0,2,1]))+1
        return temp
    def compute_output_shape(self, input_shape):
        return input_shape
Below is the code that fits the model. Again, if I change batch_size to 1 here I can run the code, but the loss stays the same.

modelMerge.fit(x=[train1,train2],y=cats,epochs=100,batch_size=32,shuffle='batch')
score = modelMerge.evaluate(x=[test1,test2],y=cats,batch_size=32)
Output with a batch size of 1:

Epoch 1/100
3903/3903 [=========================] - 45s - loss: 15.7062 - acc: 0.0254
Epoch 2/100
3903/3903 [=========================] - 43s - loss: 15.7050 - acc: 0.0254
Epoch 3/100
277/3903 [=>.......................] - ETA: 42s - loss: 15.8272 - acc: 0.0181
Thank you very much for your time and help.

Update: here is the Keras model structure that calls mergeLayer:

def buildModel_merge(numClasses):
    source = L.Input(shape=(64,25,1))
    x = L.Conv2D(150, (3,3), activation='relu', name='conv1a')(source)
    x = L.MaxPooling2D((2,2))(x)
    x = L.BatchNormalization()(x)
    x = L.Conv2D(150, (3,3), activation='relu', name='conv2a')(x)
    x = L.Conv2D(150, (5,5), activation='relu', name='conv3a')(x)
    x = L.Dropout(0.5)(x)
    #reshape into a dxN matrix
    x = L.Reshape((125,150))(x)
    x = mergeLayer(100)(x)

    source2 = L.Input(shape=(30,30,30,1))
    x2 = L.Conv3D(32,(5,5,5),strides=(2,2,2),activation='relu',name='conv1b')(source2)
    x2 = L.Dropout(0.2)(x2)
    x2 = L.Conv3D(32,(3,3,3),activation='relu',name='conv2b')(x2)
    x2 = L.MaxPooling3D(pool_size=(2,2,2),name='pool2b')(x2)
    x2 = L.Dropout(0.3)(x2)
    #reshape into a dxM matrix
    x2 = L.Reshape((125,32))(x2)
    x2 = mergeLayer(100)(x2)

    #x = L.Multiply(x, x2)(x)
    x = L.Multiply()([x,x2])

    x = L.Flatten()(x)
    x = L.Dense(400, activation='relu', name='dense1')(x) # Is relu used here?
    x = L.Dropout(0.5)(x)
    classify = L.Dense(numClasses, activation='softmax', name='dense2')(x)

    model = M.Model(inputs=[source,source2],outputs=classify)
    optimizer= O.SGD(momentum=0.02)
    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['acc'])

    return model

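Here are some corrections to the code:

- Your __init__ doesn't need the output_dim and **kwargs parameters.
- I defined the kernel with an extra dimension instead of using expand_dims on it, but your Keras version seems to behave differently from mine, so use the lines marked as alternatives in the code instead.
- Main problem: batch_dot needs two tensors with the same batch size, which means the first dimension must be the same in both.
- I solved this by repeating the kernel to match the batch size of x.
- I swapped all tf functions for Keras backend functions (import keras.backend as K). This was not a problem in itself, but it lets you port the solution to the other supported backends.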
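The shape reasoning in the list above can be checked outside Keras. The following is a minimal NumPy sketch, not the answer's code: the sizes are small stand-ins chosen for illustration (the question uses 32, 125 and 150), and np.matmul plays the role of the batched dot. Note that np.matmul would broadcast a leading dimension of 1 by itself, unlike the op that raised the error in the question, so the sketch only illustrates how multiplying by a tensor of ones repeats the kernel along the batch axis.

```python
import numpy as np

# Small stand-in sizes (assumed for illustration only).
batch, n, d = 4, 5, 6

rng = np.random.default_rng(0)
x = rng.normal(size=(batch, n, d))   # layer input: one (n, d) matrix per sample
kernel = rng.normal(size=(1, n, d))  # weight with a leading dimension of 1

# Multiplying by ones shaped like x repeats the kernel along the batch axis,
# so both operands end up with the same first dimension.
tiled = np.ones_like(x) * kernel     # shape (batch, n, d)

# Per-sample matrix product: (batch, n, d) @ (batch, d, n) -> (batch, n, n)
out = np.matmul(tiled, np.transpose(x, (0, 2, 1))) + 1

print(tiled.shape)
print(out.shape)
```

Every slice of tiled along the first axis is a copy of the same kernel, which is what the K.ones_like trick in the corrected layer relies on.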


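Thank you very much. I'm now getting the error TypeError: __init__() takes exactly 1 argument (2 given) when calling x = mergeLayer(100)(x). If I update __init__ to accept two arguments, I get this error instead: self.kernel = self.add_weight(name='kernel', shape=(1,)+input_shape[1:], initializer=self.kernel_initializer, trainable=True) TypeError: can only concatenate tuple (not "TensorShape") to tuple. I've updated the question with the Keras model structure that calls mergeLayer. Thank you very much for your help.

OK... that error is actually an out-of-memory error. It is unrelated to this question.

Exactly the same, with no change even in the small decimals? Sometimes the differences are so small that they don't show up in the displayed digits, which may point to a learning rate that is too small. You could try raising the learning rate for SGD; I usually prefer optimizer='adam', which adapts more automatically.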
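The two TypeErrors in the first comment have separate causes. The first appears because the revised __init__ takes no output_dim, so the layer would be created as mergeLayer() rather than mergeLayer(100). The second is a plain Python rule: a tuple can only be concatenated with another tuple. The sketch below reproduces that rule with a list as a hypothetical stand-in for the shape object passed to build(), since the point is only that it is not a tuple. Whether tuple() alone is enough in a given Keras or TensorFlow version is an assumption; in some versions the elements may also need converting to plain ints.

```python
# Hypothetical stand-in for the shape passed to build(); a list is used only
# because, like the TensorShape in the comment, it is not a tuple.
input_shape = [None, 125, 150]

message = None
try:
    shape = (1,) + input_shape[1:]      # tuple + non-tuple is rejected
except TypeError as err:
    message = str(err)

# Converting the slice to a tuple first makes the concatenation valid.
shape = (1,) + tuple(input_shape[1:])

print(message)
print(shape)
```

The commented-out alternative in the answer's code (shape = input_shape[1:] together with K.expand_dims in call) avoids the concatenation altogether.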
# Imports assumed from the aliases used in the code (not shown in the original)
from keras import backend as K
from keras import initializers as INIT
from keras.layers import Layer

class mergeLayer(Layer):

    #your init doesn't need output_dim and **kwargs
    def __init__(self):
        super(mergeLayer,self).__init__()
        self.kernel_initializer = INIT.get('uniform')

    def build(self, input_shape):
        # Create a trainable weight variable for this layer.
        self.kernel = self.add_weight(
                          name='kernel',
                          #corrected shape to avoid expand_dims;
                          #tuple() guards against input_shape not being a plain tuple
                          shape=(1,)+tuple(input_shape[1:]),
                              #alternative:
                              #shape = input_shape[1:],
                          initializer=self.kernel_initializer,trainable=True)

        super(mergeLayer,self).build(input_shape) # Be sure to call this somewhere!

    def call(self, x):
        #take a tensor of ones with the same shape as x
        form = K.ones_like(x)

        #multiplies the kernel to match the batch size of x
        kernel = form * self.kernel
            #alternative:
            #kernel = form * K.expand_dims(self.kernel,0)

        #used K.permute_dimensions instead of tf
        temp = K.batch_dot(kernel,K.permute_dimensions(x,(0,2,1)))+1
        return temp

    def compute_output_shape(self, input_shape):
        return input_shape
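
One thing that may be worth checking in the layer above is the shape it reports. Assuming batch_dot acts as a per-sample matrix product on 3-D inputs, multiplying a (batch, n, d) kernel by the transposed input gives (batch, n, n), not the input shape that compute_output_shape returns. The NumPy sketch below imitates call() with an all-ones kernel to show the resulting shapes for the two branches of the model in the question; merge_output_shape and merge_call are hypothetical helpers, not part of the original answer, and whether the mismatch matters depends on the Keras version.

```python
import numpy as np

def merge_output_shape(input_shape):
    # Hypothetical helper: shape of kernel times x-transposed
    # for an input x of shape (batch, n, d).
    batch, n, _ = input_shape
    return (batch, n, n)

def merge_call(x):
    # NumPy imitation of the layer's call(), using an all-ones kernel.
    kernel = np.ones_like(x) * np.ones((1,) + x.shape[1:])
    return np.matmul(kernel, np.transpose(x, (0, 2, 1))) + 1

# The two branches are reshaped to (125, 150) and (125, 32) in the question.
branch_a = merge_call(np.zeros((2, 125, 150)))
branch_b = merge_call(np.zeros((2, 125, 32)))

print(branch_a.shape, branch_b.shape)
```

Both branches come out as (batch, 125, 125), which is why an element-wise Multiply can combine them even though their inputs have different last dimensions.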