Python ValueError: Inconsistent shapes: saw (1152,10,1,10,16), but expected (1152,10,1,16)

I am currently learning CapsNet and tried to move the code from my local machine to Colab. The code runs fine on the local machine, but it throws an error on Colab:

ValueError: Inconsistent shapes: saw (1152,10,1,10,16), but expected (1152,10,1,16)

When I try other axes, such as [3, 1], I get the following error. In that case x is back to 4 dimensions and x[3] == y[2]:

ValueError: Cannot do batch_dot on inputs with shapes (1152, 10, 1, 8) and (1152, 10, 8, 16) with axes=[3, 1]. x.shape[3] != y.shape[1] (8 != 10)

I traced the cause of this error to the function tf.scan. I have TensorFlow 1.13 installed on my machine, but I do not know how to fix it. Please help.

Here is the code:

import tensorflow as tf
from keras import layers, initializers
from keras import backend as K

# Note: squash() used below is the usual capsule squashing nonlinearity,
# defined elsewhere in the script.

class CapsuleLayer(layers.Layer):

    def __init__(self, num_capsule, dim_vector, num_routing=3,
                 kernel_initializer='glorot_uniform',
                 bias_initializer='zeros',
                 **kwargs):
        super(CapsuleLayer, self).__init__(**kwargs)
        self.num_capsule = num_capsule
        self.dim_vector = dim_vector
        self.num_routing = num_routing
        self.kernel_initializer = initializers.get(kernel_initializer)
        self.bias_initializer = initializers.get(bias_initializer)

    def build(self, input_shape):
        assert len(input_shape) >= 3, "The input Tensor should have shape=[None, input_num_capsule, input_dim_vector]"
        self.input_num_capsule = input_shape[1]
        self.input_dim_vector = input_shape[2]

        # Transform matrix
        self.W = self.add_weight(shape=[self.input_num_capsule, self.num_capsule, self.input_dim_vector, self.dim_vector],
                                 initializer=self.kernel_initializer,
                                 name='W')
        print("the weight size in capsule layer", self.W)

        # Coupling coefficient. The redundant dimensions are just to facilitate subsequent matrix calculation.
        self.bias = self.add_weight(shape=[1, self.input_num_capsule, self.num_capsule, 1, 1],
                                    initializer=self.bias_initializer,
                                    name='bias',
                                    trainable=False)
        self.built = True

    def call(self, inputs, training=None):
        inputs_expand = K.expand_dims(K.expand_dims(inputs, 2), 2)

        inputs_tiled = K.tile(inputs_expand, [1, 1, self.num_capsule, 1, 1])
        print("call size inputs_tiled", inputs_tiled)

        # Compute `inputs * W` by scanning inputs_tiled on dimension 0. This is faster but requires Tensorflow.
        # inputs_hat.shape = [None, input_num_capsule, num_capsule, 1, dim_vector] [3, 2] [4,3]
        inputs_hat = tf.scan(lambda ac, x: K.batch_dot(x, self.W, axes=[3,2]),
                             elems=inputs_tiled,
                             initializer=K.zeros([self.input_num_capsule, self.num_capsule, 1, self.dim_vector]))
        print("result of inputs_hat", inputs_hat)

        # Routing algorithm V2. Use iteration. V2 and V1 both work without much difference on performance
        assert self.num_routing > 0, 'The num_routing should be > 0.'
        for i in range(self.num_routing):
            c = tf.nn.softmax(self.bias, dim=2)  # dim=2 is the num_capsule dimension
            # outputs.shape=[None, 1, num_capsule, 1, dim_vector]
            outputs = squash(K.sum(c * inputs_hat, 1, keepdims=True))
            print("size after squash:", outputs)

            # last iteration needs not compute bias which will not be passed to the graph any more anyway.
            if i != self.num_routing - 1:
                # self.bias = K.update_add(self.bias, K.sum(inputs_hat * outputs, [0, -1], keepdims=True))
                self.bias = tf.assign_add(self.bias, K.sum(inputs_hat * outputs, -1, keepdims=True))
                # self.bias = self.bias + K.sum(inputs_hat * outputs, -1, keepdims=True)
            # tf.summary.histogram('BigBee', self.bias)  # for debugging
        return K.reshape(outputs, [-1, self.num_capsule, self.dim_vector])

    def compute_output_shape(self, input_shape):
        print("the output shape of capslayer is:", tuple([None, self.num_capsule, self.dim_vector]))
        return tuple([None, self.num_capsule, self.dim_vector])
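
For reference, the contraction this layer intends can be checked in isolation. The snippet below is an added illustration (it does not appear in the original post); it reproduces the per-element shapes seen inside tf.scan and only verifies that the contraction yields the shape the scan initializer expects:

import tensorflow as tf

# Shapes as seen inside tf.scan: x is one batch element of inputs_tiled.
x = tf.zeros([1152, 10, 1, 8])    # [input_num_capsule, num_capsule, 1, input_dim_vector]
W = tf.zeros([1152, 10, 8, 16])   # [input_num_capsule, num_capsule, input_dim_vector, dim_vector]

# Contract the last axis of x with the second-to-last axis of W while
# keeping (input_num_capsule, num_capsule) as shared leading axes.
u_hat = tf.einsum('ijab,ijbc->ijac', x, W)
print(u_hat.shape)  # (1152, 10, 1, 16), the shape the scan initializer expects
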
Finally, I solved it. The function tf.scan() is not wrong in itself, but it does not suit my environment. Here, tf.scan() is used like a fully connected layer.

Following the definition of a fully connected layer, we only need to modify this function. But do not use tf.map_fn(), because that produces the same error.

Try this. This function helped a great deal in solving the problem:
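
(The answer's original code block did not survive here, so what follows is a minimal sketch of the fully-connected-style replacement described above. It keeps the question's shapes and assumes a TensorFlow build whose tf.matmul broadcasts leading batch dimensions, as TF 2.x does.)

import tensorflow as tf

# Sketch: compute the prediction vectors u_hat = u * W with one broadcast
# matmul, treating W as a per-(input capsule, output capsule) fully
# connected layer -- no tf.scan, and no tf.map_fn (which fails the same way).
batch = 32                                        # any batch size
inputs_tiled = tf.zeros([batch, 1152, 10, 1, 8])  # [None, in_caps, out_caps, 1, in_dim]
W = tf.zeros([1152, 10, 8, 16])                   # [in_caps, out_caps, in_dim, out_dim]

inputs_hat = tf.matmul(inputs_tiled, W)           # W broadcasts over the batch axis
print(inputs_hat.shape)                           # (32, 1152, 10, 1, 16)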



I hope my solution solves your problem too.

I ran into the same problem on one machine but not on another. After comparing the environments, I found two differences: on the machine with the error, TensorFlow was 2.1 and Keras was 2.3; in the working environment they were 1.15.0 and 2.2.4 respectively.

First, I downgraded TensorFlow, but that did not help.

Second, I downgraded Keras, and the problem was solved. So my conclusion is that Keras 2.3 broke this function.
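
A quick way to confirm you are on the broken combination is to print both versions; pinning the working pair is then a one-line install (versions taken from the comparison above):

import tensorflow as tf
import keras

print(tf.__version__)     # error with 2.1, works with 1.15.0
print(keras.__version__)  # error with 2.3, works with 2.2.4
# To pin the working pair: pip install tensorflow==1.15.0 keras==2.2.4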