Keras implementation of the InceptionResnetV2 stem block doesn't match the original paper?


I have been trying to compare the model summary from the Keras implementation against the one specified in their paper, and there doesn't seem to be much resemblance when it comes to the filter concat blocks.

The first lines of the model's summary() are shown below. (In my case the input was changed to 512x512, but as far as I know that does not affect the number of filters in each layer, so we can still use them to track the paper-to-code translation):

Figure 3 of the paper (attached below) shows how the stem block is formed for both InceptionV4 and InceptionResnetV2. In Figure 3 there are three filter concatenations in the stem block, but in the output I showed above the concatenations seem to have been replaced by a sequential mix of MaxPooling and the like (the first concatenation should appear right after max_pooling2d_1). The number of filters grows as if a concatenation had happened, but no concatenation is performed; the filters just seem to be stacked sequentially! Does anyone know what is going on in this output? Does it do the same thing as described in the paper?
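As a side note, the spatial sizes in the summary can be reproduced with the standard valid-padding output-size formula, which is also why changing the input to 512x512 only rescales the spatial dimensions and never the filter counts. A minimal sketch, with the kernel/stride values inferred from the shapes in the summary above (not read from the Keras source):

```python
import math

def out_size(size, kernel, stride):
    """Spatial output size of a VALID-padded conv or pooling layer."""
    return math.floor((size - kernel) / stride) + 1

# Tracing the stem spatially; conv2d_3 and conv2d_4 keep the size
# (SAME padding / 1x1 kernel), so they are skipped here.
s = 512
s = out_size(s, 3, 2)  # conv2d_1        -> 255
s = out_size(s, 3, 1)  # conv2d_2        -> 253
s = out_size(s, 3, 2)  # max_pooling2d_1 -> 126
s = out_size(s, 3, 1)  # conv2d_5        -> 124
s = out_size(s, 3, 2)  # max_pooling2d_2 -> 61
print(s)  # 61, matching the summary
```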

For comparison, I found an InceptionV4 implementation, and it does seem to perform the first concatenation of the stem block in concatenate_1. Here is the output of the first few lines of its summary():

So, as shown in the paper, the first layers should be the same for both architectures. Or am I missing something?

Edit: I found out that Keras's implementation of InceptionResnetV2 does not follow the stem block of InceptionResnetV2, but rather the one from InceptionResnetV1 (Figure 14 of their paper, attached below). After the stem block, it does seem to follow the rest of the InceptionResnetV2 blocks correctly.

InceptionResnetV1 performed worse than InceptionResnetV2 (Figure 25), so I doubt that using a block from V1 instead of the full V2 in Keras is intended. I will try cutting the stem off the InceptionV4 implementation I found and attaching the continuation of InceptionResnetV2 to it.
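A quick way to cross-check which stem a given summary corresponds to is to recompute the parameter counts: every Conv2D in these stems is immediately followed by BatchNorm, so it carries no bias, and its parameter count is simply k·k·C_in·C_out. A small sketch checked against the numbers in the summaries above:

```python
def conv_params(kernel, c_in, c_out, bias=False):
    """Parameter count of a Conv2D (no bias when followed by BatchNorm)."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)

# Numbers taken from the Keras InceptionResnetV2 summary above
assert conv_params(3, 3, 32) == 864       # conv2d_1
assert conv_params(3, 32, 32) == 9216     # conv2d_2
assert conv_params(3, 32, 64) == 18432    # conv2d_3
assert conv_params(1, 64, 80) == 5120     # conv2d_4, a 1x1 conv
assert conv_params(3, 80, 192) == 138240  # conv2d_5
print("all parameter counts match the summary")
```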

The same question was closed in the tf models GitHub without an explanation. I'll leave it here in case anyone is interested:

Edit 2: For some reason, when releasing the code, Google AI (the creators of the Inception architectures) show an image of "Inception-resnet-v2", but the stem block is the one from InceptionV3, not from InceptionV4 as specified in the paper. So either the paper is wrong, or for some reason the code does not follow the paper.



I just received an email confirming the mistake from Alex Alemi, Senior Research Scientist at Google and one of the original authors of the paper. It seems that the stem block was switched during internal experimentation and the release stayed that way.

Quote:

Dani Azemar

Looks like you are correct. Not entirely sure what happened, but the code is obviously the source of truth, in the sense that the released checkpoints are for the released code. When we were developing the architectures we ran a lot of internal experiments, and I imagine at some point the stems were switched. Not sure I have time to dig deeper right now, but as I said, the released checkpoint is a checkpoint for the released code, which you can verify yourself by running the evaluation pipeline. I agree with you that it looks like it is using the original Inception V1 stem. Best regards,

Alex Alemi

I will keep updating this post as things develop on this topic.

Update: Christian Szegedy, also one of the original authors of the paper, just replied:

The original experiments and model were created in DistBelief, a completely different framework that predates TensorFlow.

The TF version was added a year later and could have differences from the original model, but it surely obtains similar results.

So, since it obtains similar results, your experiments should be roughly equivalent in practice.

Model: "inception_resnet_v2"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            (None, 512, 512, 3)  0
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 255, 255, 32) 864         input_1[0][0]
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 255, 255, 32) 96          conv2d_1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 255, 255, 32) 0           batch_normalization_1[0][0]
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 253, 253, 32) 9216        activation_1[0][0]
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 253, 253, 32) 96          conv2d_2[0][0]
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 253, 253, 32) 0           batch_normalization_2[0][0]
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 253, 253, 64) 18432       activation_2[0][0]
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 253, 253, 64) 192         conv2d_3[0][0]
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 253, 253, 64) 0           batch_normalization_3[0][0]
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 126, 126, 64) 0           activation_3[0][0]
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 126, 126, 80) 5120        max_pooling2d_1[0][0]
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 126, 126, 80) 240         conv2d_4[0][0]
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 126, 126, 80) 0           batch_normalization_4[0][0]
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 124, 124, 192 138240      activation_4[0][0]
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 124, 124, 192 576         conv2d_5[0][0]
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 124, 124, 192 0           batch_normalization_5[0][0]
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)  (None, 61, 61, 192)  0           activation_5[0][0]
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 61, 61, 64)   12288       max_pooling2d_2[0][0]
__________________________________________________________________________________________________
batch_normalization_9 (BatchNor (None, 61, 61, 64)   192         conv2d_9[0][0]
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 61, 61, 64)   0           batch_normalization_9[0][0]
__________________________________________________________________________________________________
conv2d_7 (Conv2D)               (None, 61, 61, 48)   9216        max_pooling2d_2[0][0]
__________________________________________________________________________________________________
conv2d_10 (Conv2D)              (None, 61, 61, 96)   55296       activation_9[0][0]
__________________________________________________________________________________________________
batch_normalization_7 (BatchNor (None, 61, 61, 48)   144         conv2d_7[0][0]
__________________________________________________________________________________________________
batch_normalization_10 (BatchNo (None, 61, 61, 96)   288         conv2d_10[0][0]
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 61, 61, 48)   0           batch_normalization_7[0][0]
__________________________________________________________________________________________________
activation_10 (Activation)      (None, 61, 61, 96)   0           batch_normalization_10[0][0]
__________________________________________________________________________________________________
average_pooling2d_1 (AveragePoo (None, 61, 61, 192)  0           max_pooling2d_2[0][0]
__________________________________________________________________________________________________
.
.
. 
many more lines
Model: "inception_v4"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            (None, 512, 512, 3)  0
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 255, 255, 32) 864         input_1[0][0]
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 255, 255, 32) 96          conv2d_1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 255, 255, 32) 0           batch_normalization_1[0][0]
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 253, 253, 32) 9216        activation_1[0][0]
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 253, 253, 32) 96          conv2d_2[0][0]
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 253, 253, 32) 0           batch_normalization_2[0][0]
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 253, 253, 64) 18432       activation_2[0][0]
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 253, 253, 64) 192         conv2d_3[0][0]
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 253, 253, 64) 0           batch_normalization_3[0][0]
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 126, 126, 96) 55296       activation_3[0][0]
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 126, 126, 96) 288         conv2d_4[0][0]
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 126, 126, 64) 0           activation_3[0][0]
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 126, 126, 96) 0           batch_normalization_4[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 126, 126, 160 0           max_pooling2d_1[0][0]
                                                                 activation_4[0][0]
__________________________________________________________________________________________________
conv2d_7 (Conv2D)               (None, 126, 126, 64) 10240       concatenate_1[0][0]
__________________________________________________________________________________________________
batch_normalization_7 (BatchNor (None, 126, 126, 64) 192         conv2d_7[0][0]
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 126, 126, 64) 0           batch_normalization_7[0][0]
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 126, 126, 64) 28672       activation_7[0][0]
__________________________________________________________________________________________________
.
.
.
and many more lines