Understanding why results differ between Keras and TensorFlow

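I'm currently trying to do some work in Keras and TensorFlow, and I stumbled upon a small thing I don't understand. In the code below, I try to predict the network's response either explicitly through a TensorFlow session or by calling the model's predict_on_batch function.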

import os
import keras
import numpy as np
import tensorflow as tf
from keras import backend as K
from keras.layers import Dense, Dropout, Flatten, Input
from keras.models import Model

# Try to standardize output
np.random.seed(1)
tf.set_random_seed(1)

# Building the model
inputs = Input(shape=(224,224,3))
base_model = keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', \
                                    input_tensor=inputs, input_shape=(224, 224, 3))
x = base_model.get_layer("fc2").output
x = Dropout(0.5, name='model_fc_dropout')(x)
x = Dense(2048, activation='sigmoid', name='final_fc')(x)
x = Dropout(0.5, name='final_fc_dropout')(x)
predictions = Dense(1, activation='sigmoid', name='fcout')(x)
model = Model(outputs=predictions, inputs=inputs)

##################################################################
model.compile(loss='binary_crossentropy',
          optimizer=tf.train.MomentumOptimizer(learning_rate=5e-4, momentum=0.9),
          metrics=['accuracy'])

image_batch = np.random.random((64,224,224,3))

# Outputs predicted by TF
outs = [predictions]
feed_dict={inputs:image_batch,  K.learning_phase():0}

init_op = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)

    outputs = sess.run(outs, feed_dict)[0]
    print(outputs.flatten())

# Outputs predicted by Keras
outputs = model.predict_on_batch(image_batch)
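print(outputs.flatten())

My problem is that I get two different results, even though I tried to remove every source of randomness by setting the seeds to 1 and running the operations on the CPU. Even so, I get the following results: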

[ 0.26079229  0.26078743  0.26079154  0.26079673  0.26078942  0.26079443
  0.26078886  0.26079088  0.26078972  0.26078728  0.26079121  0.26079452
  0.26078513  0.26078424  0.26079014  0.26079312  0.26079521  0.26078743
  0.26078558  0.26078537  0.26078674  0.26079136  0.26078632  0.26077667
  0.26079312  0.26078999  0.26079065  0.26078704  0.26078928  0.26078624
  0.26078892  0.26079202  0.26079065  0.26078689  0.26078963  0.26078749
  0.26078817  0.2607986   0.26078528  0.26078412  0.26079187  0.26079246
  0.26079226  0.26078457  0.26078099  0.26078072  0.26078376  0.26078475
  0.26078326  0.26079389  0.26079792  0.26078579  0.2607882   0.2607961
  0.26079237  0.26078218  0.26078638  0.26079753  0.2607787   0.26078618
  0.26078096  0.26078594  0.26078215  0.26079002]

Does anyone know what is happening in the background that changes the results? (These results do not change when the code is run again.)

If the network is run on a GPU (Titan X), the difference is even larger. For example, the second output is:

[ 0.3302682   0.33054096  0.32677746  0.32830611  0.32972822  0.32807562
  0.32850873  0.33161065  0.33009702  0.32811245  0.3285495   0.32966742
  0.33050382  0.33156893  0.3300975   0.3298254   0.33350074  0.32991216
  0.32990077  0.33203539  0.32692945  0.33036903  0.33102706  0.32648
  0.32933888  0.33161271  0.32976636  0.33252293  0.32859167  0.33013415
  0.33080408  0.33102706  0.32994759  0.33150592  0.32881773  0.33048317
  0.33040857  0.32924038  0.32986534  0.33131596  0.3282761   0.3292698
  0.32879189  0.33186096  0.32862625  0.33067161  0.329018    0.33022234
  0.32904804  0.32891914  0.33122411  0.32900628  0.33088413  0.32931429
  0.3268061   0.32924181  0.32940546  0.32860965  0.32828435  0.3310211
  0.33098024  0.32997403  0.33025959  0.33133432]

In the first case, by contrast, the difference only appears from the fifth decimal place onward:

[ 0.26075357  0.26074868  0.26074538  0.26075155  0.260755    0.26073951
  0.26074919  0.26073971  0.26074231  0.26075247  0.2607362   0.26075858
  0.26074955  0.26074123  0.26074299  0.26074946  0.26074076  0.26075014
  0.26074076  0.26075229  0.26075041  0.26074776  0.26075897  0.26073995
  0.260746    0.26074466  0.26073912  0.26075709  0.26075712  0.26073799
  0.2607322   0.26075566  0.26075059  0.26073873  0.26074558  0.26074558
  0.26074359  0.26073721  0.26074392  0.26074731  0.26074862  0.26074174
  0.26074126  0.26074588  0.26073804  0.26074919  0.26074269  0.26074606
  0.26075307  0.2607446   0.26074025  0.26074648  0.26074952  0.26073608
  0.26073566  0.26073873  0.26074576  0.26074475  0.26074636  0.26073411
  0.2607542   0.26074755  0.2607449   0.2607407 ]
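
To put a number on the CPU discrepancy, the two listings can be compared element by element. The following is a minimal sketch in plain Python that uses only the first three values of the first and the last listings above; `all_close` is an illustrative helper name, not something from the question.

```python
# First three values copied from the first and the last CPU listings above.
first_listing = [0.26079229, 0.26078743, 0.26079154]
last_listing = [0.26075357, 0.26074868, 0.26074538]

# Largest element-wise absolute difference between the two runs.
max_abs_diff = max(abs(a - b) for a, b in zip(first_listing, last_listing))


def all_close(xs, ys, tol):
    """Return True when every pair of values agrees to within tol."""
    return all(abs(a - b) <= tol for a, b in zip(xs, ys))


print(max_abs_diff)
print(all_close(first_listing, last_listing, 1e-4))  # agreement at the 4th decimal place
print(all_close(first_listing, last_listing, 1e-5))  # agreement at the 5th decimal place
```

For these three values the outputs agree to within 1e-4 but not to within 1e-5, which matches the observation that the mismatch starts at the fifth decimal place.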

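The results differ here because the initialization is different.

TF uses this init_op to initialize the variables:

sess.run(init_op)
But Keras uses its own init_op inside its model class, not the init_op defined in your code. The explicit tf.Session() is a new session, so the values produced by that init_op exist only there, while model.predict_on_batch runs in the session that Keras manages.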

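
The effect described in this answer can be illustrated with a toy analogy. This is not TensorFlow or Keras code: `ToySession` and its methods are hypothetical stand-ins, and the only assumption carried over from TensorFlow 1.x is that variable values belong to a session rather than to the graph. The sketch shows that the same computation gives different outputs when two sessions hold differently initialized values, and identical outputs when the same session is reused.

```python
import random


class ToySession:
    """Hypothetical stand-in for a TF1 session: it owns the variable values."""

    def __init__(self):
        self.weights = None

    def initialize(self, seed, size=4):
        # Plays the role of sess.run(init_op): fresh values for this session only.
        rng = random.Random(seed)
        self.weights = [rng.uniform(-1.0, 1.0) for _ in range(size)]

    def load(self, weights):
        # Plays the role of the framework setting up weights in its own session.
        self.weights = list(weights)

    def run(self, inputs):
        # The "graph": a plain dot product, identical for every session.
        return sum(w * x for w, x in zip(self.weights, inputs))


inputs = [0.5, -1.0, 2.0, 0.25]

# Session managed by the framework, holding the values it set up itself.
keras_session = ToySession()
keras_session.load([0.1, 0.2, 0.3, 0.4])

# A new session with its own initializer: it never sees the values above.
new_session = ToySession()
new_session.initialize(seed=1)

out_keras = keras_session.run(inputs)
out_new = new_session.run(inputs)
out_shared = keras_session.run(inputs)  # same session, same values

print(out_keras, out_new, out_shared)
```

With the real libraries, the analogous step would presumably be to evaluate the output tensor in the session Keras already holds (for example the one returned by `K.get_session()`) rather than in a new `tf.Session()` with its own `init_op`. This is a suggestion under that assumption, not something the answer spells out.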