Python: Turning off softmax in a TensorFlow model

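All I want to do is download one of TensorFlow's built-in models (through Keras) and turn off the softmax on the output layer, i.e. replace it with a linear activation function, so that my output features are the activations of the output layer before softmax is applied.

So I take VGG16 as the model and call it base_model: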

import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from tensorflow.keras.preprocessing import image
base_model = VGG16()

I looked at the last layer like this:

base_model.get_layer('predictions').get_config()

and got:

{'name': 'predictions',
 'trainable': True,
 'dtype': 'float32',
 'units': 1000,
 'activation': 'softmax',
 'use_bias': True,
 'kernel_initializer': {'class_name': 'GlorotUniform',
  'config': {'seed': None, 'dtype': 'float32'}},
 'bias_initializer': {'class_name': 'Zeros', 'config': {'dtype': 'float32'}},
 'kernel_regularizer': None,
 'bias_regularizer': None,
 'activity_regularizer': None,
 'kernel_constraint': None,
 'bias_constraint': None}

I then did this to switch the activation function:

base_model.get_layer('predictions').activation = tf.compat.v1.keras.activations.linear

It looked like that had worked, because:

base_model.get_layer('predictions').get_config()

gives:

{'name': 'predictions',
 'trainable': True,
 'dtype': 'float32',
 'units': 1000,
 'activation': 'linear',
 'use_bias': True,
 'kernel_initializer': {'class_name': 'GlorotUniform',
  'config': {'seed': None, 'dtype': 'float32'}},
 'bias_initializer': {'class_name': 'Zeros', 'config': {'dtype': 'float32'}},
 'kernel_regularizer': None,
 'bias_regularizer': None,
 'activity_regularizer': None,
 'kernel_constraint': None,
 'bias_constraint': None}.

But when I load an image:

filename = 'test_data/ILSVRC2012_val_00001218.JPEG'
img = image.load_img(filename, target_size=(224, 224)) # loads image
x = image.img_to_array(img) # converts to a numpy array
x = np.expand_dims(x, axis=0) # batches images
x = preprocess_input(x) # prepare the image for the VGG model

and run a prediction on it to get my features:

features = base_model.predict(x)

the features still sum to 1, i.e. they look as if softmax has normalised them to 1:

sum(features[0])

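This gives 1.0000000321741935, which is exactly the same number I get when the layer uses the softmax activation.

I also tried copying the config dictionary, setting its activation to 'linear', and calling set_config on the output layer.

Turning off softmax seems surprisingly hard in TensorFlow: in Caffe you can switch the activation function of a pre-trained model by changing a single line in the deploy file, so I really don't understand why it is so difficult here. I am in the middle of switching my code from Caffe to TensorFlow because I thought it would be easier to get pre-trained models with tf, but this problem is making me reconsider.

I suppose I could rip off the predictions layer and replace it with a brand-new layer with the same settings (and load the old weights into it), but I'm sure there must be a way to just edit the predictions layer.

I am currently on TensorFlow 1.14.0. I plan to upgrade to 2.0, but I don't think using TensorFlow 1 is the problem here.

Can somebody tell me how to turn off the softmax? This ought to be a simple thing; I have spent hours on it and even joined Stack Overflow just to get this one problem solved.

Thanks in advance for any help.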

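Unfortunately, Keras isn't designed around "complex" things like making specific modifications to an existing network. I believe it is possible to get the output before the activation, but it involves traversing the op graph and is not very straightforward. I tried to do this once, found it too difficult, and solved my problem another way.

If you were building your own model, you could make the activation a separate layer and then pop that layer off whenever you like. Since you are using a pre-made model, where softmax is part of the final Dense layer, you can't do that.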
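For readers who do control the model definition, the separate-activation idea might look like the sketch below. It is only an illustration: the toy architecture, the layer names ("hidden", "logits", "softmax") and the input size are made up here and have nothing to do with VGG16. It assumes TensorFlow 2 with the Keras functional API.

```python
import numpy as np
import tensorflow as tf

# Toy classifier (illustrative sizes and names): the last Dense layer emits
# raw logits, and softmax is applied by its own, separate layer.
inputs = tf.keras.Input(shape=(8,))
hidden = tf.keras.layers.Dense(16, activation="relu", name="hidden")(inputs)
logits = tf.keras.layers.Dense(5, name="logits")(hidden)
probs = tf.keras.layers.Softmax(name="softmax")(logits)
model = tf.keras.Model(inputs=inputs, outputs=probs)

# "Popping" the activation: a second model that stops at the logits layer.
# It reuses the same layers and weights, so nothing has to be copied.
logits_model = tf.keras.Model(
    inputs=model.input, outputs=model.get_layer("logits").output
)

x = np.random.rand(4, 8).astype("float32")
p = model.predict(x, verbose=0)          # rows sum to 1
z = logits_model.predict(x, verbose=0)   # raw pre-softmax activations
```

Because both models share layers, training `model` also updates what `logits_model` returns.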

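Depending on your situation, I can see two options:

If you want a quick hack that is imperfect but probably good enough, you can simply calculate what the pre-softmax values would have been. Softmax is a well-defined equation, so you can build its inverse and apply it to the softmax output. This won't reproduce the exact values (the logits can only be recovered up to a constant shift, and floating-point precision is lost along the way), but it should be close enough in many cases.

If you want a stable, maintainable solution, just create a new layer without the activation and copy the weights over. I agree that it feels odd to do it this way, but it's really not that hard, and I can't think of any good reason not to.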
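The first option can be sketched in a few lines. Taking the logarithm of a softmax output gives log p_i = z_i - log(sum_j exp(z_j)), so the log-probabilities equal the original logits minus one constant that is shared by every class. The sketch below is plain Python so it runs without TensorFlow; the function names and the `eps` clamp are illustrative choices, not part of any library.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_from_softmax(probs, eps=1e-12):
    """Recover logits up to an additive constant by taking log(p).

    eps is a hypothetical safety clamp: a probability that underflowed
    to 0 cannot be recovered, it is only kept away from log(0).
    """
    return [math.log(max(p, eps)) for p in probs]

logits = [2.0, -1.0, 0.5]
probs = softmax(logits)
recovered = logits_from_softmax(probs)

# Every class is shifted by the same constant, so differences between
# classes (and the ranking) are preserved.
offsets = [z - r for z, r in zip(logits, recovered)]
```

Applied to the question's output, this would be called on `features[0]`. Whether a shifted copy of the logits is good enough depends on what the features are used for.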

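As mentioned above, you can always invert the softmax operation, which should be straightforward. But if you still want to change the activation, you will have to copy the weights to a new layer: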

import tensorflow as tf

model = tf.keras.applications.ResNet50()
assert model.layers[-1].activation == tf.keras.activations.softmax

# Save the final layer's settings and weights (.numpy() needs eager execution)
config = model.layers[-1].get_config()
weights = [x.numpy() for x in model.layers[-1].weights]

# Same layer configuration, but with a linear activation and a new name
config['activation'] = tf.keras.activations.linear
config['name'] = 'logits'

# Attach the new layer to the output of the second-to-last layer
new_layer = tf.keras.layers.Dense(**config)(model.layers[-2].output)
new_model = tf.keras.Model(inputs=[model.input], outputs=[new_layer])
# Load the trained weights into the new layer
new_model.layers[-1].set_weights(weights)

assert new_model.layers[-1].activation == tf.keras.activations.linear

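Thank you so much, Srihari! This code does exactly what I needed. For anyone who wants to do this in TensorFlow 1.14: I had to use eager execution, so I added these commands at the top of my file: `tf.enable_eager_execution()` and `tfe = tf.contrib.eager`. I think Srihari's code should work as it is in TensorFlow 2, though.

Do you really need to copy the weights to a new layer? Why?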