
Python: Why is the accuracy of my retrained model worse? (TensorFlow, Machine Learning)


I am trying to retrain the last layer of a pre-trained model using the same dataset (the MNIST handwritten digit dataset), but the retrained model's accuracy is much worse than the initial model's. My initial model reaches about 98% accuracy, while the retrained model's accuracy varies between 40% and 80% depending on the run. I get similar results when I don't train the first two layers at all.

Here is an image of what I am trying to do.

And the code:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

epochs1 = 150
epochs2 = 300
batch_size = 11000
learning_rate1 = 1e-3
learning_rate2 = 1e-4

# Base model
def base_model(input, reuse=False):
    with tf.variable_scope('base_model', reuse=reuse):
        layer1 = tf.contrib.layers.fully_connected(input, 300)
        features = tf.contrib.layers.fully_connected(layer1, 300)
        return features


mnist = input_data.read_data_sets('./mnist/', one_hot=True)

image = tf.placeholder(tf.float32, [None, 784])
label = tf.placeholder(tf.float32, [None, 10])

features1 = base_model(image, reuse=False)
features2 = base_model(image, reuse=True)

# Logits1 trained with the base model
with tf.variable_scope('logits1', reuse=False):
    logits1 = tf.contrib.layers.fully_connected(features1, 10, tf.nn.relu)

# Logits2 trained while the base model is frozen
with tf.variable_scope('logits2', reuse=False):
    logits2 = tf.contrib.layers.fully_connected(features2, 10, tf.nn.relu)

# Var Lists
var_list_partial1 = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='logits1')
var_list_partial2 = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='base_model')
var_list1 = var_list_partial1 + var_list_partial2
var_list2 = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='logits2')

# Sanity check
print("var_list1:", var_list1)
print("var_list2:", var_list2)

# Cross Entropy Losses
loss1 = tf.nn.softmax_cross_entropy_with_logits(logits=logits1, labels=label)
loss2 = tf.nn.softmax_cross_entropy_with_logits(logits=logits2, labels=label)

# Train the final logits layer
train1 = tf.train.AdamOptimizer(learning_rate1).minimize(loss1, var_list=var_list1)
train2 = tf.train.AdamOptimizer(learning_rate2).minimize(loss2, var_list=var_list2)

# Accuracy operations
correct_prediction1 = tf.equal(tf.argmax(logits1, 1), tf.argmax(label, 1))
correct_prediction2 = tf.equal(tf.argmax(logits2, 1), tf.argmax(label, 1))
accuracy1 = tf.reduce_mean(tf.cast(correct_prediction1, "float"))
accuracy2 = tf.reduce_mean(tf.cast(correct_prediction2, "float"))

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    batches = int(len(mnist.train.images) / batch_size)

    # Train base model and logits1
    for epoch in range(epochs1):
        for batch in range(batches):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            sess.run(train1, feed_dict={image: batch_xs, label: batch_ys})

    # Train logits2 keeping the base model frozen
    for epoch in range(epochs2):
        for batch in range(batches):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            sess.run(train2, feed_dict={image: batch_xs, label: batch_ys})

    # Print both models' test accuracy after training
    accuracy = sess.run(accuracy1, feed_dict={image: mnist.test.images, label: mnist.test.labels})
    print("Initial Model Accuracy After training final model:", accuracy)
    accuracy = sess.run(accuracy2, feed_dict={image: mnist.test.images, label: mnist.test.labels})
    print("Final Model Accuracy After Training:", accuracy)

Thanks in advance.

Try removing the nonlinearity from 'logits1' and 'logits2'.

I changed your code to:

# Logits1 trained with the base model
with tf.variable_scope('logits1', reuse=False):
    #logits1 = tf.contrib.layers.fully_connected(features1, 10, tf.nn.relu)
    logits1 = tf.contrib.layers.fully_connected(features1, 10, None)

# Logits2 trained while the base model is frozen
with tf.variable_scope('logits2', reuse=False):
    #logits2 = tf.contrib.layers.fully_connected(features2, 10, tf.nn.relu)
    logits2 = tf.contrib.layers.fully_connected(features2, 10, None)

The results changed to:

Initial Model Accuracy After training final model: 0.9805
Final Model Accuracy After Training: 0.9658

P.S. Also, 300 + 300 neurons is far more than an MNIST classifier needs, but I assume classifying MNIST isn't really your point :)

I'm not 100% sure what you mean by the nonlinearity. Could you give an example or elaborate? I've edited my post to show how I tried to remove it, but I'm not sure that's what you meant. You're also right that this is just a toy example I'm testing before scaling it up.

That worked, but how did you know to do that? What is the theory behind this solution?

The theory is simple: you should not apply a ReLU right before the softmax :) I ran your code several times, and the first model's accuracy varied between 0.7 and 0.98 from one run to the next. That is a very wide range, so the first model is not robust. I looked at the model and noticed the nonlinearity right before the softmax.
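To make the point about a ReLU before the softmax concrete, here is a minimal NumPy sketch (not from the original thread; the logit values are made up purely for illustration). A ReLU clamps every negative logit to zero, so the classifier can no longer express "this class is very unlikely", and whole groups of classes collapse to identical probabilities:

import numpy as np

def softmax(z):
    # numerically stable softmax over a 1-D array of logits
    e = np.exp(z - z.max())
    return e / e.sum()

# hypothetical raw scores produced by the final fully connected layer
logits = np.array([3.0, -2.0, -5.0, 0.5, -1.0])

# without ReLU: class 0 clearly dominates, the negative classes get tiny probabilities
print(softmax(logits))

# with ReLU applied first: the three negative logits are all clamped to 0,
# so classes 1, 2 and 4 end up with identical (and inflated) probabilities
print(softmax(np.maximum(logits, 0.0)))

This is why `tf.contrib.layers.fully_connected(..., 10, None)` (no activation) is the right choice for the logits layer when the loss already applies a softmax, as `tf.nn.softmax_cross_entropy_with_logits` does.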