Python binary classifier always returns 0.5


I am training a classifier that takes an RGB input (three values from 0 to 255) and returns whether a black or white font (0 or 1) fits that color best. After training, my classifier always returns 0.5 (or thereabouts) and never gets any more accurate than that.

Here is the code:

import tensorflow as tf
import numpy as np
from tqdm import tqdm

print('Creating Datasets:')

x_train = []
y_train = []

for i in tqdm(range(10000)):
    x_train.append([np.random.uniform(0, 255), np.random.uniform(0, 255), np.random.uniform(0, 255)])

for elem in tqdm(x_train):
    if (((elem[0] + elem[1] + elem[2]) / 3) / 255) > 0.5:
        y_train.append(0)
    else:
        y_train.append(1)

x_train = np.array(x_train)
y_train = np.array(y_train)

graph = tf.Graph()

with graph.as_default():

    x = tf.placeholder(tf.float32)
    y = tf.placeholder(tf.float32)

    w_1 = tf.Variable(tf.random_normal([3, 10], stddev=1.0), tf.float32)
    b_1 = tf.Variable(tf.random_normal([10]), tf.float32)
    l_1 = tf.sigmoid(tf.matmul(x, w_1) + b_1)

    w_2 = tf.Variable(tf.random_normal([10, 10], stddev=1.0), tf.float32)
    b_2 = tf.Variable(tf.random_normal([10]), tf.float32)
    l_2 = tf.sigmoid(tf.matmul(l_1, w_2) + b_2)

    w_3 = tf.Variable(tf.random_normal([10, 5], stddev=1.0), tf.float32)
    b_3 = tf.Variable(tf.random_normal([5]), tf.float32)
    l_3 = tf.sigmoid(tf.matmul(l_2, w_3) + b_3)

    w_4 = tf.Variable(tf.random_normal([5, 1], stddev=1.0), tf.float32)
    b_4 = tf.Variable(tf.random_normal([1]), tf.float32)
    y_ = tf.sigmoid(tf.matmul(l_3, w_4) + b_4)

    loss = tf.reduce_mean(tf.squared_difference(y, y_))

    optimizer = tf.train.AdadeltaOptimizer().minimize(loss)

    with tf.Session() as sess:

        sess.run(tf.global_variables_initializer())

        print('Training:')

        for step in tqdm(range(5000)):
            index = np.random.randint(0, len(x_train) - 129)
            feed_dict = {x : x_train[index:index+128], y : y_train[index:index+128]}
            sess.run(optimizer, feed_dict=feed_dict)
            if step % 1000 == 0:
                print(sess.run([loss], feed_dict=feed_dict))

        while True:
            inp1 = int(input(''))
            inp2 = int(input(''))
            inp3 = int(input(''))
            print(sess.run(y_, feed_dict={x : [[inp1, inp2, inp3]]}))
As you can see, I start by importing the modules I will be using. Next, I generate the input dataset x_train and the desired output dataset y_train. x_train consists of 10,000 random RGB values, while y_train consists of 0s and 1s: a 1 corresponds to an RGB value whose average is below 128, and a 0 to an RGB value whose average is above 128 (this ensures that bright backgrounds get a dark font and vice versa).

Admittedly, my neural network is more complex than it needs to be (or so I thought), but as far as I know it is a fairly standard feedforward network with an Adadelta optimizer at its default learning rate.

As far as my limited knowledge can tell, the network trains normally, and yet the model always spits out 0.5.

The final block of code lets the user enter values and see what they turn into when passed through the neural network.

I have messed around with different activation functions, losses, ways of initializing the biases, and so on, all to no avail. Sometimes, when I modify the code, the model instead returns a constant 1 or 0, but that is still just as inaccurate as indecisively returning 0.5. I have not been able to find a suitable solution to the problem online. Any input or suggestions are welcome.

Edit:


The loss, weights, biases, and outputs barely change during training (the weights and biases change by only hundredths and thousandths every 1000 iterations, and the loss hovers around 0.3). Also, the output sometimes varies with the input (as you would expect), but at other times it is constant. One run of the program produced a constant 0.7 as the output, while another always returned 0.5 except for inputs very close to zero, where it returned values around 0.3 or 0.4. Neither of these is the desired output. What should happen is that (255, 255, 255) maps to 0, (0, 0, 0) maps to 1, and (128, 128, 128) maps to either 1 or 0, since at that point the font color does not really matter.
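For reference, here is a minimal sanity check of the intended mapping (my own sketch, not part of the original question; font_label is a hypothetical helper name):

import numpy as np

# Hypothetical helper encoding the labeling rule from the question:
# 1 -> white font (dark background), 0 -> black font (bright background)
def font_label(rgb):
    return 1 if (np.mean(rgb) / 255.0) <= 0.5 else 0

assert font_label((0, 0, 0)) == 1        # black background -> white font
assert font_label((255, 255, 255)) == 0  # white background -> black font
# (128, 128, 128) sits right at the boundary; either label is acceptable
print(font_label((128, 128, 128)))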

Two things that I see with your network:

  • Sigmoid activations in the hidden layers are usually a bad choice. The sigmoid function saturates for large (positive or negative) inputs, causing the gradients to become smaller and smaller as they are backpropagated through the network. This is usually called the "vanishing gradient" problem. It may be that the gradients of the variables near the output are "healthy", so the upper layers are learning, but if the lower layers receive no gradient, they will keep returning random values that the upper layers cannot work with. You can try replacing the sigmoid activations with, for example, tf.nn.relu. A sigmoid in the output layer is fine (if you want your outputs to be 0/1), but consider using cross entropy instead of squared error as the loss function.
  • The weight initialization can lead to weights that are far too large. A standard deviation of 1.0 is much too high. This can cause numerical problems and also make the activations more saturated (since, with large weights, you can expect large activation values from the very start). Try a standard deviation more like 0.1, and consider using truncated_normal to prevent outliers (or use uniform random initialization), as in the sketch after this list.
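For concreteness, here is a minimal sketch (mine, not from the answer) of both changes applied to the first hidden layer of the question's code; the remaining layers would change the same way, and zero-initialized biases are a common companion choice:

    # truncated normal with a small stddev keeps initial activations small
    w_1 = tf.Variable(tf.truncated_normal([3, 10], stddev=0.1))
    b_1 = tf.Variable(tf.zeros([10]))
    # relu does not saturate for positive inputs, so gradients keep flowing
    l_1 = tf.nn.relu(tf.matmul(x, w_1) + b_1)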

It is hard to say whether these will fix your problem, but I believe both are things you have to change about your network as it stands.

The biggest problem is that you are using mean squared error as the loss function on a classification problem. The cross entropy loss function is much better suited to this kind of problem.

Here is a visualization of the difference between the cross entropy loss function and the mean squared error loss function:

[figure: cross entropy loss vs. mean squared error loss as a function of the predicted probability]

Notice how the loss increases asymptotically the further the model gets from the correct prediction (in this case, 1). This curvature provides a much stronger gradient signal during backpropagation, while also satisfying many important theoretical properties of distances (divergences) between probability distributions. By minimizing the cross entropy loss you are in fact also minimizing the KL divergence between your model's predicted distribution and the distribution of the training data labels. You can read more about the cross entropy loss function here.
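To make the difference concrete, here is a small sketch (mine, not from the answer) comparing the two losses and their gradients for a single example whose true label is 1:

    import numpy as np

    # predicted probability p for the correct class, from badly wrong to nearly right
    p = np.array([0.01, 0.1, 0.5, 0.9, 0.99])
    mse = (1 - p) ** 2          # squared error loss
    xent = -np.log(p)           # cross entropy loss
    # gradients with respect to p: the squared-error gradient is bounded (|d| <= 2),
    # while the cross entropy gradient grows without bound as p -> 0, so a badly
    # wrong model keeps receiving a strong training signal
    d_mse = -2 * (1 - p)
    d_xent = -1 / p
    for row in zip(p, mse, xent, d_mse, d_xent):
        print('p={:.2f}  mse={:.4f}  xent={:.4f}  d_mse={:+.2f}  d_xent={:+.2f}'.format(*row))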

I also adjusted a few other things to make the code nicer and the model easier to modify. This should take care of all of your problems:

    import tensorflow as tf
    import numpy as np
    from tqdm import tqdm
    
    # define a random seed for (somewhat) reproducible results:
    seed = 0
    np.random.seed(seed)
    print('Creating Datasets:')
    
    # much faster dataset creation
    x_train = np.random.uniform(low=0, high=255, size=[10000, 3])
    # easier label creation
    # if the average color is greater than half the color space then use black, otherwise use white
    # classes:
    # white = 0
    # black = 1
    y_train = ((np.mean(x_train, axis=1) / 255.0) > 0.5).astype(int)
    
    # now transform dataset to be within range [-1, 1] instead of [0, 255] 
    # for numeric stability and quicker model training
    x_train = (2 * (x_train / 255)) - 1
    
    graph = tf.Graph()
    
    with graph.as_default():
        # must do this within graph scope
        tf.set_random_seed(seed)
        # specify input dims for clarity
        x = tf.placeholder(tf.float32, shape=[None, 3])
        # y is now integer label [0 or 1]
        y = tf.placeholder(tf.int32, shape=[None])
        # use relu, usually better than sigmoid 
        activation_fn = tf.nn.relu
        # from https://arxiv.org/abs/1502.01852v1
        initializer = tf.initializers.variance_scaling(
            scale=2.0, 
            mode='fan_in',
            distribution='truncated_normal')
        # better api to reduce clutter
        l_1 = tf.layers.dense(
            x,
            10,
            activation=activation_fn,
            kernel_initializer=initializer)
        l_2 = tf.layers.dense(
            l_1,
            10,
            activation=activation_fn,
            kernel_initializer=initializer)
        l_3 = tf.layers.dense(
            l_2,
            5,
            activation=activation_fn,
            kernel_initializer=initializer)
        y_logits = tf.layers.dense(
            l_3,
            2,
            activation=None,
            kernel_initializer=initializer)
    
        y_ = tf.nn.softmax(y_logits)
        # much better loss function for classification
        loss = tf.reduce_mean(
            tf.losses.sparse_softmax_cross_entropy(
                labels=y, 
                logits=y_logits))
        # much better default optimizer for new problems
        # good learning rate, but probably can tune
        optimizer = tf.train.AdamOptimizer(
            learning_rate=0.01)
    # separate train op for easier calling
        train_op = optimizer.minimize(loss)
    
        # tell tensorflow not to allocate all gpu memory at start
        config = tf.ConfigProto()
        config.gpu_options.allow_growth=True
        with tf.Session(config=config) as sess:
    
            sess.run(tf.global_variables_initializer())
    
            print('Training:')
    
            for step in tqdm(range(5000)):
                index = np.random.randint(0, len(x_train) - 129)
                feed_dict = {x : x_train[index:index+128], 
                             y : y_train[index:index+128]}
                # can train and get loss in single run, much more efficient
                _, b_loss = sess.run([train_op, loss], feed_dict=feed_dict)
                if step % 1000 == 0:
                    print(b_loss)
    
            while True:
                inp1 = int(input('Enter R pixel color: '))
                inp2 = int(input('Enter G pixel color: '))
                inp3 = int(input('Enter B pixel color: '))
                # scale to model train range [-1, 1]
                model_input = (2 * (np.array([inp1, inp2, inp3], dtype=float) / 255.0)) - 1
                if (model_input >= -1).all() and (model_input <= 1).all():
                    # y_ is now two probabilities (white_prob, black_prob) but they will sum to 1.
                    white_prob, black_prob = sess.run(y_, feed_dict={x : [model_input]})[0]
                    print('White prob: {:.2f} Black prob: {:.2f}'.format(white_prob, black_prob))
                else:
                    print('Values not within [0, 255]!')
    
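Here is the output from a training run: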
    Creating Datasets:
    2018-10-05 00:50:59.156822: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
    2018-10-05 00:50:59.411003: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1405] Found device 0 with properties:
    name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
    pciBusID: 0000:03:00.0
    totalMemory: 8.00GiB freeMemory: 6.60GiB
    2018-10-05 00:50:59.417736: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1484] Adding visible gpu devices: 0
    2018-10-05 00:51:00.109351: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
    2018-10-05 00:51:00.113660: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:971]      0
    2018-10-05 00:51:00.118545: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:984] 0:   N
    2018-10-05 00:51:00.121605: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6370 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:03:00.0, compute capability: 6.1)
    Training:
      0%|                                                                                         | 0/5000 [00:00<?, ?it/s]0.6222609
     19%|██████████████▋                                                               | 940/5000 [00:01<00:14, 275.57it/s]0.013466636
     39%|██████████████████████████████                                               | 1951/5000 [00:02<00:04, 708.07it/s]0.0067519126
     59%|█████████████████████████████████████████████▊                               | 2971/5000 [00:04<00:02, 733.24it/s]0.0028143923
     79%|████████████████████████████████████████████████████████████▌                | 3935/5000 [00:05<00:01, 726.36it/s]0.0073514087
    100%|█████████████████████████████████████████████████████████████████████████████| 5000/5000 [00:07<00:00, 698.32it/s]
    Enter R pixel color: 1
    Enter G pixel color: 1
    Enter B pixel color: 1
    White prob: 1.00 Black prob: 0.00
    Enter R pixel color: 255
    Enter G pixel color: 255
    Enter B pixel color: 255
    White prob: 0.00 Black prob: 1.00
    Enter R pixel color: 128
    Enter G pixel color: 128
    Enter B pixel color: 128
    White prob: 0.08 Black prob: 0.92
    Enter R pixel color: 126
    Enter G pixel color: 126
    Enter B pixel color: 126
    White prob: 0.99 Black prob: 0.01
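Note that these outputs line up with the labeling rule in the code: the decision boundary sits at an average value of 127.5, so (128, 128, 128) lands just on the black-font side while (126, 126, 126) lands on the white-font side.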