Python binary classifier always returns 0.5


I am training a classifier that takes an RGB input (three values from 0 to 255) and returns whether a black or white font (0 or 1) fits that color best. After training, my classifier always returns 0.5 (or thereabouts) and never gets any more accurate than that.

Here is the code:

import tensorflow as tf
import numpy as np
from tqdm import tqdm

print('Creating Datasets:')

x_train = []
y_train = []

for i in tqdm(range(10000)):
    x_train.append([np.random.uniform(0, 255), np.random.uniform(0, 255), np.random.uniform(0, 255)])

for elem in tqdm(x_train):
    if (((elem[0] + elem[1] + elem[2]) / 3) / 255) > 0.5:
        y_train.append(0)
    else:
        y_train.append(1)

x_train = np.array(x_train)
y_train = np.array(y_train)

graph = tf.Graph()

with graph.as_default():

    x = tf.placeholder(tf.float32)
    y = tf.placeholder(tf.float32)

    w_1 = tf.Variable(tf.random_normal([3, 10], stddev=1.0), tf.float32)
    b_1 = tf.Variable(tf.random_normal([10]), tf.float32)
    l_1 = tf.sigmoid(tf.matmul(x, w_1) + b_1)

    w_2 = tf.Variable(tf.random_normal([10, 10], stddev=1.0), tf.float32)
    b_2 = tf.Variable(tf.random_normal([10]), tf.float32)
    l_2 = tf.sigmoid(tf.matmul(l_1, w_2) + b_2)

    w_3 = tf.Variable(tf.random_normal([10, 5], stddev=1.0), tf.float32)
    b_3 = tf.Variable(tf.random_normal([5]), tf.float32)
    l_3 = tf.sigmoid(tf.matmul(l_2, w_3) + b_3)

    w_4 = tf.Variable(tf.random_normal([5, 1], stddev=1.0), tf.float32)
    b_4 = tf.Variable(tf.random_normal([1]), tf.float32)
    y_ = tf.sigmoid(tf.matmul(l_3, w_4) + b_4)

    loss = tf.reduce_mean(tf.squared_difference(y, y_))

    optimizer = tf.train.AdadeltaOptimizer().minimize(loss)

    with tf.Session() as sess:

        sess.run(tf.global_variables_initializer())

        print('Training:')

        for step in tqdm(range(5000)):
            index = np.random.randint(0, len(x_train) - 129)
            feed_dict = {x : x_train[index:index+128], y : y_train[index:index+128]}
            sess.run(optimizer, feed_dict=feed_dict)
            if step % 1000 == 0:
                print(sess.run([loss], feed_dict=feed_dict))

        while True:
            inp1 = int(input(''))
            inp2 = int(input(''))
            inp3 = int(input(''))
            print(sess.run(y_, feed_dict={x : [[inp1, inp2, inp3]]}))
As you can see, I start by importing the modules I will be using. Next, I generate the input dataset x_train and the desired output dataset y_train. x_train consists of 10,000 random RGB values, while y_train consists of 0s and 1s: a 1 corresponds to an RGB value whose average is below 128, and a 0 to an RGB value whose average is above 128 (this ensures that bright backgrounds get a dark font and vice versa).

Admittedly, my neural network is more complex than it needs to be (or so I thought), but as far as I know it is a fairly standard feedforward network with an Adadelta optimizer at its default learning rate.

As far as my limited knowledge can tell, the network trains normally, and yet the model always spits out 0.5.

The final block of code lets the user enter values and see what they turn into when passed through the neural network.

I have messed around with different activation functions, losses, ways of initializing the biases, and so on, all to no avail. Sometimes, when I modify the code, the model instead returns a constant 1 or 0, but that is still just as inaccurate as indecisively returning 0.5. I have not been able to find a suitable solution to the problem online. Any input or suggestions are welcome.

Edit:


The loss, weights, biases, and outputs barely change during training (the weights and biases change by only hundredths and thousandths every 1000 iterations, and the loss hovers around 0.3). Also, the output sometimes varies with the input (as you would expect), but at other times it is constant. One run of the program produced a constant 0.7 as the output, while another always returned 0.5 except for inputs very close to zero, where it returned values around 0.3 or 0.4. Neither of these is the desired output. What should happen is that (255, 255, 255) maps to 0, (0, 0, 0) maps to 1, and (128, 128, 128) maps to either 1 or 0, since at that point the font color does not really matter.
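For reference, here is a minimal sanity check of the intended mapping (my own sketch, not part of the original question; font_label is a hypothetical helper name):

import numpy as np

# Hypothetical helper encoding the labeling rule from the question:
# 1 -> white font (dark background), 0 -> black font (bright background)
def font_label(rgb):
    return 1 if (np.mean(rgb) / 255.0) <= 0.5 else 0

assert font_label((0, 0, 0)) == 1        # black background -> white font
assert font_label((255, 255, 255)) == 0  # white background -> black font
# (128, 128, 128) sits right at the boundary; either label is acceptable
print(font_label((128, 128, 128)))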

Two things that I see with your network:

  • Sigmoid activations in the hidden layers are usually a bad choice. The sigmoid function saturates for large (positive or negative) inputs, causing the gradients to become smaller and smaller as they are backpropagated through the network. This is usually called the "vanishing gradient" problem. It may be that the gradients of the variables near the output are "healthy", so the upper layers are learning, but if the lower layers receive no gradient, they will keep returning random values that the upper layers cannot work with. You can try replacing the sigmoid activations with, for example, tf.nn.relu. A sigmoid in the output layer is fine (if you want your outputs to be 0/1), but consider using cross entropy instead of squared error as the loss function.
  • The weight initialization can lead to weights that are far too large. A standard deviation of 1.0 is much too high. This can cause numerical problems and also make the activations more saturated (since, with large weights, you can expect large activation values from the very start). Try a standard deviation more like 0.1, and consider using truncated_normal to prevent outliers (or use uniform random initialization), as in the sketch after this list.
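For concreteness, here is a minimal sketch (mine, not from the answer) of both changes applied to the first hidden layer of the question's code; the remaining layers would change the same way, and zero-initialized biases are a common companion choice:

    # truncated normal with a small stddev keeps initial activations small
    w_1 = tf.Variable(tf.truncated_normal([3, 10], stddev=0.1))
    b_1 = tf.Variable(tf.zeros([10]))
    # relu does not saturate for positive inputs, so gradients keep flowing
    l_1 = tf.nn.relu(tf.matmul(x, w_1) + b_1)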

It is hard to say whether these will fix your problem, but I believe both are things you have to change about your network as it stands.

The biggest problem is that you are using mean squared error as the loss function on a classification problem. The cross entropy loss function is much better suited to this kind of problem.

Here is a visualization of the difference between the cross entropy loss function and the mean squared error loss function:

[figure: cross entropy loss vs. mean squared error loss as a function of the predicted probability]

Notice how the loss increases asymptotically the further the model gets from the correct prediction (in this case, 1). This curvature provides a much stronger gradient signal during backpropagation, while also satisfying many important theoretical properties of distances (divergences) between probability distributions. By minimizing the cross entropy loss you are in fact also minimizing the KL divergence between your model's predicted distribution and the distribution of the training data labels. You can read more about the cross entropy loss function here.
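To make the difference concrete, here is a small sketch (mine, not from the answer) comparing the two losses and their gradients for a single example whose true label is 1:

    import numpy as np

    # predicted probability p for the correct class, from badly wrong to nearly right
    p = np.array([0.01, 0.1, 0.5, 0.9, 0.99])
    mse = (1 - p) ** 2          # squared error loss
    xent = -np.log(p)           # cross entropy loss
    # gradients with respect to p: the squared-error gradient is bounded (|d| <= 2),
    # while the cross entropy gradient grows without bound as p -> 0, so a badly
    # wrong model keeps receiving a strong training signal
    d_mse = -2 * (1 - p)
    d_xent = -1 / p
    for row in zip(p, mse, xent, d_mse, d_xent):
        print('p={:.2f}  mse={:.4f}  xent={:.4f}  d_mse={:+.2f}  d_xent={:+.2f}'.format(*row))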

I also adjusted a few other things to make the code nicer and the model easier to modify. This should take care of all of your problems:

    import tensorflow as tf
    import numpy as np
    from tqdm import tqdm
    
    # define a random seed for (somewhat) reproducible results:
    seed = 0
    np.random.seed(seed)
    print('Creating Datasets:')
    
    # much faster dataset creation
    x_train = np.random.uniform(low=0, high=255, size=[10000, 3])
    # easier label creation
    # if the average color is greater than half the color space then use black, otherwise use white
    # classes:
    # white = 0
    # black = 1
    y_train = ((np.mean(x_train, axis=1) / 255.0) > 0.5).astype(int)
    
    # now transform dataset to be within range [-1, 1] instead of [0, 255] 
    # for numeric stability and quicker model training
    x_train = (2 * (x_train / 255)) - 1
    
    graph = tf.Graph()
    
    with graph.as_default():
        # must do this within graph scope
        tf.set_random_seed(seed)
        # specify input dims for clarity
        x = tf.placeholder(tf.float32, shape=[None, 3])
        # y is now integer label [0 or 1]
        y = tf.placeholder(tf.int32, shape=[None])
        # use relu, usually better than sigmoid 
        activation_fn = tf.nn.relu
        # from https://arxiv.org/abs/1502.01852v1
        initializer = tf.initializers.variance_scaling(
            scale=2.0, 
            mode='fan_in',
            distribution='truncated_normal')
        # better api to reduce clutter
        l_1 = tf.layers.dense(
            x,
            10,
            activation=activation_fn,
            kernel_initializer=initializer)
        l_2 = tf.layers.dense(
            l_1,
            10,
            activation=activation_fn,
            kernel_initializer=initializer)
        l_3 = tf.layers.dense(
            l_2,
            5,
            activation=activation_fn,
            kernel_initializer=initializer)
        y_logits = tf.layers.dense(
            l_3,
            2,
            activation=None,
            kernel_initializer=initializer)
    
        y_ = tf.nn.softmax(y_logits)
        # much better loss function for classification
        loss = tf.reduce_mean(
            tf.losses.sparse_softmax_cross_entropy(
                labels=y, 
                logits=y_logits))
        # much better default optimizer for new problems
        # good learning rate, but probably can tune
        optimizer = tf.train.AdamOptimizer(
            learning_rate=0.01)
    # separate train op for easier calling
        train_op = optimizer.minimize(loss)
    
        # tell tensorflow not to allocate all gpu memory at start
        config = tf.ConfigProto()
        config.gpu_options.allow_growth=True
        with tf.Session(config=config) as sess:
    
            sess.run(tf.global_variables_initializer())
    
            print('Training:')
    
            for step in tqdm(range(5000)):
                index = np.random.randint(0, len(x_train) - 129)
                feed_dict = {x : x_train[index:index+128], 
                             y : y_train[index:index+128]}
                # can train and get loss in single run, much more efficient
                _, b_loss = sess.run([train_op, loss], feed_dict=feed_dict)
                if step % 1000 == 0:
                    print(b_loss)
    
            while True:
                inp1 = int(input('Enter R pixel color: '))
                inp2 = int(input('Enter G pixel color: '))
                inp3 = int(input('Enter B pixel color: '))
                # scale to model train range [-1, 1]
                model_input = (2 * (np.array([inp1, inp2, inp3], dtype=float) / 255.0)) - 1
                if (model_input >= -1).all() and (model_input <= 1).all():
                    # y_ is now two probabilities (white_prob, black_prob) but they will sum to 1.
                    white_prob, black_prob = sess.run(y_, feed_dict={x : [model_input]})[0]
                    print('White prob: {:.2f} Black prob: {:.2f}'.format(white_prob, black_prob))
                else:
                    print('Values not within [0, 255]!')
    
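Here is the output from a training run: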
    Creating Datasets:
    2018-10-05 00:50:59.156822: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
    2018-10-05 00:50:59.411003: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1405] Found device 0 with properties:
    name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
    pciBusID: 0000:03:00.0
    totalMemory: 8.00GiB freeMemory: 6.60GiB
    2018-10-05 00:50:59.417736: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1484] Adding visible gpu devices: 0
    2018-10-05 00:51:00.109351: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
    2018-10-05 00:51:00.113660: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:971]      0
    2018-10-05 00:51:00.118545: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:984] 0:   N
    2018-10-05 00:51:00.121605: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6370 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:03:00.0, compute capability: 6.1)
    Training:
      0%|                                                                                         | 0/5000 [00:00<?, ?it/s]0.6222609
     19%|██████████████▋                                                               | 940/5000 [00:01<00:14, 275.57it/s]0.013466636
     39%|██████████████████████████████                                               | 1951/5000 [00:02<00:04, 708.07it/s]0.0067519126
     59%|█████████████████████████████████████████████▊                               | 2971/5000 [00:04<00:02, 733.24it/s]0.0028143923
     79%|████████████████████████████████████████████████████████████▌                | 3935/5000 [00:05<00:01, 726.36it/s]0.0073514087
    100%|█████████████████████████████████████████████████████████████████████████████| 5000/5000 [00:07<00:00, 698.32it/s]
    Enter R pixel color: 1
    Enter G pixel color: 1
    Enter B pixel color: 1
    White prob: 1.00 Black prob: 0.00
    Enter R pixel color: 255
    Enter G pixel color: 255
    Enter B pixel color: 255
    White prob: 0.00 Black prob: 1.00
    Enter R pixel color: 128
    Enter G pixel color: 128
    Enter B pixel color: 128
    White prob: 0.08 Black prob: 0.92
    Enter R pixel color: 126
    Enter G pixel color: 126
    Enter B pixel color: 126
    White prob: 0.99 Black prob: 0.01
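Note that these outputs line up with the labeling rule in the code: the decision boundary sits at an average value of 127.5, so (128, 128, 128) lands just on the black-font side while (126, 126, 126) lands on the white-font side.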