Python binary classifier always returns 0.5
I am training a classifier that takes an RGB input (three values from 0 to 255) and returns whether a black or a white font (0 or 1) best suits that color. After training, my classifier always returns 0.5 (or roughly 0.5) and never gets any more accurate than that. The code is below:
import tensorflow as tf
import numpy as np
from tqdm import tqdm

print('Creating Datasets:')
x_train = []
y_train = []
for i in tqdm(range(10000)):
    x_train.append([np.random.uniform(0, 255), np.random.uniform(0, 255), np.random.uniform(0, 255)])
for elem in tqdm(x_train):
    if (((elem[0] + elem[1] + elem[2]) / 3) / 255) > 0.5:
        y_train.append(0)
    else:
        y_train.append(1)
x_train = np.array(x_train)
y_train = np.array(y_train)

graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.float32)
    y = tf.placeholder(tf.float32)
    w_1 = tf.Variable(tf.random_normal([3, 10], stddev=1.0), tf.float32)
    b_1 = tf.Variable(tf.random_normal([10]), tf.float32)
    l_1 = tf.sigmoid(tf.matmul(x, w_1) + b_1)
    w_2 = tf.Variable(tf.random_normal([10, 10], stddev=1.0), tf.float32)
    b_2 = tf.Variable(tf.random_normal([10]), tf.float32)
    l_2 = tf.sigmoid(tf.matmul(l_1, w_2) + b_2)
    w_3 = tf.Variable(tf.random_normal([10, 5], stddev=1.0), tf.float32)
    b_3 = tf.Variable(tf.random_normal([5]), tf.float32)
    l_3 = tf.sigmoid(tf.matmul(l_2, w_3) + b_3)
    w_4 = tf.Variable(tf.random_normal([5, 1], stddev=1.0), tf.float32)
    b_4 = tf.Variable(tf.random_normal([1]), tf.float32)
    y_ = tf.sigmoid(tf.matmul(l_3, w_4) + b_4)
    loss = tf.reduce_mean(tf.squared_difference(y, y_))
    optimizer = tf.train.AdadeltaOptimizer().minimize(loss)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print('Training:')
        for step in tqdm(range(5000)):
            index = np.random.randint(0, len(x_train) - 129)
            feed_dict = {x : x_train[index:index+128], y : y_train[index:index+128]}
            sess.run(optimizer, feed_dict=feed_dict)
            if step % 1000 == 0:
                print(sess.run([loss], feed_dict=feed_dict))
        while True:
            inp1 = int(input(''))
            inp2 = int(input(''))
            inp3 = int(input(''))
            print(sess.run(y_, feed_dict={x : [[inp1, inp2, inp3]]}))
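As you can see, I start by importing the modules I will use. Next, I generate the input dataset x_train and the desired output dataset y_train. x_train consists of 10000 random RGB values, while y_train consists of 0s and 1s: 1 corresponds to RGB values whose mean is below 128, and 0 to RGB values whose mean is above 128 (this ensures that bright backgrounds get dark fonts and vice versa).
Admittedly, my neural network is overly complex (or so I think), but as far as I know it is a very standard feed-forward network with an Adadelta optimizer and the default learning rate.
To the best of my limited knowledge, the network trains normally, yet the model always outputs 0.5.
The last block of code lets the user enter values and see what they turn into when passed through the neural network.
I have experimented with different activation functions, losses, ways of initializing the biases, and so on, all to no avail. Sometimes when I tinker with the code, the model always returns 1 or always returns 0, but that is just as inaccurate as being indecisive and returning 0.5 over and over. I have not been able to find a suitable solution online. Any comments or suggestions are welcome.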
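Edit:
During training, the loss, weights, biases, and output change very little (the weights and biases change only by hundredths and thousandths every 1000 iterations, and the loss fluctuates around 0.3). Also, the output sometimes varies with the input (as you would expect), but at other times it is constant. One run of the program produced a constant 0.7 as output, while another always returned 0.5, except very close to zero, where it returned values around 0.3 or 0.4. Neither is the desired output. What should happen is that (255, 255, 255) maps to 0, (0, 0, 0) maps to 1, and (128, 128, 128) maps to either 1 or 0, since the font color does not really matter in the middle.

Two things I see from your network: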
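1. Use tf.nn.relu rather than sigmoid as the activation in the hidden layers.
2. Sigmoid in the output layer is fine (if you want your output to be 0/1), but consider using cross-entropy instead of squared error as the loss function.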
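The point about cross-entropy versus squared error can be checked numerically. The following is a minimal sketch in plain Python, not code from the question or the answers: the helper names `mse_loss_and_grad` and `ce_loss_and_grad` are made up for illustration, and the example assumes a single sigmoid output unit with target 1. It compares the gradient of each loss with respect to the pre-activation value z. With squared error, the gradient carries an extra factor equal to the sigmoid's slope, p * (1 - p), which shrinks toward zero exactly when the prediction is confidently wrong.

```python
import math

def sigmoid(z):
    # tanh form of the logistic function; avoids overflow for large |z|
    return 0.5 * (1.0 + math.tanh(0.5 * z))

def mse_loss_and_grad(z, target):
    """Squared error on a sigmoid output, and its gradient w.r.t. z."""
    p = sigmoid(z)
    loss = (p - target) ** 2
    # chain rule: d/dz (p - t)^2 = 2 (p - t) * p (1 - p)
    grad = 2.0 * (p - target) * p * (1.0 - p)
    return loss, grad

def ce_loss_and_grad(z, target):
    """Binary cross-entropy on a sigmoid output, and its gradient w.r.t. z."""
    p = sigmoid(z)
    loss = -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
    # the sigmoid slope cancels, leaving just the prediction error
    grad = p - target
    return loss, grad

target = 1.0
for z in (-6.0, -3.0, 0.0, 3.0):
    mse, mse_grad = mse_loss_and_grad(z, target)
    ce, ce_grad = ce_loss_and_grad(z, target)
    print('z={:5.1f}  p={:.4f}  MSE={:.4f} (grad {:.4f})  CE={:.4f} (grad {:.4f})'.format(
        z, sigmoid(z), mse, mse_grad, ce, ce_grad))
```

For strongly negative z (a confident but wrong prediction), the squared-error gradient is close to zero while the cross-entropy gradient stays close to -1, which is one way a network trained with squared error can stall.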
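It is hard to say whether this will solve your problem, but I believe both are things you need to change in your network as it stands.

The biggest problem is that you are using mean squared error as the loss function for a classification problem. A cross-entropy loss is much better suited to this kind of problem.

Here is a visualization of the difference between the cross-entropy loss and the mean squared error loss:

[Figure: cross-entropy loss compared with mean squared error loss. Source: link missing]

Note how the loss keeps growing as the model's prediction moves further from the correct one (1 in this case). This curvature provides a much stronger gradient signal during backpropagation, and it also satisfies many important theoretical properties of a distance (divergence) between probability distributions. By minimizing the cross-entropy loss, you are in fact also minimizing the KL divergence between the model's predicted distribution and the label distribution of the training data. You can read more about the cross-entropy loss function here: [link missing]

I also tweaked a few other things to make the code cleaner and the model easier to modify. This should solve all of your issues: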
import tensorflow as tf
import numpy as np
from tqdm import tqdm

# define a random seed for (somewhat) reproducible results:
seed = 0
np.random.seed(seed)

print('Creating Datasets:')
# much faster dataset creation
x_train = np.random.uniform(low=0, high=255, size=[10000, 3])
# easier label creation
# if the average color is greater than half the color space then use black, otherwise use white
# classes:
# white = 0
# black = 1
y_train = ((np.mean(x_train, axis=1) / 255.0) > 0.5).astype(int)
# now transform dataset to be within range [-1, 1] instead of [0, 255]
# for numeric stability and quicker model training
x_train = (2 * (x_train / 255)) - 1

graph = tf.Graph()
with graph.as_default():
    # must do this within graph scope
    tf.set_random_seed(seed)
    # specify input dims for clarity
    x = tf.placeholder(tf.float32, shape=[None, 3])
    # y is now integer label [0 or 1]
    y = tf.placeholder(tf.int32, shape=[None])
    # use relu, usually better than sigmoid
    activation_fn = tf.nn.relu
    # from https://arxiv.org/abs/1502.01852v1
    initializer = tf.initializers.variance_scaling(
        scale=2.0,
        mode='fan_in',
        distribution='truncated_normal')
    # better api to reduce clutter
    l_1 = tf.layers.dense(
        x,
        10,
        activation=activation_fn,
        kernel_initializer=initializer)
    l_2 = tf.layers.dense(
        l_1,
        10,
        activation=activation_fn,
        kernel_initializer=initializer)
    l_3 = tf.layers.dense(
        l_2,
        5,
        activation=activation_fn,
        kernel_initializer=initializer)
    y_logits = tf.layers.dense(
        l_3,
        2,
        activation=None,
        kernel_initializer=initializer)
    y_ = tf.nn.softmax(y_logits)
    # much better loss function for classification
    loss = tf.reduce_mean(
        tf.losses.sparse_softmax_cross_entropy(
            labels=y,
            logits=y_logits))
    # much better default optimizer for new problems
    # good learning rate, but probably can tune
    optimizer = tf.train.AdamOptimizer(
        learning_rate=0.01)
    # separate train op for easier calling
    train_op = optimizer.minimize(loss)

    # tell tensorflow not to allocate all gpu memory at start
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    with tf.Session(config=config) as sess:
        sess.run(tf.global_variables_initializer())
        print('Training:')
        for step in tqdm(range(5000)):
            index = np.random.randint(0, len(x_train) - 129)
            feed_dict = {x : x_train[index:index+128],
                         y : y_train[index:index+128]}
            # can train and get loss in single run, much more efficient
            _, b_loss = sess.run([train_op, loss], feed_dict=feed_dict)
            if step % 1000 == 0:
                print(b_loss)
        while True:
            inp1 = int(input('Enter R pixel color: '))
            inp2 = int(input('Enter G pixel color: '))
            inp3 = int(input('Enter B pixel color: '))
            # scale to model train range [-1, 1]
            model_input = (2 * (np.array([inp1, inp2, inp3], dtype=float) / 255.0)) - 1
            if (model_input >= -1).all() and (model_input <= 1).all():
                # y_ is now two probabilities (white_prob, black_prob) but they will sum to 1.
                white_prob, black_prob = sess.run(y_, feed_dict={x : [model_input]})[0]
                print('White prob: {:.2f} Black prob: {:.2f}'.format(white_prob, black_prob))
            else:
                print('Values not within [0, 255]!')
Creating Datasets:
2018-10-05 00:50:59.156822: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-10-05 00:50:59.411003: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1405] Found device 0 with properties:
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
pciBusID: 0000:03:00.0
totalMemory: 8.00GiB freeMemory: 6.60GiB
2018-10-05 00:50:59.417736: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1484] Adding visible gpu devices: 0
2018-10-05 00:51:00.109351: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-10-05 00:51:00.113660: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:971] 0
2018-10-05 00:51:00.118545: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:984] 0: N
2018-10-05 00:51:00.121605: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6370 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:03:00.0, compute capability: 6.1)
Training:
0%| | 0/5000 [00:00<?, ?it/s]0.6222609
19%|██████████████▋ | 940/5000 [00:01<00:14, 275.57it/s]0.013466636
39%|██████████████████████████████ | 1951/5000 [00:02<00:04, 708.07it/s]0.0067519126
59%|█████████████████████████████████████████████▊ | 2971/5000 [00:04<00:02, 733.24it/s]0.0028143923
79%|████████████████████████████████████████████████████████████▌ | 3935/5000 [00:05<00:01, 726.36it/s]0.0073514087
100%|█████████████████████████████████████████████████████████████████████████████| 5000/5000 [00:07<00:00, 698.32it/s]
Enter R pixel color: 1
Enter G pixel color: 1
Enter B pixel color: 1
White prob: 1.00 Black prob: 0.00
Enter R pixel color: 255
Enter G pixel color: 255
Enter B pixel color: 255
White prob: 0.00 Black prob: 1.00
Enter R pixel color: 128
Enter G pixel color: 128
Enter B pixel color: 128
White prob: 0.08 Black prob: 0.92
Enter R pixel color: 126
Enter G pixel color: 126
Enter B pixel color: 126
White prob: 0.99 Black prob: 0.01
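
The answer's network ends in two logits passed through tf.nn.softmax, so it reports a white and a black probability that sum to 1. For two classes this is mathematically the same as a single sigmoid applied to the difference of the two logits, so moving from one sigmoid output to two softmax outputs does not by itself change what the model can represent. The sketch below checks that identity in plain Python. The logit values and the names `white_logit` and `black_logit` are invented for illustration and do not come from a trained model.

```python
import math

def softmax(logits):
    # subtract the max before exponentiating for numerical stability
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    # tanh form of the logistic function; avoids overflow for large |z|
    return 0.5 * (1.0 + math.tanh(0.5 * z))

# hypothetical (white_logit, black_logit) pairs; class order follows the answer:
# index 0 = white, index 1 = black
for white_logit, black_logit in [(2.0, -1.0), (0.0, 0.0), (-3.5, 4.0)]:
    white_prob, black_prob = softmax([white_logit, black_logit])
    via_sigmoid = sigmoid(black_logit - white_logit)
    print('White prob: {:.4f} Black prob: {:.4f} sigmoid(black - white): {:.4f}'.format(
        white_prob, black_prob, via_sigmoid))
```

In each row the black probability from the softmax matches the sigmoid of the logit difference, and the two softmax probabilities sum to 1.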