Python 3.x: Neural network for linear regression using TensorFlow

python-3.x tensorflow neural-network

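I have just started learning TensorFlow and am implementing a neural network for linear regression. I have been following some online tutorials to write the code. I am not using any activation function, and my cost is the MSE, tf.reduce_sum(tf.square(output_layer - y)). When I run the code, I get NaN as the prediction accuracy. The code I used is given below.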

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

# Placeholders
X = tf.placeholder("float", shape=[None, x_size])
y = tf.placeholder("float")

w_1 = tf.Variable(tf.random_normal([x_size, 1], seed=seed))

output_layer = tf.matmul(X, w_1)
predict = output_layer

cost = tf.reduce_sum(tf.square(output_layer - y))
optimizer =  tf.train.GradientDescentOptimizer(0.0001).minimize(cost)

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)


for epoch in range(100):
        # Train with each example
        for i in range(len(train_X)):
            sess.run(optimizer, feed_dict={X: train_X[i: i + 1], y: train_y[i: i + 1]})

            train_accuracy = np.mean(sess.run(predict, feed_dict={X: train_X, y: train_y}))
            test_accuracy  = np.mean(sess.run(predict, feed_dict={X: test_X, y: test_y}))

            print("Epoch = %d, train accuracy = %.2f%%, test accuracy = %.2f%%"
            % (epoch + 1, 100. * train_accuracy, 100. * test_accuracy))


sess.close()

A sample output is given below:

Epoch = 1, train accuracy = -2643642714558682640372224491520000.000000%, test accuracy = -2683751730046365038353121175142400.000000%
Epoch = 1, train accuracy = 161895895004931631079134808611225600.000000%, test accuracy = 165095877160981392686228427295948800.000000%
Epoch = 1, train accuracy = -18669546053716288450687958380235980800.000000%, test accuracy = -19281734142647757560839513130087219200.000000%
Epoch = 1, train accuracy = inf%, test accuracy = inf%
Epoch = 1, train accuracy = nan%, test accuracy = nan%

Any help is appreciated. Also, if you could share some debugging tips, that would be great.

Thanks.

Note: when I run a few single-example batches by hand, the predicted value becomes far too large:

sess.run(optimizer, feed_dict={X: train_X[0:1], y: train_y[0:1]})
sess.run(optimizer, feed_dict={X: train_X[1:2], y: train_y[1:2]})
sess.run(optimizer, feed_dict={X: train_X[2:3], y: train_y[2:3]})
print(sess.run(predict, feed_dict={X: train_X[3:4], y: train_y[3:4]}))

Output:

[[  1.64660544e+08]]

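Note: when I reduce the learning rate to a very small value (1e-8), it more or less works. However, a higher learning rate worked fine when I ran a regression on the same dataset. So is the high learning rate the problem here?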

cost = tf.reduce_sum(tf.square(output_layer - y))

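On this line, you are summing every element of the tensor, and that tensor holds the squared differences for the whole batch.

This is fine if your batch has size 1 (stochastic gradient descent). But since you want to do mini-batch gradient descent (batch size > 1), you want to minimize the average error over the batch.

So the function you want to minimize is: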

cost = tf.reduce_mean(tf.square(output_layer - y))

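tf.reduce_mean computes the mean of the elements of its input.

With a batch size of 1, this formula behaves exactly like the one you were using before, but with a batch size greater than 1 it computes the mean squared error over the batch, which is what you want.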
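To make the difference concrete, here is a minimal sketch in plain Python rather than TensorFlow. The helper names and the sample values are hypothetical and only stand in for tf.reduce_sum(tf.square(...)) and tf.reduce_mean(tf.square(...)) applied to a one-dimensional batch.

```python
def sum_squared_error(predictions, targets):
    """Sum of squared differences, analogous to tf.reduce_sum(tf.square(...))."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets))


def mean_squared_error(predictions, targets):
    """Mean of squared differences, analogous to tf.reduce_mean(tf.square(...))."""
    return sum_squared_error(predictions, targets) / len(predictions)


# Hypothetical values, chosen only for illustration.
single_pred, single_target = [2.0], [5.0]
batch_pred, batch_target = [2.0, 4.0, 6.0, 8.0], [5.0, 5.0, 5.0, 5.0]

# Batch size 1: the two costs are identical.
sse_single = sum_squared_error(single_pred, single_target)
mse_single = mean_squared_error(single_pred, single_target)

# Batch size 4: the summed cost is 4 times the mean cost,
# so its gradient is 4 times larger as well.
sse_batch = sum_squared_error(batch_pred, batch_target)
mse_batch = mean_squared_error(batch_pred, batch_target)

print(sse_single, mse_single)  # 9.0 9.0
print(sse_batch, mse_batch)    # 20.0 5.0
```

Because the summed cost scales with the batch size, a learning rate tuned for one batch size will generally not carry over to another when the sum is used; the mean keeps the gradient scale comparable.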

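I am running with a batch size of 1. In any case, I tried the change you suggested, but the predicted values still become too large. Is there a mistake in the code I wrote?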
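One possible explanation for the comment above, offered as an assumption rather than a diagnosis because the dataset is not shown: with a squared-error cost and a batch of one example, each gradient step multiplies the error on that example by (1 - 2 * lr * ||x||^2). If the features are large enough that lr * ||x||^2 exceeds 1, the error grows on every step no matter whether the cost uses a sum or a mean. The sketch below uses plain Python with a single weight and made-up data (the function name, the values of xs and ys, and the learning rates are all hypothetical) to show that effect, and how rescaling the feature avoids it.

```python
def sgd_single_weight(xs, ys, learning_rate, epochs):
    """Per-example gradient descent on cost = (x * w - y) ** 2 with one weight."""
    w = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            gradient = 2.0 * (x * w - y) * x
            w -= learning_rate * gradient
    return w


# Hypothetical data following y = 3 * x, with large unscaled feature values.
xs = [1000.0, 2000.0, 3000.0]
ys = [3.0 * x for x in xs]

# lr * x**2 is at least 1e-4 * 1e6 = 100 > 1, so each update amplifies the error.
w_diverged = sgd_single_weight(xs, ys, learning_rate=1e-4, epochs=1)

# lr * x**2 is at most 1e-8 * 9e6 = 0.09 < 1, so the updates are stable.
w_small_lr = sgd_single_weight(xs, ys, learning_rate=1e-8, epochs=100)

# Rescaling the feature to [0, 1] keeps lr * x**2 <= 0.1 even with lr = 0.1.
scale = max(xs)
xs_scaled = [x / scale for x in xs]
w_scaled = sgd_single_weight(xs_scaled, ys, learning_rate=0.1, epochs=200)

print(w_diverged)        # very large magnitude after only three updates
print(w_small_lr)        # close to 3
print(w_scaled / scale)  # close to 3 once mapped back to the original scale
```

If the real features are of similar magnitude, standardizing them before training (or lowering the learning rate, as observed in the question) would be the first thing to try. Printing the cost and the weights after each of the first few updates is a simple way to confirm whether this is what is happening.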