Machine learning: Mean squared error does not decrease with the number of epochs?

This is an implementation of batch gradient descent using TensorFlow.

When I run this code, the MSE stays the same:

import tensorflow as tf
from sklearn.preprocessing import StandardScaler
import numpy as np
from sklearn.datasets import fetch_california_housing

housing=fetch_california_housing()

std=StandardScaler()
scaled_housing_data=std.fit_transform(housing.data)

m,n=scaled_housing_data.shape
scaled_housing_data.shape

scaled_housing_data_with_bias=np.c_[np.ones((m,1)),scaled_housing_data]

n_epochs=1000
n_learning_rate=0.01

x=tf.constant(scaled_housing_data_with_bias,dtype=tf.float32)
y=tf.constant(housing.target.reshape(-1,1),dtype=tf.float32)
theta=tf.Variable(tf.random_uniform([n+1,1],-1.0,1.0,seed=42))
y_pred=tf.matmul(x,theta)

error=y_pred-y
mse=tf.reduce_mean(tf.square(error))
gradients=2/m*tf.matmul(tf.transpose(x),error)

training_op=tf.assign(theta,theta-n_learning_rate*gradients)

init=tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())
        sess.run(training_op)

    best_theta = theta.eval()

Output:

('Epoch', 0, 'MSE =', 2.7544272)
('Epoch', 100, 'MSE =', 2.7544272)
('Epoch', 200, 'MSE =', 2.7544272)
('Epoch', 300, 'MSE =', 2.7544272)
('Epoch', 400, 'MSE =', 2.7544272)
('Epoch', 500, 'MSE =', 2.7544272)
('Epoch', 600, 'MSE =', 2.7544272)
('Epoch', 700, 'MSE =', 2.7544272)
('Epoch', 800, 'MSE =', 2.7544272)
('Epoch', 900, 'MSE =', 2.7544272)

No matter what I do, the mean squared error (MSE) stays the same.
Please help.

Maybe you should try again. I simply copied your code and ran it, and the loss decreased correctly.
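
One possible explanation for the differing results, offered as an assumption rather than something either poster confirmed: the tuple-style `print` output in the question suggests Python 2, while a run where the loss decreases is consistent with Python 3. The two versions evaluate `2/m` differently when `m` is an integer. The sketch below mimics both behaviours under Python 3; the variable names are made up for illustration, and `m = 20640` is the row count of the California housing data.

```python
# m is the number of training rows (20640 for the California housing data).
m = 20640

# Python 2 evaluates `2/m` on two ints as floor division; `//` reproduces that here.
py2_style_scale = 2 // m

# Python 3 evaluates `2/m` as true division.
py3_style_scale = 2 / m

print(py2_style_scale)      # 0
print(py3_style_scale > 0)  # True
```

Under Python 2, placing `from __future__ import division` at the top of the file also gives `/` the true-division behaviour.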


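If the MSE stays the same, theta is not being updated, which means the gradient is zero. In Python 2, 2/m with an integer m is integer division and evaluates to 0, so the whole gradient expression is multiplied by zero. Change this line and check: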

gradients=2.0/m*tf.matmul(tf.transpose(x),error) # integer division (2/m) causes zero
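
To see the effect of the scale factor in isolation, here is a minimal pure-Python sketch of batch gradient descent on a made-up four-point dataset. It is not the TensorFlow code from the question: the helper name `train`, the data, and the learning rate are all assumptions chosen for illustration. The update follows the same formula as the question, gradient = scale * X^T (X theta - y), where scale is meant to be 2/m.

```python
# Toy batch gradient descent for y = theta0 + theta1 * x.
# `train`, the data, and the hyperparameters are hypothetical.

def train(scale, n_epochs=200, learning_rate=0.1):
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 1 + 2x
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    history = []  # MSE recorded before each update
    for _ in range(n_epochs):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        history.append(sum(e * e for e in errors) / m)
        # Gradient of the MSE with respect to each parameter, times `scale`.
        grad0 = scale * sum(errors)
        grad1 = scale * sum(e * x for e, x in zip(errors, xs))
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return history

m = 4
frozen = train(scale=2 // m)     # floor division, as Python 2 does for 2/m: scale is 0
learning = train(scale=2.0 / m)  # float division: scale is 0.5

print(frozen[0], frozen[-1])     # 21.0 21.0
print(learning[0], learning[-1])
```

With a scale of 0 the parameters never move, so the recorded MSE is identical at every epoch, which matches the symptom in the question. With 2.0/m the MSE shrinks toward zero on this toy data.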