Python: Linear regression using TensorFlow

python pandas numpy tensorflow

I followed a tutorial, which states:

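In the case of linear regression, the hypothesis is a straight line, i.e. h(x) = wx + b, where w is a vector called the weight and b is a scalar called the bias. The weight and bias are called the parameters of the model. All we need to do is estimate the values of w and b from the given dataset such that the resulting hypothesis produces the least cost J, which is defined by the following cost function:

J(w, b) = (1 / (2m)) * sum over i = 1..m of (h(x_i) - y_i)^2

where m is the number of data points in the given dataset. This cost function is also called the mean squared error.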
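As a rough illustration of the hypothesis and cost function quoted above, here is a minimal NumPy sketch. The function names `hypothesis` and `cost` are my own and do not come from the tutorial; the three sample points are the first three rows of the CSV shown further down.

```python
import numpy as np

def hypothesis(x, w, b):
    # Straight-line hypothesis h(x) = w*x + b
    return w * x + b

def cost(x, y, w, b):
    # J(w, b) = 1/(2m) * sum((h(x_i) - y_i)^2), with m data points
    m = len(x)
    return np.sum((hypothesis(x, w, b) - y) ** 2) / (2 * m)

# First three (DateNumeric, Prices) rows from the CSV
x = np.array([1000.0, 1120.0, 1180.0])
y = np.array([83.75, 86.47, 89.21])

print(cost(x, y, w=0.0, b=0.0))
```

With w = 0 and b = 0 the cost is simply the sum of the squared prices divided by 2m, which gives a baseline to compare against during training.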

My CSV file is:

Date,Prices,DateNumeric
30/09/20,83.75,1000
30/12/20,86.47,1120
01/02/21,89.21,1180
01/03/21,94.22,1210
01/04/21,93.59,1240
01/05/21,93.43,1270
07/05/21,94.3,1276
10/05/21,94.57,1279
11/05/21,94.85,1280
12/05/21,95.11,1281
14/05/21,95.41,1283
16/05/21,95.66,1285
18/05/21,95.94,1287
21/05/21,96.14,1290

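I want to predict the price of the item, which is the Prices column. The problem is that the data is non-linear, non-continuous, and non-periodic, so I converted the dates to integers in the DateNumeric column.

Here, the value for 30/09/20 is taken as 1000 (the initial value), and the value for 30/12/20 is taken as 1120 because it falls 120 days (3 months) after the previous date.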
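For comparison, a date-to-integer conversion like this could also be computed rather than typed in by hand. The sketch below is an assumption on my part, not how the DateNumeric column above was produced: it uses exact calendar-day offsets from the first date, so its values differ from the hand-entered ones (for example 1091 rather than 1120 for 30/12/20).

```python
import pandas as pd

# First three dates from the CSV, in day/month/two-digit-year format
dates = pd.Series(["30/09/20", "30/12/20", "01/02/21"])
parsed = pd.to_datetime(dates, format="%d/%m/%y")

# Days elapsed since the first date, offset so the first row is 1000
date_numeric = 1000 + (parsed - parsed.iloc[0]).dt.days

print(date_numeric.tolist())  # [1000, 1091, 1124]
```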

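# TensorFlow 1.x API (placeholders and sessions)
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

learning_rate = 0.01
epochs = 200
n_samples = 30

h = pd.read_csv('untitled.csv')
# h.shape
h.head(10)

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

x_train, x_test, y_train, y_test = train_test_split(h.Prices, h.DateNumeric, test_size=0.2)
print(x_train)
# plt.plot(x_train, y_train)
# plt.plot(h.Prices, h.DateNumeric, 'o')
plt.plot(h.Prices, h.DateNumeric)
plt.show()

X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
w = tf.Variable(np.random.randn(), name='weight')
b = tf.Variable(np.random.randn(), name='bias')
print(b.value())

prediction = tf.add(tf.multiply(X, w), b)
cost = tf.reduce_sum((prediction - Y) ** 2 / (2 * n_samples))
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    # The number of passes over the data is determined by epochs
    for epoch in range(epochs):
        for x, y in zip(h.DateNumeric, h.Prices):
            # for x, y in zip(h.Prices, h.DateNumeric):
            sess.run(optimizer, feed_dict={X: x, Y: y})
            if (epoch % 20) == 0:
                c = sess.run(cost, feed_dict={X: h.DateNumeric, Y: h.Prices})
                W = sess.run(w)
                B = sess.run(b)
                # print("cost, w, b", cost, " ", w, " ", b)
                print("cost:{} w:{} b:{}".format(c, W, B))
    weight = sess.run(w)
    bias = sess.run(b)

# plt.plot(x_train, y_train, 'o')
plt.plot(h.DateNumeric, h.Prices, 'o')
plt.plot(h.DateNumeric, weight * h.DateNumeric + bias)
# plt.plot(x_train, weight * x_train + bias)
plt.show()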

I am unable to get correct values for the cost, weight, and bias. The values displayed are:

cost:40475074560.0 w:336.76763916015625 b:0.42306655645370483
cost:7042602293526528.0 w:-140444.890625 b:-125.27484130859375
cost:1.510587957600892e+21 w:65044804.0 b:55116.4609375
cost:3.583142054672409e+26 w:-31679014912.0 b:-26179644.0
cost:9.375889959585844e+31 w:16204882247680.0 b:13067821056.0
cost:2.7000283394741965e+37 w:-8696085782462464.0 b:-6847003623424.0
cost:inf w:4.710892753578361e+18 b:3691890536873984.0
cost:inf w:-2.5640481681247033e+21 b:-2.0047201720315412e+18
cost:inf w:1.3977489977961294e+24 b:1.0919898423421053e+21
cost:inf w:-7.631533130976661e+26 b:-5.95747219549254e+23
cost:inf w:4.1797650628306485e+29 b:3.257796829444394e+26
cost:inf w:-2.296399276256281e+32 b:-1.7870760811687127e+29
cost:inf w:1.2655991850303288e+35 b:9.833687923238823e+31
cost:inf w:-inf b:-5.432245785997481e+34
cost:nan w:nan b:nan
cost:nan w:nan b:nan
cost:nan w:nan b:nan
cost:nan w:nan b:nan
cost:nan w:nan b:nan
cost:nan w:nan b:nan
cost:nan w:nan b:nan
cost:nan w:nan b:nan
cost:nan w:nan b:nan
cost:nan w:nan b:nan
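One possible reading of this output, offered as a sketch rather than a confirmed diagnosis: for a single sample, the gradient step on this cost multiplies w by roughly 1 - learning_rate * x^2 / n_samples. With x around 1000 to 1290 that factor is in the hundreds, which is consistent with w changing sign and growing by a factor of roughly 400 to 500 between printed lines. The NumPy code below is not the TensorFlow code from the question. It re-implements the same per-sample update (the helper `sgd` is a hypothetical name of mine, and it starts from w = b = 0 instead of random values) and compares raw inputs with standardized inputs.

```python
import numpy as np

x = np.array([1000, 1120, 1180, 1210, 1240, 1270, 1276,
              1279, 1280, 1281, 1283, 1285, 1287, 1290], dtype=float)
y = np.array([83.75, 86.47, 89.21, 94.22, 93.59, 93.43, 94.3,
              94.57, 94.85, 95.11, 95.41, 95.66, 95.94, 96.14])

learning_rate = 0.01
n_samples = 30  # same constant as in the question

def sgd(x, y, epochs):
    # Per-sample gradient descent on (w*x + b - y)^2 / (2 * n_samples)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for xi, yi in zip(x, y):
            err = w * xi + b - yi
            w -= learning_rate * err * xi / n_samples
            b -= learning_rate * err / n_samples
    return w, b

# Per-step multiplier on w for the first raw input; stability needs |factor| < 1
print(1 - learning_rate * x[0] ** 2 / n_samples)  # about -332.3

# Raw inputs: the parameters overflow to inf and then become nan
with np.errstate(over="ignore", invalid="ignore"):
    w_raw, b_raw = sgd(x, y, epochs=200)
print(w_raw, b_raw)

# Standardized inputs (zero mean, unit variance): the parameters stay finite
x_std = (x - x.mean()) / x.std()
w_std, b_std = sgd(x_std, y, epochs=200)
print(w_std, b_std)
```

If this reading is right, either rescaling DateNumeric before training or using a much smaller learning rate would keep the updates bounded. With standardized inputs, the fitted line has to be evaluated on inputs transformed the same way.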