Python: calculating each point's deviation from the line y=x for bivariate data


As shown in the figure below, I have bivariate data that, in the ideal case, would fall on the line y=x. In Python, how can I calculate each point's deviation from that line (y=x)? Is it also possible to quantify the average deviation from that line? I just want a way to quantify how my data departs from a 1:1 relationship. Any suggestions would be appreciated. The data is in a pandas DataFrame. Thanks.
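Since the data lives in a pandas DataFrame, one direct way is to subtract one column from the other: for the line y=x, the per-point deviation is simply y minus x. A minimal sketch, assuming two hypothetical columns named `a` and `b` (the actual column names will differ):

```python
import pandas as pd

# hypothetical bivariate data; ideally column b would equal column a
df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0],
                   "b": [1.1, 1.8, 3.4, 3.9]})

# signed deviation of each point from the line y = x
df["deviation"] = df["b"] - df["a"]

# two common summaries of the average deviation from the 1:1 line
mad = df["deviation"].abs().mean()            # mean absolute deviation
rmse = (df["deviation"] ** 2).mean() ** 0.5   # root-mean-square deviation
print(mad, rmse)
```

The signed deviation tells you whether a point sits above or below the line; the mean absolute deviation or RMSE gives a single number summarizing how far the data is from the 1:1 relationship.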


This code calculates each point's deviation from both the regression line and the line y=x, and also plots the standard deviation band along with the regression line and the y=x line.

import numpy as np
import matplotlib.pyplot as plt
import statistics as stat
from sklearn.linear_model import LinearRegression


#Set the x and y values
x = np.random.rand(50)
y = 2*x - 1 + np.random.rand(50)


"""
calculate the deviation from y=x at each point
"""

#points on the line y=x, used for plotting it below
xp = np.linspace(0, 1, 50)
yp = xp

#signed deviation of each data point from the line y=x
deviationxy = y - x

#split the deviations into points above and below the line
listpos = [i for i in deviationxy if i > 0]
listneg = [i for i in deviationxy if i < 0]

#Calculate the ratio of the points
if len(listpos) == len(listneg):
    print("The ratio is 1:1")
else:
    above = (len(listpos)/len(deviationxy))*100
    below = (len(listneg)/len(deviationxy))*100
    print("{0}% of the values are above the line y=x ; {1}% of the values are below the line".format(above, below))


"""
Implement the regression
"""

#coerce the x values into the shape [n_samples, n_features]
X = x[:, np.newaxis]


#instantiate the model
model = LinearRegression(fit_intercept=True)

#fit the model
model.fit(X, y)


#plot the points and the regression function as well as the line y=x
fig, ax = plt.subplots(figsize=(10, 10))
ax.scatter(x, y)
ax.plot(x, model.coef_*x + model.intercept_, ":r")
ax.plot(xp, yp, ".k")


#signed residuals from the regression line at each point
residuals = y - (model.coef_*x + model.intercept_)

#absolute deviation from the regression line at each point
deviation = np.abs(residuals)
print(deviation)  #the deviation for each point


#plot a one-standard-deviation band around the regression line
standard_deviation = stat.stdev(residuals)

std_dev = [standard_deviation, -standard_deviation]
for standard in std_dev:
    ax.plot(x, (model.coef_*x + model.intercept_) + standard, "--b")

plt.show()

from sklearn.linear_model import LinearRegression
import statistics as stat


#Set the x and y values
x=np.random.rand(50)
y=2*x-1+np.random.rand(50)



"""
calculate the deviation from y=x at each point
"""

xp=np.linspace(0,1,50)
yp=xp
deviationxy=(y-yp)

listpos=[]
listneg=[]

#Calculate the ratio of the points
[listpos.append(i) for i in deviationxy if i >0]
[listneg.append(i) for i in deviationxy if i <0]

if len(listpos)==len(listneg):
    print("The ratio is 1:1")
else:
    above=(len(listpos)/len(deviationxy))*100
    below=(len(listneg)/len(deviationxy))*100
    print("{0}% of the values are above the line y=x ; {1}% of the values are below the line".format(above,below))


"""
Implement the regression
"""

#coerce the x values in the shape [n_samples,n_features]
X=x[:,np.newaxis]


#inistantiate the model
model=LinearRegression(fit_intercept=True)

#fit the model
model.fit(X,y)


#print the dots and the regression function as well as the fumction x=y
fig,ax=plt.subplots(figsize=(10,10))
ax.scatter(x,y)
ax.plot(x,model.coef_*x+model.intercept_,":r")
ax.plot(xp,yp,".k")


#calculate the devaition from regression at each point
deviation=np.sqrt((y-(model.coef_*x+model.intercept_))**2)
print(deviation)#returns the deviation for each point



#plot the standard_deviation from the regression line

standard_deviation=stat.stdev(x)

std_dev=[standard_deviation,-standard_deviation]
[ax.plot(x,(model.coef_*x+model.intercept_)+standard,"--b") for standard in std_dev]

plt.show()
Comments:

Do you mean y_hat - y? Usually you would see that used, or MAE if you want one… @OP, (predicted value) minus (actual value) is usually called the residual when discussing regression problems. Try searching for "residual" or "residuals" in Pandas, NumPy, or SciPy.
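Following the comments' suggestion, the deviations from y=x are just the residuals of the "prediction" y_hat = x, and NumPy computes them in one line. A minimal sketch with made-up values for illustration:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 0.9, 2.2, 2.8])

# residuals of the 1:1 line: (predicted) minus (actual), with y_hat = x
residuals = x - y

# mean absolute error (MAE) of the 1:1 line as a single summary number
mae = np.mean(np.abs(residuals))
print(residuals, mae)
```

The sign of each residual shows which side of the line the point falls on, and the MAE summarizes the typical size of the departure from y=x.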