Python: calculating each point's deviation from the line y=x for bivariate data


As shown in the figure below, I have bivariate data that, in the ideal case, would fall on the line y=x. In Python, how can I calculate each point's deviation from that line (y=x)? Is it also possible to quantify the average deviation from that line? I just want a way to quantify how my data departs from a 1:1 relationship. Any suggestions would be appreciated. The data is in a pandas DataFrame. Thanks.
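Since the data lives in a pandas DataFrame, one direct way is to subtract one column from the other: for the line y=x, the per-point deviation is simply y minus x. A minimal sketch, assuming two hypothetical columns named `a` and `b` (the actual column names will differ):

```python
import pandas as pd

# hypothetical bivariate data; ideally column b would equal column a
df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0],
                   "b": [1.1, 1.8, 3.4, 3.9]})

# signed deviation of each point from the line y = x
df["deviation"] = df["b"] - df["a"]

# two common summaries of the average deviation from the 1:1 line
mad = df["deviation"].abs().mean()            # mean absolute deviation
rmse = (df["deviation"] ** 2).mean() ** 0.5   # root-mean-square deviation
print(mad, rmse)
```

The signed deviation tells you whether a point sits above or below the line; the mean absolute deviation or RMSE gives a single number summarizing how far the data is from the 1:1 relationship.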


This code calculates each point's deviation from both the regression line and the line y=x, and also plots the standard deviation band along with the regression line and the y=x line.

import numpy as np
import matplotlib.pyplot as plt
import statistics as stat
from sklearn.linear_model import LinearRegression


#Set the x and y values
x = np.random.rand(50)
y = 2*x - 1 + np.random.rand(50)


"""
calculate the deviation from y=x at each point
"""

#points on the line y=x, used for plotting it below
xp = np.linspace(0, 1, 50)
yp = xp

#signed deviation of each data point from the line y=x
deviationxy = y - x

#split the deviations into points above and below the line
listpos = [i for i in deviationxy if i > 0]
listneg = [i for i in deviationxy if i < 0]

#Calculate the ratio of the points
if len(listpos) == len(listneg):
    print("The ratio is 1:1")
else:
    above = (len(listpos)/len(deviationxy))*100
    below = (len(listneg)/len(deviationxy))*100
    print("{0}% of the values are above the line y=x ; {1}% of the values are below the line".format(above, below))


"""
Implement the regression
"""

#coerce the x values into the shape [n_samples, n_features]
X = x[:, np.newaxis]


#instantiate the model
model = LinearRegression(fit_intercept=True)

#fit the model
model.fit(X, y)


#plot the points and the regression function as well as the line y=x
fig, ax = plt.subplots(figsize=(10, 10))
ax.scatter(x, y)
ax.plot(x, model.coef_*x + model.intercept_, ":r")
ax.plot(xp, yp, ".k")


#signed residuals from the regression line at each point
residuals = y - (model.coef_*x + model.intercept_)

#absolute deviation from the regression line at each point
deviation = np.abs(residuals)
print(deviation)  #the deviation for each point


#plot a one-standard-deviation band around the regression line
standard_deviation = stat.stdev(residuals)

std_dev = [standard_deviation, -standard_deviation]
for standard in std_dev:
    ax.plot(x, (model.coef_*x + model.intercept_) + standard, "--b")

plt.show()

from sklearn.linear_model import LinearRegression
import statistics as stat


#Set the x and y values
x=np.random.rand(50)
y=2*x-1+np.random.rand(50)



"""
calculate the deviation from y=x at each point
"""

xp=np.linspace(0,1,50)
yp=xp
deviationxy=(y-yp)

listpos=[]
listneg=[]

#Calculate the ratio of the points
[listpos.append(i) for i in deviationxy if i >0]
[listneg.append(i) for i in deviationxy if i <0]

if len(listpos)==len(listneg):
    print("The ratio is 1:1")
else:
    above=(len(listpos)/len(deviationxy))*100
    below=(len(listneg)/len(deviationxy))*100
    print("{0}% of the values are above the line y=x ; {1}% of the values are below the line".format(above,below))


"""
Implement the regression
"""

#coerce the x values in the shape [n_samples,n_features]
X=x[:,np.newaxis]


#inistantiate the model
model=LinearRegression(fit_intercept=True)

#fit the model
model.fit(X,y)


#print the dots and the regression function as well as the fumction x=y
fig,ax=plt.subplots(figsize=(10,10))
ax.scatter(x,y)
ax.plot(x,model.coef_*x+model.intercept_,":r")
ax.plot(xp,yp,".k")


#calculate the devaition from regression at each point
deviation=np.sqrt((y-(model.coef_*x+model.intercept_))**2)
print(deviation)#returns the deviation for each point



#plot the standard_deviation from the regression line

standard_deviation=stat.stdev(x)

std_dev=[standard_deviation,-standard_deviation]
[ax.plot(x,(model.coef_*x+model.intercept_)+standard,"--b") for standard in std_dev]

plt.show()
Comments:

Do you mean y_hat - y? Usually you would see that used, or MAE if you want one… @OP, (predicted value) minus (actual value) is usually called the residual when discussing regression problems. Try searching for "residual" or "residuals" in Pandas, NumPy, or SciPy.
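Following the comments' suggestion, the deviations from y=x are just the residuals of the "prediction" y_hat = x, and NumPy computes them in one line. A minimal sketch with made-up values for illustration:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 0.9, 2.2, 2.8])

# residuals of the 1:1 line: (predicted) minus (actual), with y_hat = x
residuals = x - y

# mean absolute error (MAE) of the 1:1 line as a single summary number
mae = np.mean(np.abs(residuals))
print(residuals, mae)
```

The sign of each residual shows which side of the line the point falls on, and the MAE summarizes the typical size of the departure from y=x.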