Python linear regression: reducing degrees of freedom
I have a pandas DataFrame with the following columns:
Order Balance Profit cum (%)
I am running a linear regression:
model_profit_tr = pd.ols(y=df_closed['Profit cum (%)'], x=df_closed['Order'])
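The problem is that the standard model is the equation of a line that does not pass through the origin:
y = a * x + b
It has two degrees of freedom (a and b): the slope (a) and the intercept (b).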
I want to reduce the degrees of freedom of my model from 2 to 1, so that the model looks like this:
y = a * x
Use the intercept keyword argument:
model_profit_tr = pd.ols(y=df_closed['Profit cum (%)'],
x=df_closed['Order'],
intercept=False)
From the documentation:
In [65]: help(pandas.ols)
Help on function ols in module pandas.stats.interface:
ols(**kwargs)
[snip]
Parameters
----------
y: Series or DataFrame
See above for types
x: Series, DataFrame, dict of Series, dict of DataFrame, Panel
weights : Series or ndarray
The weights are presumed to be (proportional to) the inverse of the
variance of the observations. That is, if the variables are to be
transformed by 1/sqrt(W) you must supply weights = 1/W
intercept: bool
True if you want an intercept. Defaults to True.
nw_lags: None or int
Number of Newey-West lags. Defaults to None.
[snip]
Here is an example that shows the solution:
#!/usr/bin/env python
import pandas as pd
import matplotlib.pylab as plt
import numpy as np
data = [
(0.2, 1.3),
(1.3, 3.9),
(2.1, 4.8),
(2.9, 5.5),
(3.3, 6.9)
]
df = pd.DataFrame(data, columns=['X', 'Y'])
print(df)
# 2 degrees of freedom : slope / intercept
model_with_intercept = pd.ols(y=df['Y'], x=df['X'], intercept=True)
df['Y_fit_with_intercept'] = model_with_intercept.y_fitted
# 1 degree of freedom : slope ; intercept=0
model_no_intercept = pd.ols(y=df['Y'], x=df['X'], intercept=False)
df['Y_fit_no_intercept'] = model_no_intercept.y_fitted
# 1 degree of freedom : slope ; intercept=offset
offset = -1
df['Yoffset'] = df['Y'] - offset
model_with_offset = pd.ols(y=df['Yoffset'], x=df['X'], intercept=False)
df['Y_fit_offset'] = model_with_offset.y_fitted + offset
print(model_with_intercept)
print(model_no_intercept)
print(model_with_offset)
df.plot(x='X', y=['Y', 'Y_fit_with_intercept', 'Y_fit_no_intercept', 'Y_fit_offset'])
plt.show()
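The example above relies on pd.ols from the old pandas.stats module, which is not available in recent pandas releases. As a rough sketch of the same three fits using only NumPy, something like the following should work. The helper fit_line is a hypothetical name introduced here for illustration and is not part of pandas or NumPy.

```python
import numpy as np

# Same sample data as in the example above.
x = np.array([0.2, 1.3, 2.1, 2.9, 3.3])
y = np.array([1.3, 3.9, 4.8, 5.5, 6.9])


def fit_line(x, y, intercept=True):
    """Least-squares fit of y = a*x + b, or y = a*x when intercept=False.

    Hypothetical helper for illustration only. Returns (a, b).
    """
    # The design matrix has a column of ones only when an intercept is fitted.
    columns = [x, np.ones_like(x)] if intercept else [x]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    a = coef[0]
    b = coef[1] if intercept else 0.0
    return a, b


# 2 degrees of freedom: slope and intercept
a2, b2 = fit_line(x, y, intercept=True)

# 1 degree of freedom: slope only, intercept forced to 0
a1, _ = fit_line(x, y, intercept=False)

# 1 degree of freedom: slope only, intercept forced to `offset`
offset = -1
a_off, _ = fit_line(x, y - offset, intercept=False)

print("slope, intercept:", a2, b2)
print("slope through origin:", a1)
print("slope with intercept fixed at", offset, ":", a_off)
```

Dropping the column of ones from the design matrix is what removes the second degree of freedom; the fixed-offset case reuses the through-origin fit on the shifted data, as in the pd.ols version.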
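Thanks a lot (for the solution and for the help tip)! I have another question, but I don't know if I can ask it here... If I want to set the intercept to a given value (not 0), how should I do it? (I would also be reducing the degrees of freedom from 2 to 1.)
@FemtoTrader: I don't think ols has that functionality. However, since it is least squares, you can subtract that intercept from y and then use ols with intercept=False. It should be the same.
No, that's not the same! The slope is different if you force the line to pass through a given point.
@FemtoTrader: No, in least squares, fitting a*x+B to y with a fixed B is the same as fitting a*x to y-B.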
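The equivalence claimed in the last comment can be checked numerically. The residual y - (a*x + B) is the same quantity as (y - B) - a*x, so both formulations minimize the same sum of squares. The sketch below assumes the sample data from the answer and an arbitrarily chosen fixed intercept B = -1; the function names are made up for illustration.

```python
import numpy as np

x = np.array([0.2, 1.3, 2.1, 2.9, 3.3])
y = np.array([1.3, 3.9, 4.8, 5.5, 6.9])
B = -1.0  # fixed intercept, chosen arbitrarily for illustration


def sse_fixed_intercept(a):
    """Sum of squared errors of a*x + B against y."""
    return np.sum((y - (a * x + B)) ** 2)


def sse_shifted(a):
    """Sum of squared errors of a*x against y - B."""
    return np.sum(((y - B) - a * x) ** 2)


# Closed-form least-squares slope for the shifted, through-origin problem.
a_hat = np.sum(x * (y - B)) / np.sum(x * x)

# Compare the two objectives on a range of candidate slopes around a_hat.
slopes = np.linspace(a_hat - 1.0, a_hat + 1.0, 201)
same_objective = all(
    np.isclose(sse_fixed_intercept(a), sse_shifted(a)) for a in slopes
)
a_hat_is_minimum = all(
    sse_fixed_intercept(a_hat) <= sse_fixed_intercept(a) + 1e-12 for a in slopes
)

print(a_hat, same_objective, a_hat_is_minimum)
```

The slope obtained this way does differ from the slope of the unconstrained two-parameter fit, which is likely what the objection in the comments refers to; the point is only that the two one-parameter formulations agree with each other.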