Python: OLS regression based on groupby
I want to run an OLS regression for each group using pandas and groupby. I am trying the following code:
import pandas as pd
from pandas.stats.api import ols
df=pd.read_csv(r'F:\File.csv')
result=df.groupby(['FID']).apply(lambda x: ols(y=df[x['MEAN']], x=df[x['Accum_Prcp'],x['Accum_HDD']]))
print result
But this raises:
File "C:\Users\spotter\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 1150, in _convert_to_indexer
raise KeyError('%s not in index' % objarr[mask])
KeyError: '[ 0.84978328 0.72115778 0.53965104 0.52955655 0.73372541 0.64617074\n 0.60040938 0.7147218 0.65533535 0.57980322 0.57382068 0.56543435\n 0.70740831 0.9245337 0.54859569 0.6789395 0.7086157 0.3835853\n 0.54924104 0.80813778 0.83758118 0.22673391 0.26594087 0.63650468\n 0.89889911 0.38324657 0.30235986 0.62922678 0.55219822 0.55950705\n 0.71137557 0.53631811 0.70158798 0.87116361 0.93751381 0.91125518\n 0.80020908 0.75301262 0.82391046 0.77483673 0.63069573 0.44954455\n 0.83578862 0.56338649 0.64236039 0.93270243 0.93077291 0.83847668\n 0.8268959 0.85400317 0.74319769 0.94803537 0.97484929 0.45366017\n 0.80823694 0.82028051 0.63960395 0.63015722 0.73132888 0.55570184\n 0.83265402 0.75009687 0.58207032 0.92064804 0.91058008 0.86726397\n 0.89204098 0.95573514 0.75704367 0.80786363 0.87448548 0.7553715\n 0.88965962 0.82828493 0.82423891 0.81034742 0.90104876 0.78875473\n 0.97369268] not in index'
Is there something wrong with my syntax?
Doing this without groupby would look like this:
result = ols(y=df['MEAN'], x=df[['Accum_HDD','Accum_Prcp']])
This works correctly.
My dataframe looks like this:
FID Image_Date MEAN Accum_Prcp Accum_HDD
1 19920506 2.0 500.0 1000.0
1 19930506 1.7 450.0 1050.0
2 19920506 2.7 456.0 992.0
2 19930506 1.9 376.0 800.0
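Try indexing the group x that groupby passes to the lambda, rather than using the group's values as column labels for df, i.e. ols(y=x['MEAN'], x=x[['Accum_Prcp', 'Accum_HDD']]).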
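The cause of the KeyError can be checked on the four sample rows from the question. The following is only a minimal sketch: the names group and raised are illustrative, and it uses plain pandas without fitting any model. It shows that df[x['MEAN']] looks up the values of MEAN as column labels, which is why the traceback lists floats as "not in index", while x['MEAN'] selects the column from the group itself.

```python
import pandas as pd

# The four sample rows shown in the question.
df = pd.DataFrame({
    "FID": [1, 1, 2, 2],
    "Image_Date": [19920506, 19930506, 19920506, 19930506],
    "MEAN": [2.0, 1.7, 2.7, 1.9],
    "Accum_Prcp": [500.0, 450.0, 456.0, 376.0],
    "Accum_HDD": [1000.0, 1050.0, 992.0, 800.0],
})

# One group, as groupby would hand it to the lambda (illustrative name).
group = df[df["FID"] == 1]

# Original pattern: the MEAN values (2.0, 1.7) are treated as column labels.
try:
    df[group["MEAN"]]
    raised = False
except KeyError:
    raised = True

# Corrected pattern: select the columns from the group directly.
y = group["MEAN"]
X = group[["Accum_Prcp", "Accum_HDD"]]

print(raised)
print(X)
```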
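Printing result then returns:
FID
3    \n----------------------Summary of Regression Analysis...
4    \n--------------------Summary of Regression Analysis...
5    \n--------------------Summary of Regression Analysis...
6    \n--------------------Summary of Regression Analysis...
7    \n--------------------Summary of Regression...
So basically it returns a string beginning with \n for each item in the groupby.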
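Alternatively, it may simply not be possible to print every item this way and inspect them in the Python environment. You will probably want to iterate over the groupby object instead.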