Python: exporting regression results to a CSV file when using summary_col
Tags: python, python-3.x, pandas, regression, statsmodels
I am running multiple regressions using financial data from Yahoo! Finance and the Fama-French factors.
Single-factor regression:
# sm.ols with a formula string is the statsmodels formula API (import assumed)
import statsmodels.formula.api as sm
CAPM = sm.ols(formula='Exret ~ MKT', data=m).fit(cov_type='HAC', cov_kwds={'maxlags': 1})
Three-factor regression:
FF3 = sm.ols(formula='Exret ~ MKT + SMB + HML', data=m).fit(cov_type='HAC', cov_kwds={'maxlags': 1})
Then I use summary_col to create a table with significance stars:
from statsmodels.iolib.summary2 import summary_col
dfoutput = summary_col([CAPM, FF3], stars=True, float_format='%0.4f',
                       model_names=['GOOG', 'GOOG'],
                       info_dict={'N': lambda x: "{0:d}".format(int(x.nobs)),
                                  'Adjusted R2': lambda x: "{:.2f}".format(x.rsquared_adj)},
                       regressor_order=['Intercept', 'MKT', 'SMB', 'HML'])
Output:
dfoutput
Out[311]:
<class 'statsmodels.iolib.summary2.Summary'>
"""
=================================
            GOOG I     GOOG II
---------------------------------
Intercept   -0.0009*** -0.0010***
            (0.0003)   (0.0003)
MKT         0.0098***  0.0107***
            (0.0003)   (0.0003)
SMB                    -0.0033***
                       (0.0006)
HML                    -0.0063***
                       (0.0006)
N           1930       1930
Adjusted R2 0.37       0.42
=================================
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01
"""
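On the CSV export mentioned in the title: as far as I know, the Summary object returned by summary_col keeps its table as a pandas DataFrame in its `tables` list, so one option is to write that DataFrame out directly. The sketch below assumes exactly that; the helper name `export_summary_table` and the file name in the usage comment are made up for illustration.

```python
import pandas as pd


def export_summary_table(summary, path, table_index=0):
    """Write one table of a summary2.Summary-like object to a CSV file.

    Assumption: `summary.tables` is a list of pandas DataFrames, which is
    how the object returned by summary_col stores its coefficient table.
    """
    table = summary.tables[table_index]
    # Cells are already formatted strings such as '-0.0009***' or '(0.0003)',
    # so they are written to the file unchanged.
    table.to_csv(path, encoding="utf-8")
    return table


# Usage with the object from the question (needs statsmodels and the data):
# export_summary_table(dfoutput, "goog_regressions.csv")
```

Because the cells are strings, the stars and parentheses survive the round trip, but the CSV has to be re-parsed if numeric values are needed later.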
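It is possible to show t-statistics instead of standard errors in the parentheses, but only by modifying the file summary2.py in the statsmodels library.
Simply replace the function _col_params() in that file with the following version: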
def _col_params(result, float_format='%.4f', stars=True):
    '''Stack coefficients and t-statistics in single column
    '''
    # Extract parameters (columns: coef, std. err., t, p-value, conf. int.)
    res = summary_params(result)
    # Format float
    for col in res.columns[:3]:
        res[col] = res[col].apply(lambda x: float_format % x)
    # t-statistics (column 2) in parentheses
    res.iloc[:, 2] = '(' + res.iloc[:, 2] + ')'
    # Significance stars from the p-values in column 3
    # (.ix was removed in pandas 1.0, so .iloc/.loc are used instead)
    if stars:
        coef = res.columns[0]
        idx = res.iloc[:, 3] < .1
        res.loc[idx, coef] = res.loc[idx, coef] + '*'
        idx = res.iloc[:, 3] < .05
        res.loc[idx, coef] = res.loc[idx, coef] + '*'
        idx = res.iloc[:, 3] < .01
        res.loc[idx, coef] = res.loc[idx, coef] + '*'
    # Stack coefs and t-statistics
    res = res.iloc[:, [0, 2]]
    res = res.stack()
    res = pd.DataFrame(res)
    res.columns = [str(result.model.endog_names)]
    return res
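As expected, the results now show t-statistics rather than standard errors (the footer text is not touched by this change, so it still reads "Standard errors in parentheses."):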
print(results)
================================================
             Model    Model    Model    Model
             (1)      (2)      (3)      (4)
------------------------------------------------
cons         39.44*** 39.44*** 49.68*** 50.02***
             (24.44)  (24.32)  (7.85)   (7.80)
displacement                            0.00
                                        (0.44)
length                         -0.10*   -0.09
                               (-1.67)  (-1.63)
price                 -0.00    -0.00    -0.00
                      (-0.57)  (-1.03)  (-1.03)
weight       -0.01*** -0.01*** -0.00*   -0.00*
             (-11.60) (-9.42)  (-1.72)  (-1.67)
N            74       74       74       74
R2           0.65     0.65     0.67     0.67
================================================
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01
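If editing the installed library is not an option, a similar table can be assembled outside summary_col from the `params`, `tvalues` and `pvalues` attributes that fitted statsmodels results expose. This is only a rough sketch of that idea, not what summary_col does internally; the helper name `coef_tstat_column` and the output file name are hypothetical.

```python
import pandas as pd


def coef_tstat_column(params, tvalues, pvalues, name, float_format="%.4f"):
    """Stack starred coefficients over t-statistics in a single column."""
    params = pd.Series(params, dtype=float)
    tvalues = pd.Series(tvalues, dtype=float).reindex(params.index)
    pvalues = pd.Series(pvalues, dtype=float).reindex(params.index)

    # One star each for p < .1, p < .05 and p < .01 (same rule as the legend)
    n_stars = (
        (pvalues < 0.1).astype(int)
        + (pvalues < 0.05).astype(int)
        + (pvalues < 0.01).astype(int)
    )
    coef = params.map(lambda x: float_format % x) + n_stars.map(lambda n: "*" * int(n))
    tstat = "(" + tvalues.map(lambda x: float_format % x) + ")"

    # Rows come out as (regressor, 'coef') followed by (regressor, 't')
    out = pd.concat({"coef": coef, "t": tstat}, axis=1).stack()
    out.name = name
    return out


# Usage with the fitted models from the question (needs statsmodels and the data):
# table = pd.concat(
#     [coef_tstat_column(r.params, r.tvalues, r.pvalues, n)
#      for r, n in [(CAPM, "GOOG I"), (FF3, "GOOG II")]],
#     axis=1,
# ).fillna("")
# table.to_csv("goog_tstats.csv")
```

The result is an ordinary DataFrame, so it can be written with to_csv directly, and regressors missing from a model simply show up as empty cells after fillna("").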