Python: How to interpret the results of statsmodels and an ordinary least squares model?

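Inspired by this, I want to understand the output of statsmodels' OLS.

I adapted the code to today's requirements and would like to know how to interpret the response of *.outlier_test(): specifically, whether bonf(p) < 0.5 is really the right way to identify outliers.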

The code:

from random import random

import statsmodels.api as sm
from statsmodels.formula.api import ols

# Make data: a noisy linear trend
x = list(range(30))
y = [xi * (10 + random()) + 200 for xi in x]

# Add two outliers at position 6
x.insert(6, 15)
y.insert(6, 220)
x.insert(6, 16)
y.insert(6, 295)

# Make fit
regression = ols("data ~ x", data=dict(data=y, x=x)).fit()

# Find outliers
test = regression.outlier_test()
print("test.columns:", test.columns)
print(test)

# Keep the points whose Bonferroni-adjusted p-value is below 0.5
outliers = ((x[i], y[i]) for i, t in enumerate(test["bonf(p)"]) if t < 0.5)
print("Outliers: ", list(outliers))

# Figure: observed vs. fitted values for regressor 1 (x)
figure = sm.graphics.plot_fit(regression, 1)
# Add the fitted regression line
sm.graphics.abline_plot(model_results=regression, ax=figure.axes[0])
figure.show()
So my question is:

When using a statsmodels ordinary least squares model, is bonf(p) < 0.5 really the right measure for identifying outliers?

The output:

test.columns: Index(['student_resid', 'unadj_p', 'bonf(p)'], dtype='object')

    student_resid       unadj_p       bonf(p)
0        0.256226  7.995850e-01  1.000000e+00
1        0.235436  8.155247e-01  1.000000e+00
2        0.266506  7.917355e-01  1.000000e+00
3        0.195602  8.462860e-01  1.000000e+00
4        0.206646  8.377301e-01  1.000000e+00
5        0.235760  8.152759e-01  1.000000e+00
6       -2.670250  1.229206e-02  3.933460e-01
7       -9.404263  2.609308e-10  8.349786e-09
8        0.160577  8.735400e-01  1.000000e+00
9        0.317017  7.535015e-01  1.000000e+00
10       0.120925  9.045843e-01  1.000000e+00
11       0.249872  8.044476e-01  1.000000e+00
12       0.250744  8.037804e-01  1.000000e+00
13       0.399508  6.924460e-01  1.000000e+00
14       0.313912  7.558347e-01  1.000000e+00
15       0.187027  8.529415e-01  1.000000e+00
16       0.019263  9.847634e-01  1.000000e+00
17       0.038839  9.692847e-01  1.000000e+00
18       0.015481  9.877546e-01  1.000000e+00
19       0.417676  6.792601e-01  1.000000e+00
20       0.153612  8.789799e-01  1.000000e+00
21       0.201890  8.414121e-01  1.000000e+00
22       0.540464  5.930042e-01  1.000000e+00
23       0.216489  8.301220e-01  1.000000e+00
24      -0.156133  8.770102e-01  1.000000e+00
25       0.477092  6.368722e-01  1.000000e+00
26       0.246855  8.067600e-01  1.000000e+00
27       0.494958  6.243592e-01  1.000000e+00
28       0.413796  6.820681e-01  1.000000e+00
29       0.067460  9.466782e-01  1.000000e+00
30       0.165854  8.694224e-01  1.000000e+00
31       0.511132  6.131286e-01  1.000000e+00
Outliers:  [(16, 295), (15, 220)]
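
To make the bonf(p) column easier to read, here is a minimal sketch of how it appears to relate to unadj_p in the output above: each unadjusted p-value is multiplied by the number of observations (32 rows here) and capped at 1, which is the usual Bonferroni adjustment. The numbers are copied from rows 0, 6 and 7 of the printed table; the helper name bonferroni and the 0.05 comparison level are assumptions added for illustration, not part of the original code.

```python
# Unadjusted p-values copied from rows 0, 6 and 7 of the table above.
n_obs = 32  # the table has rows 0..31
unadj_p = {0: 7.995850e-01, 6: 1.229206e-02, 7: 2.609308e-10}


def bonferroni(p, n):
    """Hypothetical helper: Bonferroni adjustment (multiply by n, cap at 1)."""
    return min(1.0, p * n)


bonf_p = {row: bonferroni(p, n_obs) for row, p in unadj_p.items()}

for row, p in sorted(bonf_p.items()):
    print(f"row {row}: unadj_p = {unadj_p[row]:.6e}, bonf(p) = {p:.6e}")

# Rows flagged at the question's 0.5 cutoff versus an assumed 0.05 level.
flagged_050 = sorted(row for row, p in bonf_p.items() if p < 0.5)
flagged_005 = sorted(row for row, p in bonf_p.items() if p < 0.05)
print("bonf(p) < 0.5 :", flagged_050)
print("bonf(p) < 0.05:", flagged_005)
```

Up to display rounding, the recomputed values agree with the bonf(p) column. With these numbers, the 0.5 cutoff flags rows 6 and 7 (the two inserted points), while a 0.05 level would flag only row 7, so the choice of threshold changes which points are reported as outliers.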