Python: confidence intervals for predictions with statsmodels

I am building a linear model as follows:

import statsmodels.api as sm
from statsmodels.stats.outliers_influence import summary_table
import numpy as np
import random

x = np.arange(1,101, 1)
y = random.sample(range(1,1000), 100)

X = sm.add_constant(x)
regr = sm.OLS(y, X)
fit = regr.fit()

st, data, ss2 = summary_table(fit, alpha=0.05)
From the returned data array I can read off the standard errors and confidence intervals for the fitted observations.

Now I want to get confidence intervals for predictions on new data, so I tried this:

new_data = [102, 103, 104, 105]

fit.get_prediction(new_data)
But this raises:

Traceback (most recent call last):

  File "<ipython-input-168-372d2610946d>", line 14, in <module>
    fit.get_prediction(new)

  File "/Users/spotter/anaconda3/lib/python3.6/site-packages/statsmodels/regression/linear_model.py", line 2138, in get_prediction
    weights=weights, row_labels=row_labels, **kwds)

  File "/Users/user/anaconda3/lib/python3.6/site-packages/statsmodels/regression/_prediction.py", line 163, in get_prediction
    predicted_mean = self.model.predict(self.params, exog, **pred_kwds)

  File "/Users/user/anaconda3/lib/python3.6/site-packages/statsmodels/regression/linear_model.py", line 261, in predict
    return np.dot(exog, params)

ValueError: shapes (1,4) and (2,) not aligned: 4 (dim 1) != 2 (dim 0

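Since you fitted the model with an intercept, the new data must also include the intercept, i.e. a column of 1s, so that it has the same two columns as the X used for fitting: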

new_data = sm.add_constant([102, 103, 104, 105])
result = fit.get_prediction(new_data)
result.conf_int()