Python: rolling OLS error in pandas 0.12.0


I have the following sample data for a rolling OLS calculation (here I am running it through the debugger):

When I try the rolling OLS:

(Pdb) pandas.ols(y=df[lhs], x=df[rhs], window=window, min_periods=min_periods, intercept=intercept)
*** TypeError: unsupported operand type(s) for +: 'slice' and 'int'
However, a regular OLS over the whole data range seems to work:

(Pdb) pandas.ols(y=df[lhs], x=df[rhs], intercept=intercept)

-------------------------Summary of Regression Analysis-------------------------

Formula: Y ~ <Yield> + <intercept>

Number of Observations:         38
Number of Degrees of Freedom:   2

R-squared:         0.0226
Adj R-squared:    -0.0046

Rmse:             12.5182

F-stat (1, 36):     0.8321, p-value:     0.3677

Degrees of Freedom: model 1, resid 36

-----------------------Summary of Estimated Coefficients------------------------
      Variable       Coef    Std Err     t-stat    p-value    CI 2.5%   CI 97.5%
--------------------------------------------------------------------------------
         Yield   146.6702   160.7874       0.91     0.3677  -168.4732   461.8135
     intercept    -4.6083     6.0652      -0.76     0.4523   -16.4961     7.2795
---------------------------------End of Summary---------------------------------
Added

The offending code appears to be in the following function from ols.py in Pandas 0.12:

def _cum_xx(self, x):
    dates = self._index
    K = len(x.columns)
    valid = self._time_has_obs
    cum_xx = []

    slicer = lambda df, dt: df.truncate(dt, dt).values
    if not self._panel_model:
        _get_index = x.index.get_loc

        def slicer(df, dt):
            i = _get_index(dt)
            return df.values[i:i + 1, :]

    last = np.zeros((K, K))

    for i, date in enumerate(dates):
        if not valid[i]:
            cum_xx.append(last)
            continue

        x_slice = slicer(x, date)
        xx = last = last + np.dot(x_slice.T, x_slice)
        cum_xx.append(xx)

    return cum_xx
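_get_index is a proxy for x.index.get_loc, which can return a slice object. However, the code that follows in slicer assumes the value i obtained this way is an integer, so that i + 1 makes sense.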
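To make the mismatch concrete, here is a small sketch using a made-up index (not the data from the question). It assumes a reasonably recent pandas, so details may differ from 0.12. It shows get_loc giving an integer for a unique label and a slice for a label duplicated in a sorted index, then mimics the i + 1 arithmetic from slicer.

```python
import pandas as pd

# Hypothetical index for illustration only: sorted, with the label "b" duplicated.
idx = pd.Index(["a", "b", "b", "c"])

loc_unique = idx.get_loc("a")  # position of a label that occurs once
loc_dup = idx.get_loc("b")     # position(s) of a label that occurs twice

print(type(loc_unique).__name__, type(loc_dup).__name__)

# Mimic the `i:i + 1` arithmetic performed by `slicer` in `_cum_xx`.
try:
    loc_dup + 1
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

If the index in the question behaved like this one, the i + 1 expression would fail in the same way as the reported traceback.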

I found the source code for get_loc. It turns out that x.index.get_loc is a proxy for x.index._engine.get_loc. In my case, the _engine_type of the relevant index when the error occurs is ObjectEngine, and get_loc is defined here:

cpdef get_loc(self, object val):
    if is_definitely_invalid_key(val):
        raise TypeError

    if self.over_size_threshold and self.is_monotonic:
        if not self.is_unique:
            return self._get_loc_duplicates(val)
        values = self._get_index_values()
        loc = _bin_search(values, val) # .searchsorted(val, side='left')
        if util.get_value_at(values, loc) != val:
            raise KeyError(val)
        return loc

    self._ensure_mapping_populated()
    if not self.unique:
        return self._get_loc_duplicates(val)

    self._check_type(val)

    try:
        return self.mapping.get_item(val)
    except TypeError:
        raise KeyError(val)

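I am looking into when and why get_loc returns a slice for me (there are definitely no duplicates in the index, which is the only case the documentation suggests would cause this). In the meantime, any suggestions along these lines would be helpful.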
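One way to test the "no duplicates" assumption is to ask the index directly. The sketch below is only a diagnostic idea, not part of pandas: describe_get_loc is a hypothetical helper name, and it assumes a recent pandas in which Index.is_unique, Index.is_monotonic_increasing and Index.duplicated are available. It reports which labels make get_loc return something other than an integer.

```python
import numpy as np
import pandas as pd


def describe_get_loc(index):
    """Hypothetical helper: summarize an index and flag non-integer get_loc results."""
    non_integer_locs = {}
    for label in index.unique():
        loc = index.get_loc(label)
        # Anything other than an integer (e.g. a slice or a boolean mask)
        # would break code that computes `i + 1`.
        if not isinstance(loc, (int, np.integer)):
            non_integer_locs[label] = type(loc).__name__
    return {
        "is_unique": index.is_unique,
        "is_monotonic_increasing": index.is_monotonic_increasing,
        "duplicated_labels": list(index[index.duplicated()].unique()),
        "non_integer_locs": non_integer_locs,
    }


# Made-up example index; replace with the index of the regressor frame under test.
report = describe_get_loc(pd.Index(["a", "b", "b", "c"]))
print(report)
```

Running the same check on df[rhs].index in the debugger should show whether the index really is unique and which labels, if any, map to a slice.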

Could it be that your index is not numeric?