Python Numba error "sequence item 0: expected str instance, type found"


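I want to select variables in a multiple regression analysis. I tried using this code. The problem is that I want to select from 50 variables, and that takes too long. I used Numba to speed it up and wrote the following code: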

from numba import jit
import statsmodels.formula.api as smf

@jit
def forward_selected(data, response):
    """Linear model designed by forward selection.

    Parameters:
    -----------
    data : pandas DataFrame with all possible predictors and response

    response: string, name of response column in data

    Returns:
    --------
    model: an "optimal" fitted statsmodels linear model
           with an intercept
           selected by forward selection
           evaluated by adjusted R-squared
    """
    remaining = set(data.columns)
    remaining.remove(response)
    selected = [str]
    current_score, best_new_score = 0.0, 0.0
    while remaining and current_score == best_new_score:
        scores_with_candidates = [str]
        for candidate in remaining:
            formula = "{} ~ {} + 1".format(response,
                                           ' + '.join(selected + [candidate]))
            score = smf.ols(formula, data).fit().rsquared_adj
            scores_with_candidates.append((score, candidate))
        scores_with_candidates.sort()
        best_new_score, best_candidate = scores_with_candidates.pop()
        if current_score < best_new_score:
            remaining.remove(best_candidate)
            selected.append(best_candidate)
            current_score = best_new_score
    formula = "{} ~ {} + 1".format(response,
                                   ' + '.join(selected))
    model = smf.ols(formula, data).fit()
    return model

model = forward_selected(df, col)

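But it returns the following error:

TypeError: sequence item 0: expected str instance, type found

Please tell me how to fix it. If you don't understand my question, I'm happy to provide more information in the comments.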

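Traceback (most recent call last):
  File "~/PycharmProjects/anacondaenv/touhu_1.py", line 164, in <module>
    submit = predict()
  File "~/PycharmProjects/anacondaenv/touhu_1.py", line 75, in predict
    model = forward_selected(df, col)
TypeError: sequence item 0: expected str instance, type found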

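I think one of the best ways to see whether numba really works as a booster is to try the njit decorator instead of jit. njit does not allow any Python (object) mode and breaks as soon as anything falls back to Python, which is what you want, because object mode offers no speed advantage at all. Short answer: don't use anything except np.ndarray. So no strings, no tuples, no lists, and no calls to un-jitted functions.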
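To make the njit advice concrete, here is a minimal sketch of the kind of function numba can compile in nopython mode: an explicit loop over NumPy arrays with scalar arithmetic only. The function name `residual_sum_of_squares` and the sample arrays are made up for illustration, and the `ImportError` fallback is only an assumption so the snippet still runs where numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for illustration: without numba, run the plain Python function.
    def njit(func):
        return func


@njit
def residual_sum_of_squares(y, y_hat):
    """Sum of squared residuals, using only arrays and scalars."""
    total = 0.0
    for i in range(y.shape[0]):
        diff = y[i] - y_hat[i]
        total += diff * diff
    return total


# Made-up sample data.
y = np.array([1.0, 2.0, 3.0])
y_hat = np.array([1.0, 2.5, 2.0])
rss = residual_sum_of_squares(y, y_hat)
print(rss)
```

If a function decorated with njit touches a pandas DataFrame or calls statsmodels, compilation fails instead of silently falling back, which makes it obvious that numba is not helping there.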

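So I fixed the error: numba does not allow an empty list to be created in the main function body (I don't know why, maybe it's a bug?!), but if you move it inside the while block, it works.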
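Independently of numba, the exact message from the question can be reproduced in plain Python: `selected = [str]` creates a list whose first item is the type object `str`, and `str.join` rejects it. A minimal sketch, where `"rk"` is just a placeholder candidate name:

```python
selected = [str]  # contains the type object str, not a string

try:
    formula_rhs = " + ".join(selected + ["rk"])
except TypeError as exc:
    message = str(exc)
    print(message)  # sequence item 0: expected str instance, type found

# Starting from an empty list, the join works as intended.
fixed_rhs = " + ".join([] + ["rk"])
print(fixed_rhs)  # rk
```

The same applies to `scores_with_candidates = [str]` in the question: starting both lists from `[]` keeps the type object out of them.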

The results are the same; now let's look at their performance:

# with numba
10 loops, best of 3: 264 ms per loop

# without numba
10 loops, best of 3: 252 ms per loop

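So this is exactly what I expected. If you use Python types and call un-jitted external functions, you get no speed gain. It may be possible to make this faster with numba, but make sure you read through the numba documentation and check what is supported.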

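One of the most useful tools in debugging is the traceback. Please provide it.

Using numba on this function is unlikely to improve its performance, because I guess most of the work is done by statsmodels, and numba can do nothing about that. In general, Numba can only speed up functions that perform pure scalar or array-based operations. If there is data preparation that fits this narrow focus, I would separate it out into a different function and then pass the result to statsmodels.

Thank you for your comments. If you know of a way to select variables, please tell me.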
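Following up on the comment about separating out the array-based work, one possible direction is to skip the formula machinery and score each candidate with NumPy least squares. This is only a sketch of a swapped-in technique, not the statsmodels approach from the question: it assumes all predictors are numeric columns of a NumPy array, and the names `adjusted_r2`, `forward_select`, and the synthetic data are hypothetical.

```python
import numpy as np


def adjusted_r2(X, y):
    """Adjusted R-squared of an OLS fit of y on X plus an intercept."""
    n = y.shape[0]
    p = X.shape[1]  # number of predictors, intercept excluded
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - (ss_res / (n - p - 1)) / (ss_tot / (n - 1))


def forward_select(X, y, names):
    """Greedy forward selection on adjusted R-squared."""
    remaining = list(range(X.shape[1]))
    selected = []
    current = 0.0
    while remaining:
        scored = [(adjusted_r2(X[:, selected + [j]], y), j) for j in remaining]
        best_score, best_j = max(scored)
        if best_score <= current:
            break  # no candidate improves the score
        remaining.remove(best_j)
        selected.append(best_j)
        current = best_score
    return [names[j] for j in selected], current


# Synthetic data, made up for illustration: y depends on x0 and x2 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.1, size=200)
chosen, score = forward_select(X, y, ["x0", "x1", "x2", "x3", "x4"])
print(chosen, score)
```

Once the candidates are chosen this way, the final model can still be fitted with statsmodels to get the full summary.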

For reference, both versions select the same formula and reach the same score:

# With numba
sl ~ rk + yr + 1
0.835190760538

# Without numba
sl ~ rk + yr + 1
0.835190760538
