Python pearsonr: TypeError: No loop matching the specified signature and casting was found for ufunc add


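I have a time-series DataFrame named "df". It has one column and the following shape: (2000, 1). The head of the DataFrame, shown below, illustrates its structure: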

            Weight
Date    
2004-06-01  1.9219
2004-06-02  1.8438
2004-06-03  1.8672
2004-06-04  1.7422
2004-06-07  1.8203

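Goal

I am trying to use a for loop to calculate the correlation between percentage changes of the "Weight" variable over different time periods, or time lags. This is done to assess the effect of keeping livestock for periods of different lengths.

The loop can be found below: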

from scipy.stats import pearsonr

# Loop for producing combinations of different timelags and holddays 
# and calculating the pearsonr correlation and p-value of each combination 

for timelags in [1, 5, 10, 25, 60, 120, 250]:
    for holddays in [1, 5, 10, 25, 60, 120, 250]:
        weight_change_lagged = df.pct_change(periods=timelags)
        weight_change_future = df.shift(-holddays).pct_change(periods=holddays)

        if (timelags >= holddays):
            indepSet=range(0, weight_change_lagged.shape[0], holddays)
        else:
            indepSet=range(0, weight_change_lagged.shape[0], timelags)

        weight_change_lagged = weight_change_lagged.iloc[indepSet]
        weight_change_future = weight_change_future.iloc[indepSet]

        not_na = (weight_change_lagged.notna() & weight_change_future.notna()).values

        (correlation, p_value) = pearsonr(weight_change_lagged[not_na], weight_change_future[not_na])
        print('%4i %4i %7.4f %7.4f' % (timelags, holddays, correlation, p_value))

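The loop itself runs fine, but it fails when calculating the Pearson correlation and p-value, i.e. at this line:

(correlation, p_value) = pearsonr(weight_change_lagged[not_na], weight_change_future[not_na])

It produces the following error:

TypeError: No loop matching the specified signature and casting was found for ufunc add

Does anyone know how to fix my problem? I have searched and found no answers that fit my exact case.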

Through random tinkering, I managed to solve my problem as follows:

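SciPy's pearsonr function only accepts arrays or array-like inputs. This means that:

NumPy arrays of the input variables work
pandas Series of the input variables work

However, complete DataFrames of the variables do not work, even if they contain only one column.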
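To make the distinction concrete, here is a minimal sketch of how the three input types differ in dimensionality. The toy data and the names `toy`, `other`, `as_frame`, `as_series` and `as_array` are made up for illustration; only the "Weight" column name comes from the question.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr

# Hypothetical toy data; only the column name "Weight" mirrors the question.
rng = np.random.default_rng(0)
toy = pd.DataFrame({"Weight": rng.normal(size=50)})
other = pd.DataFrame({"Weight": rng.normal(size=50)})

as_frame = toy[["Weight"]]       # double brackets: DataFrame, two-dimensional
as_series = toy["Weight"]        # single brackets: Series, one-dimensional
as_array = as_series.to_numpy()  # NumPy array, one-dimensional

for obj in (as_frame, as_series, as_array):
    print(type(obj).__name__, obj.ndim, obj.shape)

# One-dimensional inputs are the ones the answer reports as working.
r_series, p_series = pearsonr(as_series, other["Weight"])
r_array, p_array = pearsonr(as_array, other["Weight"].to_numpy())
```

Checking `.ndim` (or `.shape`) before the call is a quick way to see whether a variable is still a one-column DataFrame.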

So I edited the problematic section of my code as follows:

# Define an object containing observations that are not NA
not_na = (weight_change_lagged.notna() & weight_change_future.notna()).values

# Remove NA values before passing the data to the pearsonr function (not inside the call, as I had done):
weight_change_lagged = weight_change_lagged[not_na]
weight_change_future = weight_change_future[not_na]

# Pass pandas Series of the future and lagged variables to the function
(correlation, p_value) = pearsonr(weight_change_lagged['Weight'], weight_change_future['Weight'])

With that small edit, the code executes without a problem.
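For anyone who wants to run the whole thing, here is a sketch of the full loop with the same idea applied from the start: select the "Weight" Series once, so every slice stays one-dimensional. The data is synthetic (a random walk standing in for the real `df`), and `fill_method=None` is my own assumption to keep trailing NaN values from being forward-filled. The original post relied on the pandas default, so results near the end of the series can differ slightly.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr

# Synthetic stand-in for the question's df: 2000 business days, one "Weight" column.
rng = np.random.default_rng(42)
dates = pd.bdate_range("2004-06-01", periods=2000)
weights = 2.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, size=2000)))
df = pd.DataFrame({"Weight": weights}, index=dates)
df.index.name = "Date"

weight = df["Weight"]  # a Series, so everything derived from it is 1-D
results = []

for timelags in [1, 5, 10, 25, 60, 120, 250]:
    for holddays in [1, 5, 10, 25, 60, 120, 250]:
        # Assumption: fill_method=None leaves NaN values unfilled
        lagged = weight.pct_change(periods=timelags, fill_method=None)
        future = weight.shift(-holddays).pct_change(periods=holddays, fill_method=None)

        # Sample every min(timelags, holddays) rows, as in the original if/else
        step = min(timelags, holddays)
        lagged = lagged.iloc[::step]
        future = future.iloc[::step]

        # Keep only rows where both changes are defined
        not_na = lagged.notna() & future.notna()
        correlation, p_value = pearsonr(lagged[not_na], future[not_na])
        results.append((timelags, holddays, correlation, p_value))
        print('%4i %4i %7.4f %7.4f' % (timelags, holddays, correlation, p_value))
```

Because the data is random, the printed correlations carry no meaning; the point is only that the loop runs end to end with Series inputs.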

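Note:

If you use double square brackets, as shown below, the input is a DataFrame rather than a Series, and the pearsonr function will throw an error: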

weight_change_future[['Weight']]

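Thanks to everyone who tried to help; your questions led me to the answer.

In my case it was not a data type issue but a wrong dimension. Thanks to this post.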

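Where does this pearsonr come from? It sounds like the arguments have a dtype it cannot work with, even for a simple operation like add. Try np.array(weight_change_lagged[not_na]) and report its dtype and shape.

It comes from SciPy stats. I will report back after trying your suggestion.

Perhaps you could clarify: do you have the same problem as the OP, but for a different reason? If so, this could be a valuable additional answer that helps others, even though there is already an accepted answer. Also, since internet links are not necessarily permanent, it would be very helpful if you could summarize what you learned from the link (you can leave the link in as well).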
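The dtype and shape check suggested in the comments can be sketched like this. `describe_input` is a hypothetical helper name, and the five values are simply the head of the DataFrame from the question.

```python
import numpy as np
import pandas as pd

def describe_input(x):
    """Return the dtype and shape NumPy sees when x is converted to an array."""
    arr = np.asarray(x)
    return arr.dtype, arr.shape

# The five rows shown in the question, used here as sample data.
frame = pd.DataFrame({"Weight": [1.9219, 1.8438, 1.8672, 1.7422, 1.8203]})
changes = frame.pct_change().dropna()

print(describe_input(changes))            # one-column DataFrame: shape (4, 1)
print(describe_input(changes["Weight"]))  # Series: shape (4,)
```

Here the dtype is float64 in both cases and only the shape differs, which matches the observation above that the problem was the dimension rather than the data type.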