Pandas Series.filter(...).values returns a different type than a numpy array


I am trying to run the scipy.stats.entropy function on two arrays. It is run on every row of a DataFrame via apply:

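import numpy as np
from scipy import stats

def calculate_H(row):
    # pk: histogram counts of the 'stuff' columns; qk: raw values of the 'other' columns
    pk = np.histogram(row.filter(regex='stuff'), bins=16)[0]
    qk = row.filter(regex='other').values
    return stats.entropy(pk, qk, base=2)

df['DKL'] = df.apply(calculate_H, axis=1)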

I get the following error:

TypeError: ufunc 'xlogy' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

(I also tried qk = row[row.filter(regex='other').index].values, with the same result.)

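I know the problem is with qk, because if I pass a different array as qk it works. The problem is that pandas gives me something that claims to be a numpy array but does not behave like one. All of the following examples work: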

qk1 = np.array([12024, 9643, 7681, 8193, 8012, 7846, 7615, 7484, 5966, 11484, 13627, 17749, 9820, 5336,4611, 3366])
qk2 = Series([12024, 9643, 7681, 8193, 8012, 7846, 7615, 7484, 5966, 11484, 13627, 17749, 9820, 5336,4611, 3366]).values
qk3 = df.filter(regex='other').iloc[0].values

If I check the types, e.g. type(qk) == type(qk1), I get True (they are all numpy.ndarray). np.array_equal also returns True.

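The only hint I have is what happens when I print the working array next to the non-working one (the non-working one is at the bottom):

Note that the values in the top one are spaced further apart.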

TL;DR: these two expressions return different results:

df.filter(regex='other').iloc[0].values
df.iloc[0].filter(regex='other').values
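To make the difference visible, here is a minimal sketch on a made-up frame (the column names label, other_1 and other_2 are assumptions, not the asker's data). It assumes the real DataFrame has at least one non-numeric column, which is what the follow-up comments suggest. The type and the values match, but the dtype does not:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the text column makes every full row a mixed-type Series.
df = pd.DataFrame({"label": ["a", "b"], "other_1": [1, 2], "other_2": [3, 4]})

cols_first = df.filter(regex="other").iloc[0].values  # filter columns, then take the row
row_first = df.iloc[0].filter(regex="other").values   # take the row, then filter its index

print(type(cols_first) is type(row_first))    # both are numpy.ndarray
print(np.array_equal(cols_first, row_first))  # the values compare equal
print(cols_first.dtype, row_first.dtype)      # the dtypes are what differ
```

Printing the dtype attribute, rather than the type, is the quickest way to tell the two arrays apart.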

I suspect qk is an object array rather than an integer array. In calculate_H, try the following:

qk = row.filter(regex='other').values.astype(int)

(i.e., convert the values to an integer array).
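A self-contained sketch of the whole function with that cast applied might look like this. The data and column names are invented for illustration, bins=2 is chosen only so that pk and qk have the same length in this toy example, and the cast to float on the 'stuff' values is an extra precaution that was not part of the original suggestion:

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical data: the text column forces object-dtype rows under apply(axis=1).
df = pd.DataFrame({
    "label": ["a", "b"],
    "stuff_1": [0.1, 0.9],
    "stuff_2": [0.4, 0.2],
    "stuff_3": [0.8, 0.5],
    "other_1": [3, 1],
    "other_2": [2, 4],
})

def calculate_H(row):
    # Cast both selections to numeric dtypes before handing them to NumPy/SciPy.
    stuff = row.filter(regex="stuff").values.astype(float)
    pk = np.histogram(stuff, bins=2)[0]
    qk = row.filter(regex="other").values.astype(int)
    return stats.entropy(pk, qk, base=2)

df["DKL"] = df.apply(calculate_H, axis=1)
print(df["DKL"])
```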

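Ah, yes! That solved the problem. Why does it change the type of the items in the array? It was an integer array before I filtered.

I don't know where or why the object array first appears; I'm not a heavy pandas user. You could add a few print statements to the code and check the dtype of each pandas object as it is created.

It turns out that if any column in the original DataFrame has object dtype, the Series that apply creates for each row has object dtype too. Thanks again.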
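That last observation can be checked with a small sketch (again on a made-up frame, so the column names are assumptions). It also shows one possible alternative to casting: restrict the frame to the numeric columns before calling apply, so each row keeps its integer dtype.

```python
import pandas as pd

# Hypothetical frame with one text column and two integer columns.
df = pd.DataFrame({"label": ["a", "b"], "other_1": [1, 2], "other_2": [3, 4]})

# Each row passed to the function is a Series; with a text column present it is object dtype.
mixed_dtypes = df.apply(lambda row: str(row.dtype), axis=1)

# Selecting only the numeric columns first keeps the rows numeric.
numeric_dtypes = df.filter(regex="other").apply(lambda row: str(row.dtype), axis=1)

print(mixed_dtypes.tolist())
print(numeric_dtypes.tolist())
```

This alternative only helps when the function needs columns of a single dtype; a function that reads both text and numeric columns from the same row would still need the explicit astype cast.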