Python: using rolling + apply with a function that requires two columns as arguments

python pandas function

I have a dataframe (df) with two columns:

Out[2]:
    0   1
0   1   2
1   4   5
2   3   6
3  10  12
4   1   2
5   4   5
6   3   6
7  10  12

For every element of df[0], I want to compute a function of that element and the df[1] column:

def custom_fct_2(x,y):
    res=stats.percentileofscore(y.values,x.iloc[-1])
    return res

I get the following error, a TypeError:

("'numpy.float64' object is not callable", u'occurred at index 0')

Here is the complete code:

from __future__ import division
import pandas as pd
import sys
from scipy import stats

def custom_fct_2(x,y):
    res=stats.percentileofscore(y.values,x.iloc[-1])
    return res

df= pd.DataFrame([[1,2],[4,5],[3,6],[10,12],[1,2],[4,5],[3,6],[10,12]])
df['perc']=df.rolling(3).apply(custom_fct_2(df[0],df[1]))

Can anyone help me? (I am new to Python.)

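The problem here is that rolling().apply() does not hand your function 3-row segments spanning all columns. Instead, it passes the windows of column 0 first, and then the windows of column 1.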
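To see this column-by-column behaviour directly, here is a minimal sketch that records every window the callback receives. The names `seen` and `record_window` are made up for this illustration. It also hints at where the TypeError in the question comes from: `custom_fct_2(df[0], df[1])` is evaluated before `apply()` runs, so `apply()` receives a float rather than a function.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [3, 6], [10, 12]])

seen = []  # every window that apply() hands to the callback


def record_window(window):
    # With raw=False each window arrives as a Series taken from a single column.
    seen.append(window.tolist())
    return window.iloc[-1]


# Pass the function object itself; do not call it here.
df.rolling(3).apply(record_window, raw=False)

for w in seen:
    print(w)
```

Each printed window holds three values from one column only; no window ever mixes column 0 with column 1.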

There may be a better solution, but I will show one of mine that at least works.

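import pandas as pd
from scipy import stats

df = pd.DataFrame([[1,2],[4,5],[3,6],[10,12],[1,2],[4,5],[3,6],[10,12]])

def custom_fct_2(s):
    # s is one rolling window of column 1, passed as a Series that keeps its index labels
    score = df[0][s.index.values[1]]  # middle label of the 3-row window; use .values[-1] for the last element
    a = s.values
    return stats.percentileofscore(a, score)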

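I am using the data you provided, but I modified your custom_fct_2() function. Here we receive s, which is a series of 3 rolling values from column 1. Fortunately, this series carries its index, so we can use the "middle" index of the series to fetch the score from column 0. By the way, in Python [-1] refers to the last element of a collection, but from your explanation I believe you actually want the middle element.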
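As a quick check of that indexing, the sketch below records the index labels of each window, so the middle label (`[1]`) and the last label (`[-1]`) can be compared. The names `col1`, `window_labels`, and `record_labels` are hypothetical, and `raw=False` is passed explicitly on the assumption that your pandas version supports that argument.

```python
import pandas as pd

col1 = pd.Series([2, 5, 6, 12, 2])
window_labels = []  # index labels of every window passed to the callback


def record_labels(s):
    window_labels.append(list(s.index))
    return 0.0  # apply() needs a numeric result; the value is unused here


col1.rolling(3).apply(record_labels, raw=False)

middle = [labels[1] for labels in window_labels]   # what s.index.values[1] picks
last = [labels[-1] for labels in window_labels]    # what s.index.values[-1] picks
print(middle, last)
```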

Then apply the function:

# drop the shift() call if you want each value aligned with the last row of its rolling window
# raw=False makes pandas pass each window as a Series, so s.index is available
df['prec'] = df[1].rolling(3).apply(custom_fct_2, raw=False).shift(periods=-1)

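The shift call is optional. Whether your prec should be aligned with column 0 (using the middle score) or with the rolling scores of column 1 depends on your requirements. I assumed you need it.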
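If the goal is instead the percentile of each df[0] value among the three most recent df[1] values (the "last element" reading), a plain loop sidesteps the one-column limitation of rolling().apply() altogether. This is only a sketch under that assumption; `rolling_percentile` is a hypothetical helper, not a pandas or SciPy function.

```python
import numpy as np
import pandas as pd
from scipy import stats

df = pd.DataFrame([[1, 2], [4, 5], [3, 6], [10, 12],
                   [1, 2], [4, 5], [3, 6], [10, 12]])


def rolling_percentile(scores, reference, window):
    """Percentile of scores[i] within the `window` most recent values of
    reference (rows i-window+1 .. i). Rows without a full window stay NaN."""
    scores = np.asarray(scores)
    reference = np.asarray(reference)
    out = np.full(len(scores), np.nan)
    for i in range(window - 1, len(scores)):
        chunk = reference[i - window + 1 : i + 1]
        out[i] = stats.percentileofscore(chunk, scores[i])
    return out


df['perc'] = rolling_percentile(df[0], df[1], window=3)
print(df)
```

The loop is slower than a vectorised approach on large frames, but it keeps both columns in view at once, which rolling().apply() cannot do.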

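Why use apply at all? Just calling the function should be fine: custom_fct_2(df[0], df[1])

Hi, because the next step is the "rolling" part: b = df.rolling(3).apply(custom_fct_2(df[0], df[1])). It is like a rolling percentile.

What is your expected output?

A vector of shape [8,1] (a Series or a list). For each x in column df[0], I want the percentile of x among the last 3 values in df[1].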
