Passing columns to a function with multiple conditions - Python

python pandas numpy

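I created a function that assigns customers to a "bucket" based on their annual purchase history. The function works as expected when I pass single values for (curryear, lastyear). How can I pass all the values from two separate columns as curryear and lastyear?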

When I try the following, I get:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

My code:

#FUNCTION FOR CATEGORIZING ANNUAL CUSTOMER PURCHASE BEHAVIOR
def bucket(curryear, lastyear):
    if ((lastyear > 0) & (curryear <= 0)):
        return 'Attrition'
    elif ((lastyear > curryear) & (curryear > 0)):
        return 'Organic Attrition'
    elif ((lastyear <= 0) & (curryear > 0)):
        return 'New Sales'
    elif ((curryear > lastyear) & (lastyear > 0)):
        return 'Organic Growth'
    elif ((lastyear == 0) & (curryear == 0)):
        return 'None'
    else:
        return 'Flat'

bucket(df['2019'],df['2018'])

Here is a sample of the data I am working with:

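The error message tells you exactly what went wrong: testing an entire column against something like > 0 is ambiguous, because you might mean that every value in the column is greater than 0, or that at least one value is. You can apply the function you wrote row by row, so that it only ever sees individual values, like this: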

def bucket(curryear, lastyear):
    if ((lastyear > 0) & (curryear <= 0)):
        return 'Attrition'
    elif ((lastyear > curryear) & (curryear > 0)):
        return 'Organic Attrition'
    elif ((lastyear <= 0) & (curryear > 0)):
        return 'New Sales'
    elif ((curryear > lastyear) & (lastyear > 0)):
        return 'Organic Growth'
    elif ((lastyear == 0) & (curryear == 0)):
        return 'None'
    else:
        return 'Flat'

df["bucket"] = df.apply(lambda x: bucket(x["2019"], x["2018"]), axis=1)
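To make the ambiguity concrete, here is a minimal sketch on a small made-up DataFrame. The values are hypothetical; only the column names come from the question. It shows that comparing a column produces a boolean Series, which an if statement cannot reduce to a single True or False.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({"2018": [100, 0, 50], "2019": [0, 25, 75]})

# Comparing a column is element-wise, so the result is a boolean Series.
mask = df["2018"] > 0
print(mask.tolist())  # [True, False, True]

# `if` needs exactly one truth value, but the Series holds three of them.
try:
    if mask:
        pass
except ValueError as err:
    print("ValueError:", err)

# Reducing the Series explicitly removes the ambiguity.
print(mask.any())  # True: at least one value is > 0
print(mask.all())  # False: not every value is > 0
```

Neither any() nor all() is what the question needs, though, since the goal is one label per row. That is why the function has to be applied per row, or rewritten to work element-wise.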

Basically, pandas has a method called apply, which accepts lambda functions:

df['bucket'] = df.apply(lambda x: bucket(x['2019'], x['2018']), axis=1)

Rewrite the function so that it works on entire columns at once (vectorized):

import numpy as np

def bucket(curryear, lastyear):
    ly_pos, cy_pos = lastyear > 0, curryear > 0
    out = np.select( (ly_pos & (~cy_pos), (lastyear > curryear) & cy_pos,
                      (~ly_pos) & cy_pos, (curryear>lastyear)&ly_pos,
                      (lastyear==0) & (curryear==0)
                     ),
                     ('Attrition', 'Organic Attrition',
                      'New Sales', 'Organic Growth', 'None'),
                    'Flat'
                   )
    return out

bucket(df['2019'], df['2018'])
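As a quick check of the vectorized approach, the sketch below runs an np.select version of bucket on a small made-up DataFrame with one row per category. The data values are hypothetical, and the conditions are written as a list for readability. np.select picks the first matching condition for each row, so the order of the conditions matters just as it does in the if/elif chain.

```python
import numpy as np
import pandas as pd


def bucket(curryear, lastyear):
    """Label each row by comparing two purchase columns element-wise."""
    ly_pos, cy_pos = lastyear > 0, curryear > 0
    conditions = [
        ly_pos & ~cy_pos,                   # bought last year, not this year
        (lastyear > curryear) & cy_pos,     # still buying, but less
        ~ly_pos & cy_pos,                   # nothing last year, buying now
        (curryear > lastyear) & ly_pos,     # buying more than last year
        (lastyear == 0) & (curryear == 0),  # no purchases in either year
    ]
    choices = ["Attrition", "Organic Attrition", "New Sales",
               "Organic Growth", "None"]
    # The first condition that is True wins; rows matching none get 'Flat'.
    return np.select(conditions, choices, default="Flat")


# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({
    "2018": [100, 100, 0, 50, 0, 80],
    "2019": [0, 60, 25, 75, 0, 80],
})
df["bucket"] = bucket(df["2019"], df["2018"])
print(df)
```

The function returns a NumPy array with one label per row, which can be assigned directly to a new column.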

@马贝格克斯 This worked, thank you very much. This is how I ran it, though I realize it is probably inefficient:

df['2013 Bucket'] = df.apply(lambda x: bucket(x["2013"], x["2012"]), axis=1)
df['2014 Bucket'] = df.apply(lambda x: bucket(x["2014"], x["2013"]), axis=1)
df['2015 Bucket'] = df.apply(lambda x: bucket(x["2015"], x["2014"]), axis=1)
df['2016 Bucket'] = df.apply(lambda x: bucket(x["2016"], x["2015"]), axis=1)
df['2017 Bucket'] = df.apply(lambda x: bucket(x["2017"], x["2016"]), axis=1)
df['2018 Bucket'] = df.apply(lambda x: bucket(x["2018"], x["2017"]), axis=1)
df['2019 Bucket'] = df.apply(lambda x: bucket(x["2019"], x["2018"]), axis=1)
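One way to avoid repeating the line for every year is a loop over the years, combined with a vectorized bucket like the np.select version in the other answer. This is only a sketch: the DataFrame below is made up and covers three years, and the assumption is that the real columns are named with year strings as in the code above, in which case range(2013, 2020) would cover 2013 through 2019.

```python
import numpy as np
import pandas as pd


def bucket(curryear, lastyear):
    """Vectorized labeling with np.select (first matching condition wins)."""
    ly_pos, cy_pos = lastyear > 0, curryear > 0
    conditions = [
        ly_pos & ~cy_pos,
        (lastyear > curryear) & cy_pos,
        ~ly_pos & cy_pos,
        (curryear > lastyear) & ly_pos,
        (lastyear == 0) & (curryear == 0),
    ]
    choices = ["Attrition", "Organic Attrition", "New Sales",
               "Organic Growth", "None"]
    return np.select(conditions, choices, default="Flat")


# Hypothetical data; the real frame would have columns '2012' to '2019'.
df = pd.DataFrame({
    "2017": [10, 0, 40],
    "2018": [0, 20, 40],
    "2019": [5, 30, 10],
})

# Compare each year with the one before it and store the labels.
for year in range(2018, 2020):
    df[f"{year} Bucket"] = bucket(df[str(year)], df[str(year - 1)])

print(df)
```

Because each call works on whole columns, this runs one vectorized pass per year instead of calling the Python function once per row.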