Python: adding calculations to a lambda or function


python python-3.x pandas


I have a df like this:

    ID |    Status   | Color
   555    Cancelled     Green
   434    Processed     Red   
   212    Cancelled     Blue
   121    Cancelled     Green
   242    Cancelled     Blue
   352    Processed     Green
   343    Processed     Blue

The code I'm using is:

cc = df.groupby(by='Color').ID.count()
df.groupby(by=['Color', 'Status']).apply(lambda x: len(x)/cc.loc[x.Color.iloc[0]])

This gives me the following output:

Color     Status   
Blue   Cancelled    0.666667
       Processed    0.333333
Green  Cancelled    0.666667
       Processed    0.333333
Red    Processed    1.000000
dtype: float64

This gives me, for each color, the fraction of its rows that fall under each status.
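The same per-color fractions can also be obtained without `apply`. The following is only a sketch of one alternative, rebuilding the sample frame from the table above and using `value_counts(normalize=True)`; it is not the code from the question.

```python
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame({
    "ID": [555, 434, 212, 121, 242, 352, 343],
    "Status": ["Cancelled", "Processed", "Cancelled", "Cancelled",
               "Cancelled", "Processed", "Processed"],
    "Color": ["Green", "Red", "Blue", "Green", "Blue", "Green", "Blue"],
})

# For each Color, the share of its rows that carry each Status.
fractions = df.groupby("Color")["Status"].value_counts(normalize=True)
print(fractions)
```

The result is indexed by (Color, Status), like the output shown above, although the rows within each color are ordered by frequency rather than alphabetically.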

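There is also a field named dollar_value that holds a dollar amount for each row. I would like to add two fields to the output: 1. the total dollar amount for that color and status, and 2. the dollar amount per order for that color and status (meaning that if the total is $1000 and there are 200 rows for that color and status, it would be 1000/200, or 5). Can I easily add these two calculations to the output I already have, or do I need to create a function?

Expected output: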

Color  Status                  Total | Dollar_Per_Order
Blue   Cancelled    0.666667    1000   20
       Processed    0.333333     200    5
Green  Cancelled    0.666667    2000   20
       Processed    0.333333    1000    5
Red    Processed    1.000000     300   10
dtype: float64

Thanks!

To compute all 3 columns, define the function applied to each group as:

def fn(grp):
    total = grp.dollar_value.sum()
    rowNo = len(grp.index)
    return pd.Series([rowNo/cc[grp.name[0]], total, total/rowNo],
        index=['Percentage', 'Total_Dollars', 'Dollar_per_order'])

Then apply it:

df.groupby(by=['Color', 'Status']).apply(fn)
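Note that I used len(grp.index) instead of len(grp). The reason is that it runs a bit faster.

I also read the current group's color differently than you did: from grp.name[0] rather than from the first row of the Color column.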
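For reference, here is an end-to-end sketch of the approach above on the question's sample rows. The dollar_value amounts are invented for illustration, since the question does not list them. The groupby is also restricted to the dollar_value column, an addition of this sketch meant to keep newer pandas versions from warning about grouping columns being passed to `apply`.

```python
import pandas as pd

# The question's sample rows; the dollar_value amounts are made up.
df = pd.DataFrame({
    "ID": [555, 434, 212, 121, 242, 352, 343],
    "Status": ["Cancelled", "Processed", "Cancelled", "Cancelled",
               "Cancelled", "Processed", "Processed"],
    "Color": ["Green", "Red", "Blue", "Green", "Blue", "Green", "Blue"],
    "dollar_value": [100, 300, 250, 300, 150, 500, 200],
})

# Number of rows per color, used as the denominator of the percentage.
cc = df.groupby(by="Color").ID.count()

def fn(grp):
    total = grp.dollar_value.sum()   # total dollars in this (Color, Status) group
    rowNo = len(grp.index)           # number of orders in this group
    # grp.name is the (Color, Status) key of the current group.
    return pd.Series(
        [rowNo / cc[grp.name[0]], total, total / rowNo],
        index=["Percentage", "Total_Dollars", "Dollar_per_order"],
    )

# Selecting only dollar_value is an assumption of this sketch, not part of the answer.
result = df.groupby(by=["Color", "Status"])[["dollar_value"]].apply(fn)
print(result)
```

The result is a DataFrame indexed by (Color, Status) with one column per computed value.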
Where does the dollar value come from??? total_dolar = groupby().sum(), dolar_per_order = groupby().mean()?

There is another column in the df that holds just the dollar value for each row.
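The sum/mean idea from the comment above could be sketched with named aggregation instead of a custom function. This is one possible reading of that suggestion, not something the thread confirms; the dollar_value amounts are again invented, and the output column names are borrowed from the answer.

```python
import pandas as pd

# The question's sample rows; the dollar_value amounts are made up.
df = pd.DataFrame({
    "ID": [555, 434, 212, 121, 242, 352, 343],
    "Status": ["Cancelled", "Processed", "Cancelled", "Cancelled",
               "Cancelled", "Processed", "Processed"],
    "Color": ["Green", "Red", "Blue", "Green", "Blue", "Green", "Blue"],
    "dollar_value": [100, 300, 250, 300, 150, 500, 200],
})

grouped = df.groupby(["Color", "Status"])

# Total and per-order dollars: the sum and the mean of dollar_value per group.
summary = grouped.agg(
    Total_Dollars=("dollar_value", "sum"),
    Dollar_per_order=("dollar_value", "mean"),
)

# Share of each status within its color: group size over the color's row count.
counts = grouped.size()
summary["Percentage"] = counts / counts.groupby(level="Color").transform("sum")

summary = summary[["Percentage", "Total_Dollars", "Dollar_per_order"]]
print(summary)
```

The mean of dollar_value within a group equals the group total divided by its row count, so this matches the "dollar amount per order" definition in the question.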