Python pandas: how to modify a column based on another column (with a complex condition)?

python pandas

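I have booking data in which a new row is inserted whenever a customer initiates, changes, deletes, or reactivates an order. The delivered column shows whether the product was actually delivered, which is normally the case if the order was not deleted in its last update.

Here is some sample code: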

import pandas as pd

df = pd.DataFrame(
    {
    "booking id": [1,1,1,2,2,2,3,3,4,4,4],
    "booking type": ["initiation", "change", "change", "initiation", "change", "deletion", "reactivation", "change", "initiation", "change", "deletion"],
    "delivered": ["yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "no", "no", "no"]
    }
)

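Some of the data is incorrect. If the booking type in the last row (the most recent update) of a booking id is deletion, then all rows for that booking id should have delivered = no.

In this example, I am looking for the following result: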

df = pd.DataFrame(
    {
    "booking id": [1,1,1,2,2,2,3,3,4,4,4],
    "booking type": ["initiation", "change", "change", "initiation", "change", "deletion", "reactivation", "change", "initiation", "change", "deletion"],
    "delivered": ["yes", "yes", "yes", "no", "no", "no", "yes", "yes", "no", "no", "no"]
    }
)

How can I do this? Many thanks.

Use transform with 'last', then assign the result back:

df.loc[df.groupby('booking id')['booking type'].transform('last').eq('deletion'),'delivered']='no'
df
Out[112]: 
    booking id  booking type delivered
0            1    initiation       yes
1            1        change       yes
2            1        change       yes
3            2    initiation        no
4            2        change        no
5            2      deletion        no
6            3  reactivation       yes
7            3        change       yes
8            4    initiation        no
9            4        change        no
10           4      deletion        no
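As a rough sketch of why this works (the names `last_type` and `mask` are illustrative, not from the answer): `transform('last')` returns a Series aligned with the original rows, where every row holds the last booking type of its booking id group. Comparing that Series to 'deletion' gives a row-wise boolean mask that `.loc` can use for the assignment. This assumes the rows are already ordered so that the most recent update comes last within each booking.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "booking id": [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4],
        "booking type": ["initiation", "change", "change", "initiation", "change", "deletion",
                         "reactivation", "change", "initiation", "change", "deletion"],
        "delivered": ["yes"] * 8 + ["no"] * 3,
    }
)

# Broadcast each group's last booking type to every row of that group.
last_type = df.groupby("booking id")["booking type"].transform("last")

# True for every row whose booking ended with a deletion.
mask = last_type.eq("deletion")

# Overwrite 'delivered' only on the masked rows.
df.loc[mask, "delivered"] = "no"
```

Splitting the one-liner into these steps makes it easy to inspect `last_type` and `mask` on their own before anything is overwritten.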

Here is another approach:

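There may be a better way to do this with groupby, but I don't know it. The best approach I can think of uses .loc, which is described in the pandas documentation.

Basically, loc returns a slice of the dataframe that matches certain conditions.

First, use loc to get the IDs of all bookings that have any row with booking type deletion. Second, loop over those IDs and set delivered to "no" for every row with that ID.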
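To illustrate the first step, here is a minimal sketch of `.loc` with a boolean row mask and a single column label. The small frame and the name `deleted_ids` are made up for the example.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "booking id": [1, 1, 2, 2],
        "booking type": ["initiation", "change", "initiation", "deletion"],
        "delivered": ["yes", "yes", "yes", "yes"],
    }
)

# Row selector: a boolean mask; column selector: a single label.
# The result is a Series of booking ids that keeps the original row index.
deleted_ids = df.loc[df["booking type"] == "deletion", "booking id"]
```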

Hi @Julian, don't forget you can upvote and accept answers, thanks!

Just did, sorry for the late reply!

This and yatu's solution both worked; this one was just faster for me. Thanks!

It would be great if you could explain why this works!

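# Collect the ids of all bookings that contain at least one deletion row
ids_to_change = df.loc[df['booking type'] == 'deletion', 'booking id'].unique()

# Set delivered to 'no' for every row of each affected booking
for booking_id in ids_to_change:
    df.loc[df['booking id'] == booking_id, 'delivered'] = 'no'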
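The loop can probably be avoided with `isin`, and the rule from the question (only the last row of a booking counts) can be applied by first keeping just the final row per booking. The following is a sketch and not part of the original answer: `last_rows` and `deleted_ids` are illustrative names, and it assumes rows are ordered oldest to newest within each booking. Booking 5 is made-up data, deleted and then reactivated, so it should keep delivered = yes.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "booking id": [2, 2, 2, 5, 5, 5],
        "booking type": ["initiation", "change", "deletion",
                         "initiation", "deletion", "reactivation"],
        "delivered": ["yes"] * 6,
    }
)

# Keep only the final row of each booking.
last_rows = df.drop_duplicates("booking id", keep="last")

# Ids whose final update is a deletion.
deleted_ids = last_rows.loc[last_rows["booking type"] == "deletion", "booking id"]

# Update all rows of those bookings in one vectorized assignment.
df.loc[df["booking id"].isin(deleted_ids), "delivered"] = "no"
```

Selecting on any deletion row, as the loop does, would also flip booking 5 to "no"; restricting the check to the last row avoids that.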