Python: adding conditions on cells with duplicate values

python pandas

This is a follow-up to my previous question.

I have a DataFrame:

import pandas as pd

df = pd.DataFrame({'First': ['Sam', 'Greg', 'Steve', 'Sam',
                             'Jill', 'Bill', 'Nod', 'Mallory', 'Ping', 'Lamar'],
                   'Last': ['Stevens', 'Hamcunning', 'Strange', 'Stevens',
                            'Vargas', 'Simon', 'Purple', 'Green', 'Simon', 'Simon'],
                   'Address': ['112 Fake St',
                               '13 Crest St',
                               '14 Main St',
                               '112 Fake St',
                               '2 Morningwood',
                               '7 Cotton Dr',
                               '14 Main St',
                               '20 Main St',
                               '7 Cotton Dr',
                               '7 Cotton Dr'],
                   'Status': ['Infected', '', 'Infected', '', '', '', '','', '', 'Infected'],
                   'Level': [10, 2, 7, 5, 2, 10, 10, 20, 1, 1],
                   })

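Suppose this time I want to propagate the Status value 'Infected' to everyone at the same address, with an added condition, for example that they also share the same value in Last. The result would look like this:

df2 = df.copy(deep=True)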
df2['Status'] = ['Infected', '', 'Infected', 'Infected', '', 'Infected', '', '', 'Infected', 'Infected']

What if I want a person to be marked as infected when they are at the same address but not at the same Level? The result would be:

df3 = df.copy(deep=True)
df3['Status'] = ['Infected', '', 'Infected', '', '', 'Infected', '', '', '', 'Infected']

How can I do this? Is this a groupby problem?

Same address means "groupby".

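@ccsv Could you be clearer about what df3 is supposed to represent? My code marks infection according to two rules: 1. the person is already infected; 2. people at the same address as an infected person, but at a different level, become infected.

When I run the code, the df3['Status'] column is entirely empty.

@ccsv That's odd. I tested it with IPython's %paste and the result was not all empty.

Found the mistake: I had declared df3 elsewhere but never filled it in. Sorry.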

import pandas as pd


df=pd.DataFrame({'First': [ 'Sam', 'Greg', 'Steve', 'Sam',
                 'Jill', 'Bill', 'Nod', 'Mallory', 'Ping', 'Lamar'],
                 'Last': [ 'Stevens', 'Hamcunning', 'Strange', 'Stevens',
                 'Vargas', 'Simon', 'Purple', 'Green', 'Simon', 'Simon'],
                 'Address': ['112 Fake St','13 Crest St','14 Main St','112 Fake St','2 Morningwood','7 Cotton Dr','14 Main St','20 Main St','7 Cotton Dr','7 Cotton Dr'],
                 'Status': ['Infected','','Infected','','','','','','','Infected'],
                 'Level': [10,2,7,5,2,10,10,20,1,1],
                 })

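# df2: keep the (Address, Last) groups that contain at least one infected row,
# then mark every row of those groups as infected.
df2_index = df.groupby(['Address', 'Last']).filter(lambda x: (x['Status'] == 'Infected').any()).index
df2 = df.copy()
df2.loc[df2_index, 'Status'] = 'Infected'


def mark_infected(group):
    # A row is infected if it already is, or if any infected row at the same
    # address has a different Level.
    infected = group['Status'] == 'Infected'
    statuses = ['Infected' if (row['Status'] == 'Infected')
                or (infected & (group['Level'] != row['Level'])).any()
                else ''
                for _, row in group.iterrows()]
    return pd.Series(statuses, index=group.index)


# df3: apply the rule per address; the result keeps the original row index,
# so it aligns with df3 on assignment.
df3_status = df.groupby('Address', as_index=False, group_keys=False).apply(mark_infected)
df3 = df.copy()
df3['Status'] = df3_status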
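
For comparison, here is a minimal sketch of the same two rules written without `filter` or a row-wise `apply`. It is not part of the original answer. It assumes the df2 rule is "anyone sharing Address and Last with an infected person" and the df3 rule is the one stated in the comments ("already infected, or same address as an infected person but a different Level"). The names `is_infected`, `same_household`, `sources`, `pairs`, `exposed`, `df2_alt` and `df3_alt` are made up for illustration.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'First': ['Sam', 'Greg', 'Steve', 'Sam', 'Jill',
              'Bill', 'Nod', 'Mallory', 'Ping', 'Lamar'],
    'Last': ['Stevens', 'Hamcunning', 'Strange', 'Stevens', 'Vargas',
             'Simon', 'Purple', 'Green', 'Simon', 'Simon'],
    'Address': ['112 Fake St', '13 Crest St', '14 Main St', '112 Fake St',
                '2 Morningwood', '7 Cotton Dr', '14 Main St', '20 Main St',
                '7 Cotton Dr', '7 Cotton Dr'],
    'Status': ['Infected', '', 'Infected', '', '', '', '', '', '', 'Infected'],
    'Level': [10, 2, 7, 5, 2, 10, 10, 20, 1, 1],
})

is_infected = df['Status'] == 'Infected'

# df2 rule (assumed): infected if anyone with the same Address and Last is infected.
# transform('any') broadcasts the per-group result back onto every row.
same_household = is_infected.groupby([df['Address'], df['Last']]).transform('any')
df2_alt = df.assign(Status=same_household.map({True: 'Infected', False: ''}))

# df3 rule (as stated in the comments): already infected, or some infected
# person at the same Address has a different Level.
sources = df.loc[is_infected, ['Address', 'Level']].drop_duplicates()
# Pair every row with each infected (Address, Level) at its own address.
pairs = df.reset_index().merge(sources, on='Address', suffixes=('', '_src'))
exposed_idx = pairs.loc[pairs['Level'] != pairs['Level_src'], 'index'].unique()
exposed = df.index.isin(exposed_idx)
df3_alt = df.assign(Status=(is_infected | exposed).map({True: 'Infected', False: ''}))

print(df2_alt['Status'].tolist())
print(df3_alt['Status'].tolist())
```

On the sample data, `df2_alt` reproduces the df2 list given in the question. Under the two rules as worded in the comments, `df3_alt` also marks rows 3 and 6, which the df3 list in the question leaves empty, so the question may intend an additional condition that the thread does not spell out.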