Python: How to use .loc with multiple conditions and groupby
I have a df grouped by 'Key'. I want to flag any rows within a group where the discharge date matches another row's discharge date and, among those matching rows, at least one has a Num1 value in the range 5-12. I found similar questions, but I'm not sure how to handle the added complexity of multiple conditions.
import pandas as pd

df = pd.DataFrame({'Key': ['10003', '10003', '10003', '10003', '10003','10003','10034', '10034'],
'Num1': [12,13,13,13,12,13,16,13],
'Num2': [121,122,122,124,125,126,127,128],
'admit': [20120506, 20120508, 20121010,20121010,20121010,20121110,20120516,20120520],
'discharge': [20120508, 20120508, 20121012,20121016,20121023,20121111,20120520,20120520]})
df['admit'] = pd.to_datetime(df['admit'], format='%Y%m%d')
df['discharge'] = pd.to_datetime(df['discharge'], format='%Y%m%d')
Initial df:
Key Num1 Num2 admit discharge
0 10003 12 121 2012-05-06 2012-05-08
1 10003 13 122 2012-05-08 2012-05-08
2 10003 13 122 2012-10-10 2012-10-12
3 10003 13 124 2012-10-10 2012-10-16
4 10003 12 125 2012-10-10 2012-10-23
5 10003 13 126 2012-11-10 2012-11-11
6 10034 16 127 2012-05-16 2012-05-20
7 10034 13 128 2012-05-20 2012-05-20
Final df:
Key Num1 Num2 admit discharge flag
0 10003 12 121 2012-05-06 2012-05-08 1
1 10003 13 122 2012-05-08 2012-05-08 1
2 10003 13 122 2012-10-10 2012-10-12 0
3 10003 13 124 2012-10-10 2012-10-16 0
4 10003 12 125 2012-10-10 2012-10-23 0
5 10003 13 126 2012-11-10 2012-11-11 0
6 10034 16 127 2012-05-16 2012-05-20 0
7 10034 13 128 2012-05-20 2012-05-20 0
num1_range = [5,6,7,8,9,10,11,12]
df.loc[df.groupby('Key').apply(lambda x : x.duplicated(subset='discharge',keep=False)).values,'flag']=1
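Note that this line tests only the duplicated-discharge condition, so it also flags groups in which no row has Num1 in the 5-12 range. A minimal sketch on the sample data, assuming `DataFrame.duplicated` over both Key and discharge is an acceptable stand-in for the per-group `apply` above (the names `dup_only` and `flagged_rows` are illustrative, not from the original post):

```python
import pandas as pd

# Only the columns needed for this check, copied from the sample data.
df = pd.DataFrame({
    'Key': ['10003', '10003', '10003', '10003', '10003', '10003', '10034', '10034'],
    'Num1': [12, 13, 13, 13, 12, 13, 16, 13],
    'discharge': [20120508, 20120508, 20121012, 20121016,
                  20121023, 20121111, 20120520, 20120520],
})
df['discharge'] = pd.to_datetime(df['discharge'], format='%Y%m%d')

# True for every row whose (Key, discharge) pair occurs more than once.
dup_only = df.duplicated(subset=['Key', 'discharge'], keep=False)

flagged_rows = df.index[dup_only].tolist()
print(flagged_rows)  # [0, 1, 6, 7]
```

Rows 6 and 7 share a discharge date but have Num1 values of 16 and 13, so they should not be flagged; the Num1 condition still has to be applied per group.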
You can achieve this by using filter:
df.loc[df.groupby(['Key','discharge']).Num1.filter(lambda x : (x.isin(num1_range).any())&(len(x)>1)).index,'flag']=1
df
Out[317]:
Key Num1 Num2 admit discharge flag
0 10003 12 121 2012-05-06 2012-05-08 1.0
1 10003 13 122 2012-05-08 2012-05-08 1.0
2 10003 13 122 2012-10-10 2012-10-12 NaN
3 10003 13 124 2012-10-10 2012-10-16 NaN
4 10003 12 125 2012-10-10 2012-10-23 NaN
5 10003 13 126 2012-11-10 2012-11-11 NaN
6 10034 16 127 2012-05-16 2012-05-20 NaN
7 10034 13 128 2012-05-20 2012-05-20 NaN
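The output above leaves NaN (and a float column) where the desired result has 0. One possible way to get an integer 0/1 flag directly is to build a boolean mask with `transform` instead of `filter`. This is a sketch under the assumption that "in the range 5-12" is inclusive on both ends; the names `in_range`, `has_match`, and `any_in_range` are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'Key': ['10003', '10003', '10003', '10003', '10003', '10003', '10034', '10034'],
    'Num1': [12, 13, 13, 13, 12, 13, 16, 13],
    'Num2': [121, 122, 122, 124, 125, 126, 127, 128],
    'admit': [20120506, 20120508, 20121010, 20121010,
              20121010, 20121110, 20120516, 20120520],
    'discharge': [20120508, 20120508, 20121012, 20121016,
                  20121023, 20121111, 20120520, 20120520],
})
df['admit'] = pd.to_datetime(df['admit'], format='%Y%m%d')
df['discharge'] = pd.to_datetime(df['discharge'], format='%Y%m%d')

# Row-level condition: Num1 lies in the inclusive range 5..12.
in_range = df['Num1'].between(5, 12)

# Condition 1: the (Key, discharge) pair appears on more than one row.
has_match = df.duplicated(subset=['Key', 'discharge'], keep=False)

# Condition 2: at least one row sharing that (Key, discharge) pair is in range.
any_in_range = in_range.groupby([df['Key'], df['discharge']]).transform('any')

# Combine both conditions and store the result as integers (1 or 0).
df['flag'] = (has_match & any_in_range).astype(int)
print(df['flag'].tolist())  # [1, 1, 0, 0, 0, 0, 0, 0]
```

Because the mask is aligned with the original index, no `.loc` assignment or later `fillna(0)` step is needed.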