Python Pandas groupby and compare values in different columns

I have a dataset:

In:
import pandas as pd

df = pd.DataFrame({'id': [23, 23, 23, 43, 43],
                   'data_1': ['20170503', '20170503', '20170503', '20170602', '20170602'],
                   'units': [10, 10, 10, 5, 5],
                   'data_2': ['20170104', '20170503', '20170503', '20170605', '20170602'],
                   'code': ["s", "r", "s", "s", "r"],
                   'units_2': [20, 10, 10, 8, 5]})

print(df)
Out:

   id    data_1  units    data_2 code  units_2
0  23  20170503     10  20170104    s       20
1  23  20170503     10  20170503    r       10
2  23  20170503     10  20170503    s       10
3  43  20170602      5  20170605    s        8
4  43  20170602      5  20170602    r        5

I need to group by 'id' and check whether data_2 contains a date that matches data_1 on a row where code is 's'. An extra column could be added to flag those matches, so the final output would look like this:

   id    data_1  units    data_2 code  units_2  new_column
0  23  20170503     10  20170104    s       20           0
1  23  20170503     10  20170503    r       10           0
2  23  20170503     10  20170503    s       10           1
3  43  20170602      5  20170605    s        8           0
4  43  20170602      5  20170602    r        5           0

Thanks for any help.

You don't need groupby here, since the values are not being changed or counted per group.

Use:

import numpy as np  # only needed for the np.where variant below

df['new_column'] = (df.data_1.eq(df.data_2) & df.code.eq('s')).astype(int)
# or: df['new_column'] = (df.data_1.eq(df.data_2) & df.code.eq('s')).map({True: 1, False: 0})
# or: df['new_column'] = np.where(df.data_1.eq(df.data_2) & df.code.eq('s'), 1, 0)
print(df)

   id    data_1  units    data_2 code  units_2  new_column
0  23  20170503     10  20170104    s       20           0
1  23  20170503     10  20170503    r       10           0
2  23  20170503     10  20170503    s       10           1
3  43  20170602      5  20170605    s        8           0
4  43  20170602      5  20170602    r        5           0

I thought groupby would be needed to compare the data within the same id. anky_91, thank you very much. It works and is really helpful! I was just trying to think a few steps ahead, but everything that step needs can in fact be handled with your solution :)
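If the comparison really did have to respect the id groups (for example, flagging every row of an id that contains at least one row where data_1 equals data_2 with code 's'), a groupby-based variant could look like the sketch below. This is not part of the original answer, and the column name group_has_match is made up for illustration.

import pandas as pd

df = pd.DataFrame({'id': [23, 23, 23, 43, 43],
                   'data_1': ['20170503', '20170503', '20170503', '20170602', '20170602'],
                   'units': [10, 10, 10, 5, 5],
                   'data_2': ['20170104', '20170503', '20170503', '20170605', '20170602'],
                   'code': ["s", "r", "s", "s", "r"],
                   'units_2': [20, 10, 10, 8, 5]})

# Row-wise match first, then broadcast the group-level result back to every row of the same id.
match = df.data_1.eq(df.data_2) & df.code.eq('s')
df['group_has_match'] = match.groupby(df['id']).transform('max').astype(int)
print(df)

With this data, id 23 would get group_has_match = 1 on all three of its rows (because its third row matches), while id 43 stays 0.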