Python: how to join rows when some columns match and other columns differ
I have a DataFrame that looks like this:
time event nflId_WR position_WR team_WR x,y_WR
0 2018-09-07T01:07:18.099Z pass_forward 2495454.0 WR away (80.69, 44.91)
1 2018-09-07T01:07:18.099Z pass_forward 2533040.0 WR away (82.65, 34.56)
2 2018-09-07T01:07:18.099Z pass_forward 2552689.0 CB home (79.51, 20.0)
3 2018-09-07T01:07:18.099Z pass_forward 2555383.0 CB home (76.53, 44.93)
4 2018-09-07T01:07:19.200Z pass_arrived 2495454.0 WR away (81.11, 47.87)
I am trying to bring rows whose "position" differs onto the same row whenever their "time" and "event" values match, something like this:
time event_WR nflId_WR position_WR team_WR x,y_WR nflId_CB position_CB team_CB x,y_CB
0 2018-09-07T01:07:18.099Z pass_forward 2495454.0 WR away (80.69, 44.91) 2552689.0 CB home (79.51, 20.0)
1 2018-09-07T01:07:18.099Z pass_forward 2533040.0 WR away (82.65, 34.56) 2495454.0 WR away (81.11, 47.87)
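(Sorry the column headers don't line up. I'm not sure how to format this properly here.)
Any suggestions on how to do this?
Also, if you know a way to filter before combining the rows, so that each CB and WR are paired with whichever one has the closest "y" value, that would be great.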
You can try the following:
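# Self-merge on the two key columns; overlapping columns from the right-hand copy get the "_other" suffix
merged = df.merge(df, on=['time', 'event'], suffixes=('', '_other'))
# Keep only the pairs whose positions differ
merged = merged[merged['position'] != merged['position_other']]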
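This first merges on the two columns, then drops the rows where the positions are the same.

I had been stuck on this for so long, thank you! Do you know a way to group by each time, compare the y values of players whose positions differ, and then match the players that are closest in y?

Have a look? If that doesn't answer it, could you ask a new question?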
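For the follow-up about pairing by nearest y, one possible approach is sketched below. It is not from the original answer. It assumes the columns carry no suffix yet (`nflId`, `position`, `team`) and that the `x,y` tuple has been split into separate numeric `x` and `y` columns; both are assumptions made for illustration. The sample rows are the ones shown in the question.

```python
import pandas as pd

# Sample rows from the question. The un-suffixed column names and the
# separate x / y columns are assumptions made for this sketch.
df = pd.DataFrame({
    "time": ["2018-09-07T01:07:18.099Z"] * 4 + ["2018-09-07T01:07:19.200Z"],
    "event": ["pass_forward"] * 4 + ["pass_arrived"],
    "nflId": [2495454.0, 2533040.0, 2552689.0, 2555383.0, 2495454.0],
    "position": ["WR", "WR", "CB", "CB", "WR"],
    "team": ["away", "away", "home", "home", "away"],
    "x": [80.69, 82.65, 79.51, 76.53, 81.11],
    "y": [44.91, 34.56, 20.0, 44.93, 47.87],
})

wr = df[df["position"] == "WR"]
cb = df[df["position"] == "CB"]

# Every WR/CB combination that shares the same time and event.
# Rows with no counterpart (here the pass_arrived WR) drop out of the inner merge.
pairs = wr.merge(cb, on=["time", "event"], suffixes=("_WR", "_CB"))

# Absolute gap in y between the two players of each candidate pair.
pairs["y_gap"] = (pairs["y_WR"] - pairs["y_CB"]).abs()

# For each WR at each time/event, keep the CB with the smallest gap.
nearest_idx = pairs.groupby(["time", "event", "nflId_WR"])["y_gap"].idxmin()
closest = pairs.loc[nearest_idx].reset_index(drop=True)

print(closest[["time", "event", "nflId_WR", "y_WR", "nflId_CB", "y_CB", "y_gap"]])
```

Note that this picks the nearest CB for each WR independently, so the same CB can be matched to more than one WR (as happens with the sample rows). If a strict one-to-one pairing is needed, an assignment step would have to replace the `idxmin` selection.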