Python outer join: adding values missing on the right side as zero or NaN
python, pandas, merge, outer-join
I am trying to merge two pandas DataFrames in a very specific way.
DF1 =
DF2 =
Name Shape Value
Tom Circle 4
Tom Square 4
Frank Triangle 7
John Square 2
Sarah Circle 1
Desired result = DFM =
Name Color Shape Value
Tom Blue Circle 4
Frank Red Circle 0
John Green Circle 0
Sarah Red Circle 1
Tom Blue Square 4
Frank Red Square 0
John Green Square 2
Sarah Red Square 0
Tom Blue Triangle 0
Frank Red Triangle 7
John Green Triangle 0
Sarah Red Triangle 0
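An outer join gives me most of what I want, but not the zeros. I am looking for some insight into an elegant way to do this. Zeros or plain NaN are both fine, since I can replace NaN with zero.

You can use merge and unstack: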
s = df2.merge(df1, on='Name', how='outer')
s.set_index(['Name', 'Color', 'Shape']).Value.unstack(-1, fill_value=0).stack().reset_index().sort_values(['Shape', 'Name'])
Out[263]:
Name Color Shape 0
0 Frank Red Circle 0
3 John Green Circle 0
6 Sarah Red Circle 1
9 Tom Blue Circle 4
1 Frank Red Square 0
4 John Green Square 2
7 Sarah Red Square 0
10 Tom Blue Square 4
2 Frank Red Triangle 7
5 John Green Triangle 0
8 Sarah Red Triangle 0
11 Tom Blue Triangle 0
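
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same merge-then-unstack idea. The contents of DF1 are not shown in the question, so the `df1` below is an assumption inferred from the Name/Color pairs in the desired result; the variable names `merged`, `wide`, and `result` are illustrative only. It also passes `name="Value"` to `reset_index` so the last column is labeled `Value` rather than `0`.

```python
import pandas as pd

# Assumption: DF1 maps each Name to a Color (inferred from the desired result,
# since the original DF1 is not shown in the question).
df1 = pd.DataFrame({
    "Name": ["Tom", "Frank", "John", "Sarah"],
    "Color": ["Blue", "Red", "Green", "Red"],
})

# DF2 as given in the question.
df2 = pd.DataFrame({
    "Name": ["Tom", "Tom", "Frank", "John", "Sarah"],
    "Shape": ["Circle", "Square", "Triangle", "Square", "Circle"],
    "Value": [4, 4, 7, 2, 1],
})

# Attach the Color to every (Name, Shape, Value) row.
merged = df2.merge(df1, on="Name", how="outer")

# Pivot Shape into columns; combinations that never occurred are filled with 0.
wide = (
    merged.set_index(["Name", "Color", "Shape"])["Value"]
    .unstack("Shape", fill_value=0)
)

# Go back to long form, so every Name now has a row for every Shape.
result = (
    wide.stack()
    .reset_index(name="Value")
    .sort_values(["Shape", "Name"])
    .reset_index(drop=True)
)

print(result)
```

With four names and three shapes, `result` should contain 12 rows. If NaN is preferred over zero, dropping `fill_value=0` leaves the missing combinations as NaN in `wide`, although whether `stack()` then keeps or drops those rows depends on the pandas version.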