Python: How to concatenate with a MultiIndex in Pandas
I have two DataFrames like this:
df1
ID Value1 Amount2
1 100 10
2 400 20
3 300 50
I want to combine these two tables into a single table like this:
Desired Output:
ID    Value        Amount      Difference_Value  Difference_Amount
      df1   df2    df1   df2
1     100   0      10    0     100               10
2     400   200    20    20    200               0
3     300   300    50    30    0                 20
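I am fairly new to multi-level indexing. I know this is possible, but I have not found another question that covers what I need.
I need the Value, Amount, Difference_Value and Difference_Amount
columns to appear as merged cells in Excel, which is why I want this layout.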
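Thanks.
If a MultiIndex on all columns is acceptable:
First convert ID to the index with set_index, compute the differences with sub, and combine the frames with concat. Finally, reorder the column MultiIndex with swaplevel and sort_index.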
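The steps above can be sketched as follows. This is a minimal sketch, not the answerer's original snippet: df1 is copied from the question, while df2 is an assumption reconstructed from the outputs shown in this thread, and the key name 'Difference' is chosen here for illustration.

```python
import pandas as pd

# df1 is taken from the question; df2 is reconstructed from the outputs
# shown in the thread (an assumption, since the original table is missing).
df1 = pd.DataFrame({"ID": [1, 2, 3], "Value1": [100, 400, 300], "Amount2": [10, 20, 50]})
df2 = pd.DataFrame({"ID": [2, 3], "Value1": [200, 300], "Amount2": [20, 30]})

# Use ID as the index so the frames align row by row.
a = df1.set_index("ID")
b = df2.set_index("ID")

# Difference per ID; an ID missing from one frame is treated as 0.
diff = a.sub(b, fill_value=0)

# One key per frame gives a two-level column index: (source, column).
# swaplevel puts the column name on top, sort_index groups the sub-columns.
out = (pd.concat([a, b, diff], axis=1, keys=["df1", "df2", "Difference"])
         .swaplevel(0, 1, axis=1)
         .fillna(0)
         .sort_index(axis=1))
print(out)
```

Because every column here has two levels, writing out with DataFrame.to_excel should produce merged header cells by default (merge_cells=True), assuming an Excel writer engine is installed.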
If you try to join a DataFrame with MultiIndex columns to one without a MultiIndex, you get tuples as column labels instead of a MultiIndex:
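import pandas as pd

# Use ID as the index so both frames align on it
df1 = df1.set_index('ID')
df2 = df2.set_index('ID')
# Difference per ID; IDs missing from one frame count as 0
df3 = df1.sub(df2, fill_value=0)
# Only df1 and df2 are concatenated (one key per frame); df3 is joined afterwards
df = (pd.concat([df1, df2], axis=1, keys=['df1', 'df2'])
        .swaplevel(1, 0, axis=1)
        .fillna(0)
        .sort_index(axis=1)
        .join(df3.add_prefix('Diff_')))
print(df)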
    (Amount2, df1)  (Amount2, df2)  (Value1, df1)  (Value1, df2)  Diff_Value1  \
ID
1               10             0.0            100            0.0        100.0
2               20            20.0            400          200.0        200.0
3               50            30.0            300          300.0          0.0

    Diff_Amount2
ID
1           10.0
2            0.0
3           20.0
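The tuple labels above come from mixing two-level and one-level columns, and newer pandas releases may refuse such a join outright. One possible way around this, sketched here under the assumption that an empty second level is acceptable for the difference columns, is to give those columns a MultiIndex as well before combining. As before, df2 is reconstructed from the outputs shown in this thread rather than taken from the question.

```python
import pandas as pd

# df1 is from the question; df2 is reconstructed from the thread's outputs (assumption).
df1 = pd.DataFrame({"ID": [1, 2, 3], "Value1": [100, 400, 300], "Amount2": [10, 20, 50]}).set_index("ID")
df2 = pd.DataFrame({"ID": [2, 3], "Value1": [200, 300], "Amount2": [20, 30]}).set_index("ID")

# Differences with a Diff_ prefix, as in the answer above.
diff = df1.sub(df2, fill_value=0).add_prefix("Diff_")
# Pad the labels with an empty second level so both frames have two column levels.
diff.columns = pd.MultiIndex.from_product([diff.columns, [""]])

# Two-level frame holding the df1/df2 values under each column name.
wide = (pd.concat([df1, df2], axis=1, keys=["df1", "df2"])
          .swaplevel(0, 1, axis=1)
          .fillna(0)
          .sort_index(axis=1))

# Both inputs now have two column levels, so the result keeps a MultiIndex.
result = pd.concat([wide, diff], axis=1)
print(result)
```

This keeps the difference columns at the top level without a df1/df2 sub-header, which is close to the layout in the desired output.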
You can also try merge, then use assign to compute the difference columns:
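# df1 and df2 are the original frames, with ID still a regular column.
# Outer merge on ID; overlapping column names get the -df1 / -df2 suffixes.
d = df1.merge(df2, how='outer', on='ID', suffixes=('-df1', '-df2')).fillna(0)
d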
   ID  Value1-df1  Amount2-df1  Value1-df2  Amount2-df2
0   1         100           10         0.0          0.0
1   2         400           20       200.0         20.0
2   3         300           50       300.0         30.0
# Add the two difference columns, then move ID into the index
d = d.assign(diff_value=d['Value1-df1'].sub(d['Value1-df2']),
             diff_amount=d['Amount2-df1'].sub(d['Amount2-df2'])).set_index('ID')
d
    Value1-df1  Amount2-df1  Value1-df2  Amount2-df2  diff_value  diff_amount
ID
1          100           10         0.0          0.0       100.0         10.0
2          400           20       200.0         20.0       200.0          0.0
3          300           50       300.0         30.0         0.0         20.0
Now split the column labels at '-' with expand=True to get a MultiIndex, then use sort_index:
d.columns = d.columns.str.split('-',expand=True) #expand= True makes it MultiIndex
d.sort_index(axis=1)
   Amount2       Value1        diff_amount diff_value
       df1   df2    df1    df2         NaN        NaN
ID
1       10   0.0    100    0.0        10.0      100.0
2       20  20.0    400  200.0         0.0      200.0
3       50  30.0    300  300.0        20.0        0.0
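In the output above, the difference columns end up with NaN as their second-level label because their names contain no '-'. If that is unwanted, one option is to build the MultiIndex by hand and pad with an empty string instead. The following is a self-contained sketch of the whole merge approach with that change; df2 is reconstructed from the outputs shown in this thread (an assumption), and the tuples helper list is introduced here for illustration.

```python
import pandas as pd

# df1 is from the question; df2 is reconstructed from the thread's outputs (assumption).
df1 = pd.DataFrame({"ID": [1, 2, 3], "Value1": [100, 400, 300], "Amount2": [10, 20, 50]})
df2 = pd.DataFrame({"ID": [2, 3], "Value1": [200, 300], "Amount2": [20, 30]})

# Outer merge keeps every ID; missing values from either side become 0.
d = df1.merge(df2, how="outer", on="ID", suffixes=("-df1", "-df2")).fillna(0)

# Difference columns, then ID as the index.
d = d.assign(
    diff_value=d["Value1-df1"] - d["Value1-df2"],
    diff_amount=d["Amount2-df1"] - d["Amount2-df2"],
).set_index("ID")

# Split each label once at '-'; labels without a suffix get '' as second level.
tuples = [tuple(c.split("-", 1)) if "-" in c else (c, "") for c in d.columns]
d.columns = pd.MultiIndex.from_tuples(tuples)

# Group the df1/df2 sub-columns under each top-level name.
d = d.sort_index(axis=1)
print(d)
```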