Python: pivoting a DataFrame so the resulting DataFrame has the correct column order
I have Excel data in the following format:
Original Data Frame
Package FISCAL_YR SCENARIO PERIOD USD_AMT LY_USD_AMT CY_NetSales LY_NetSales
Canada 2021 Plan Per01 1.00 2.00 3.00 4.00
Africa 2021 Actual Per04 1.00 2.00 3.00 4.00
Africa 2021 Actual Per09 1.00 2.00 3.00 4.00
Brazil 2021 Plan Per11 1.00 2.00 3.00 4.00
Brazil 2021 Actual Per05 1.00 2.00 3.00 4.00
Africa 2021 Actual Per07 1.00 2.00 3.00 4.00
Mexico 2021 Plan Per10 1.00 2.00 3.00 4.00
Canada 2021 Actual Per02 1.00 2.00 3.00 4.00
To simplify my calculations, I am trying to spread the values of the SCENARIO column across the last four columns, like this:
Expected dataframe:
Actual Plan
Package Sum of USD_AMT Sum of LY_USD_AMT Sum of CY_NetSales Sum of LY_NetSales Sum of USD_AMT Sum of LY_USD_AMT Sum of CY_NetSales Sum of LY_NetSales
Africa 3 6 9 12
Brazil 2 4 6 8
Canada 1 2 3 4 1 2 3 4
Mexico 1 2 3 4
I am trying this pivot table option in pandas, but it renders the following output:
Failed solution:
pd_piv=pd.pivot_table(df_dummy,index=['Package', 'FISCAL_YR', 'PERIOD'],
columns=['SCENARIO'],
values=['USD_AMT', 'LY_USD_AMT', 'CY_NetSales', 'LY_NetSales'], aggfunc=np.sum, fill_value=0)
pd_piv.head()
CY_NetSales LY_NetSales LY_USD_AMT USD_AMT
SCENARIO Actual Plan WRKG_FCST Actual Plan WRKG_FCST Actual Plan WRKG_FCST Actual Plan WRKG_FCST
Package_SubCategory FISCAL_YR_NBR FISCAL_PERIOD_NBR
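*Numbers are not shown because the actual data is quite different.
Is there still a way to get the expected DataFrame shown above?
Maybe this is what you are looking for:
1- Make the pivot table: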
import pandas as pd
# Reduced sample: only two of the four value columns, every amount set to 1
data = {"package": ["Canada", "Africa", "Africa", "Brazil", "Brazil", "Africa", "Mexico", "Canada"],
        "scenario": ["Plan", "Actual", "Actual", "Plan", "Actual", "Actual", "Plan", "Actual"],
        "USD_AMT": [1, 1, 1, 1, 1, 1, 1, 1],
        "LY_USD_AMT": [1, 1, 1, 1, 1, 1, 1, 1]}
df = pd.DataFrame(data)
# Index by package only, so every package collapses to a single row
pd_piv = pd.pivot_table(df, index=['package'],
                        columns=['scenario'],
                        values=['USD_AMT', 'LY_USD_AMT'], aggfunc="sum", fill_value=0)
Result:
LY_USD_AMT USD_AMT
scenario Actual Plan Actual Plan
package
Africa 3 0 3 0
Brazil 1 1 1 1
Canada 1 1 1 1
Mexico 0 1 0 1
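A side note on the zeros: the expected table in the question leaves cells empty where a package has no rows for a scenario. A minimal sketch, assuming that is the desired behaviour, is to drop fill_value so those cells stay missing (NaN) instead of 0. The name piv_nan is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "package": ["Canada", "Africa", "Africa", "Brazil",
                "Brazil", "Africa", "Mexico", "Canada"],
    "scenario": ["Plan", "Actual", "Actual", "Plan",
                 "Actual", "Actual", "Plan", "Actual"],
    "USD_AMT": [1] * 8,
})

# Without fill_value, package/scenario pairs that never occur stay NaN
piv_nan = pd.pivot_table(df, index="package", columns="scenario",
                         values="USD_AMT", aggfunc="sum")
print(piv_nan)
```

Because NaN is a float, the column values are shown as floats here; keep fill_value=0 if integer zeros are preferred.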
2- Swap the column index levels and sort them:
pd_piv.columns=pd_piv.columns.swaplevel(0, 1)
pd_piv.sort_index(axis=1, level=0, inplace=True)
Final result:
scenario Actual Plan
LY_USD_AMT USD_AMT LY_USD_AMT USD_AMT
package
Africa 3 3 0 0
Brazil 1 1 1 1
Canada 1 1 1 1
Mexico 0 0 1 1
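Sorting the columns puts the value names in alphabetical order inside each scenario, which is not the order requested in the question (USD_AMT, LY_USD_AMT, CY_NetSales, LY_NetSales). A possible sketch for imposing an explicit order, using the question's column names and sample rows, is to reindex the columns against a hand-built MultiIndex. The names value_cols, scenarios and ordered are assumptions made for this illustration.

```python
import pandas as pd

# Sample rows copied from the table in the question
df = pd.DataFrame({
    "Package": ["Canada", "Africa", "Africa", "Brazil",
                "Brazil", "Africa", "Mexico", "Canada"],
    "SCENARIO": ["Plan", "Actual", "Actual", "Plan",
                 "Actual", "Actual", "Plan", "Actual"],
    "USD_AMT": [1.0] * 8,
    "LY_USD_AMT": [2.0] * 8,
    "CY_NetSales": [3.0] * 8,
    "LY_NetSales": [4.0] * 8,
})

value_cols = ["USD_AMT", "LY_USD_AMT", "CY_NetSales", "LY_NetSales"]
scenarios = ["Actual", "Plan"]

piv = pd.pivot_table(df, index="Package", columns="SCENARIO",
                     values=value_cols, aggfunc="sum", fill_value=0)

# Move SCENARIO to the top column level
piv = piv.swaplevel(0, 1, axis=1)

# Reindex against the exact (scenario, value) order wanted,
# instead of relying on alphabetical sorting
ordered = pd.MultiIndex.from_product([scenarios, value_cols])
piv = piv.reindex(columns=ordered)
print(piv)
```

If the real data contains further scenarios such as WRKG_FCST, they would need to be added to scenarios; any scenario left out of that list is dropped by the reindex.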