Python: pivot table on an already-pivoted DataFrame
Tags: python, pandas, pivot, pivot-table
I have a DataFrame that has already been pivoted and looks like this:
              Cost       Transport  Currency
Manufacturer  ABC  XYZ   ABC  XYZ   ABC      XYZ
Date
2017-07-01    312  323   31   41    Pounds   Pounds
2017-07-02    423  335   21   32    Dollars  Pounds
2017-07-03    421  304   21   21    Dollars  Pounds
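The table above shows the cost and transport charges for items bought from each manufacturer, together with the currency in which those amounts are denominated.
What I want to do is add these numbers up and place the totals under the matching currency. The desired output is (I have left the additions unevaluated so it is clear where each number comes from)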
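For reference, the intended totals can be worked out by hand from the table above. The snippet below is only an illustrative check in plain Python; the `rows` list and the `totals` dictionary are names introduced here, not part of the original question.

```python
# Rows of the example table: (date, manufacturer, cost, transport, currency)
rows = [
    ("2017-07-01", "ABC", 312, 31, "Pounds"),
    ("2017-07-01", "XYZ", 323, 41, "Pounds"),
    ("2017-07-02", "ABC", 423, 21, "Dollars"),
    ("2017-07-02", "XYZ", 335, 32, "Pounds"),
    ("2017-07-03", "ABC", 421, 21, "Dollars"),
    ("2017-07-03", "XYZ", 304, 21, "Pounds"),
]

# Add cost + transport for every (date, currency) pair
totals = {}
for date, _manufacturer, cost, transport, currency in rows:
    key = (date, currency)
    totals[key] = totals.get(key, 0) + cost + transport

for key in sorted(totals):
    print(key, totals[key])
```

A (date, currency) pair that never occurs, such as Dollars on 2017-07-01, is simply absent here; in the pivoted result it should show up as 0.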
I tried
df.pivot_table(index='Date', columns='Currency', aggfunc=np.sum)
pandas does not like this at all and raises a KeyError.
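A likely reason the call fails, sketched below under the assumption that `df` is built exactly as in the question: after `pivot` the columns form a two-level MultiIndex, so `'Currency'` is no longer a single column but a group of columns, one per manufacturer. The snippet only inspects the structure and does not reproduce the failing call.

```python
import numpy as np
import pandas as pd

my_list = ["2017-07-01", "ABC", 312, 31, "Pounds", "2017-07-01", "XYZ", 323, 41, "Pounds",
           "2017-07-02", "ABC", 423, 21, "Dollars", "2017-07-02", "XYZ", 335, 32, "Pounds",
           "2017-07-03", "ABC", 421, 21, "Dollars", "2017-07-03", "XYZ", 304, 21, "Pounds"]
df_raw = pd.DataFrame(np.array(my_list).reshape(6, 5),
                      columns=["Date", "Manufacturer", "Cost", "Transport", "Currency"])
df = df_raw.pivot(index="Date", columns="Manufacturer")

# The column labels are tuples such as ('Currency', 'ABC'), not plain strings
print(df.columns.tolist())

# Selecting 'Currency' gives a sub-frame with one column per manufacturer,
# so there is no single 1-D 'Currency' column to pivot on
currency_block = df["Currency"]
print(type(currency_block).__name__, currency_block.shape)
```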
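Below is the code that produces the starting DataFrame df. In the real use case the data absolutely has to be pivoted first for analysis and aggregation, so please do not suggest applying pivot_table to my_list or df_raw.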
import numpy as np
import pandas as pd

my_list = ["2017-07-01", "ABC", 312, 31, "Pounds", "2017-07-01", "XYZ", 323, 41, "Pounds",
           "2017-07-02", "ABC", 423, 21, "Dollars", "2017-07-02", "XYZ", 335, 32, "Pounds",
           "2017-07-03", "ABC", 421, 21, "Dollars", "2017-07-03", "XYZ", 304, 21, "Pounds"]
# Reshape the flat list into 6 rows of 5 fields each
df_raw = pd.DataFrame(np.array(my_list).reshape(6, 5),
                      columns=["Date", "Manufacturer", "Cost", "Transport", "Currency"])
df = df_raw.pivot(index='Date', columns='Manufacturer')
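One detail of this setup is worth checking before adding anything up: `np.array` turns a list that mixes strings and integers into an array of strings, so Cost and Transport arrive in the DataFrame as text. The short sketch below uses a single row of the sample data; the names `arr`, `row`, `joined` and `converted` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# One row of the sample data, mixing strings and integers
arr = np.array(["2017-07-01", "ABC", 312, 31, "Pounds"])
print(arr.dtype.kind)  # 'U': every element has become a Unicode string

row = pd.DataFrame(arr.reshape(1, 5),
                   columns=["Date", "Manufacturer", "Cost", "Transport", "Currency"])

# Adding the raw columns concatenates the strings instead of summing numbers
joined = row["Cost"] + row["Transport"]
print(joined.iloc[0])  # '31231'

# Converting first gives the arithmetic sum
converted = pd.to_numeric(row["Cost"]) + pd.to_numeric(row["Transport"])
print(converted.iloc[0])  # 343
```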
Edit 2: revised
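# Move Manufacturer from the columns back into the row index
df2 = df.stack()
# Cost and Transport are strings in this setup, so convert them before adding
df2['total'] = pd.to_numeric(df2['Cost']) + pd.to_numeric(df2['Transport'])
df2.reset_index(inplace=True)
df2.pivot_table(index='Date', columns='Currency', values='total', aggfunc='sum', fill_value=0)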
Edit: the answer below is in fact not acceptable given the constraints you stated. I will try to revise it.
One way:
# Cost and Transport are strings in df_raw, so convert them before adding
df_raw['total_cost'] = pd.to_numeric(df_raw['Cost']) + pd.to_numeric(df_raw['Transport'])
df_raw.pivot_table(index='Date', columns='Currency', values=['total_cost'], aggfunc='sum', fill_value=0)
Use stack, groupby, sum, and unstack.
Using your setup and input DataFrame:
my_list = ["2017-07-01", "ABC",312, 31, "Pounds", "2017-07-01", "XYZ" ,323, 41, "Pounds",
"2017-07-02", "ABC", 423, 21, "Dollars", "2017-07-02", "XYZ" ,335, 32, "Pounds",
"2017-07-03", "ABC", 421, 21, "Dollars", "2017-07-03", "XYZ", 304, 21, "Pounds" ]
df_raw = pd.DataFrame(np.array(my_list).reshape(6,5),
columns = ["Date", "Manufacturer", "Cost", "Transport", "Currency"])
df = df_raw.pivot(index='Date', columns='Manufacturer')
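# Convert Cost and Transport to numbers and leave the Currency text untouched
# (errors='ignore' is deprecated in recent pandas versions)
df = df.apply(lambda col: col if col.name[0] == 'Currency' else pd.to_numeric(col))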
Reshape the DataFrame and compute:
df.stack().groupby(['Date','Currency']).sum().sum(1).unstack(fill_value=0)
Output:
Currency Dollars Pounds
Date
2017-07-01 0 707
2017-07-02 444 367
2017-07-03 442 325
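The one-line chain can also be unpacked step by step. The following is a sketch rather than the answerer's exact code: the intermediate names (`long_df`, `per_group`, `totals`, `result`) are made up for illustration, and the numeric conversion is done after stacking instead of with `df.apply`.

```python
import numpy as np
import pandas as pd

my_list = ["2017-07-01", "ABC", 312, 31, "Pounds", "2017-07-01", "XYZ", 323, 41, "Pounds",
           "2017-07-02", "ABC", 423, 21, "Dollars", "2017-07-02", "XYZ", 335, 32, "Pounds",
           "2017-07-03", "ABC", 421, 21, "Dollars", "2017-07-03", "XYZ", 304, 21, "Pounds"]
df_raw = pd.DataFrame(np.array(my_list).reshape(6, 5),
                      columns=["Date", "Manufacturer", "Cost", "Transport", "Currency"])
df = df_raw.pivot(index="Date", columns="Manufacturer")

# Step 1: stack moves the Manufacturer level from the columns into the row index,
# leaving one row per (Date, Manufacturer) with Cost, Transport and Currency columns
long_df = df.stack()

# The values are still strings at this point, so convert the two numeric columns
long_df["Cost"] = pd.to_numeric(long_df["Cost"])
long_df["Transport"] = pd.to_numeric(long_df["Transport"])

# Step 2: group by Date (an index level) and Currency (a column), summing each column
per_group = long_df.groupby(["Date", "Currency"])[["Cost", "Transport"]].sum()

# Step 3: add Cost and Transport together row-wise
totals = per_group.sum(axis=1)

# Step 4: unstack moves Currency into the columns; missing combinations become 0
result = totals.unstack(fill_value=0)
print(result)
```

Keeping the intermediate frames around makes it easier to inspect where a wrong total comes from, at the cost of a few extra lines.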
Thanks, the use of stack, groupby, sum, and unstack is impressive. :-)