Python: multiple calculations in a pivot table
I want to build a pivot table with multiple calculations. For example, my table initially looks like this:
+---------+----------+----------+---------+------------+-----------+
| Region | Country | Vol_Sales| Weight | Acquisition| Processing|
+---------+----------+----------+---------+------------+-----------+
| Asia | Japan | 400 | 6 | Auto | Manual |
+---------+----------+----------+---------+------------+-----------+
| Asia | Singapore| 700 | 7 | Auto | Auto |
+---------+----------+----------+---------+------------+-----------+
| Europe | UK | 600 | 8 | Manual | Auto |
+---------+----------+----------+---------+------------+-----------+
| . | . | | | | |
+---------+----------+----------+---------+------------+-----------+
| . | . | | | | |
+---------+----------+----------+---------+------------+-----------+
| . | . | | | | |
+---------+----------+----------+---------+------------+-----------+
| Africa | Egypt | 700 | 7 | Auto | Auto |
+---------+----------+----------+---------+------------+-----------+
I would like the pivot result to look like this:
+---------+----------+----------+---------+------------+-----------+
|Region |Tot_Sales |Acq_Prop |Proc_Prop|Avg_Weight | |
+---------+----------+----------+---------+------------+-----------+
| Asia | 80,000 | 80.6 | 70.2 | 7.2 | |
+---------+----------+----------+---------+------------+-----------+
where
Tot_Sales = Sum of Vol_Sales
Acq_Prop = Acquisition(Auto) / Total(Auto+Manual) * 100
Proc_Prop = Processing(Auto) / Total(Auto+Manual) * 100
Avg_Weight = Average of Weight, grouped by Region
Here is what I have done so far:
import pandas as pd
# add a helper column of ones so the pivot can count rows
Sales_Report = Sales_Report.assign(counter=1)
Report = pd.pivot_table(Sales_Report, index=['REGION'], columns=['Acquisition', 'Processing'],
                        values=['counter'], aggfunc='sum', fill_value=0,
                        margins=True, margins_name='Total')
Then I rearrange the table:
# create a list of the new column names in the right order
new_cols = ['{1} {2}'.format(*tup) for tup in Report.columns]
# assign it to the table
Report.columns = new_cols
# re-sort the index, so you get the columns in the order you specified
Report = Report.sort_index(axis='columns')
Finally I compute the derived columns:
Report['Acq_Prop'] = round((Report['Auto Auto'] + Report['Auto Manual']) / (Report['Auto Auto'] + Report['Auto Manual'] + Report['Manual Auto'] + Report['Manual Manual'])* 100, 2)
Report['Proc_Prop'] = round((Report['Auto Auto'] + Report['Manual Auto']) / (Report['Auto Auto'] + Report['Auto Manual'] + Report['Manual Auto'] + Report['Manual Manual'])* 100, 2)
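I am struggling to add the total sales and the average weight.
Does this work?
Define an aggregation operation for each column:
import numpy as np
import pandas as pd
volFunc = np.sum
# percentage of rows in the group whose value is 'auto'
acFunc = lambda x: (x.value_counts()['auto'] / x.shape[0]) * 100
procFunc = lambda x: (x.value_counts()['auto'] / x.shape[0]) * 100
weightFunc = np.mean
Pass the aggregation operations to the respective columns as a dict: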
df.pivot_table(
    index='region',
    aggfunc={'Vol_Sales': volFunc,
             'Weight': weightFunc,
             'Acquisition': acFunc,
             'Processing': procFunc}
)
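One caveat with the lambdas above: x.value_counts()['auto'] raises a KeyError for any group that contains no 'auto' rows at all. A minimal sketch of a variant that avoids the key lookup by averaging a boolean mask instead. The helper name pct_equal and the sample data are hypothetical and not part of the original answer:

```python
import pandas as pd

def pct_equal(label):
    """Build an aggregator returning the percentage of values equal to `label`."""
    def agg(x):
        # The mean of a boolean mask is the share of True values, so a
        # group without any match yields 0.0 instead of raising KeyError.
        return (x == label).mean() * 100
    return agg

# Hypothetical data: 'europe' has no 'auto' rows in Acquisition
df = pd.DataFrame({
    'region': ['asia', 'asia', 'europe', 'europe'],
    'Vol_Sales': [400, 700, 600, 300],
    'Weight': [6, 7, 8, 9],
    'Acquisition': ['auto', 'man', 'man', 'man'],
    'Processing': ['auto', 'auto', 'man', 'auto'],
})

result = df.pivot_table(
    index='region',
    aggfunc={
        'Vol_Sales': 'sum',
        'Weight': 'mean',
        'Acquisition': pct_equal('auto'),
        'Processing': pct_equal('auto'),
    },
)
print(result)
```

With the value_counts() version, the 'europe' group in this data would fail on Acquisition, because 'auto' never appears there.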
The toy example I made:
df = pd.DataFrame({'region': ['asia', 'asia', 'europe', 'asia', 'asia', 'asia'],
                   'Weight': np.random.randint(5, 10, 6),
                   'Vol_Sales': np.random.randint(100, 800, 6),
                   'Acquisition': ['man', 'man', 'auto', 'auto', 'auto', 'auto'],
                   'Processing': ['man', 'man', 'auto', 'auto', 'auto', 'man']
                   })
Output for the toy example:
        Acquisition  Processing  Vol_Sales  Weight
region
asia           60.0        40.0       2269     6.6
europe        100.0       100.0        268     7.0
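The output above keeps the source column names in alphabetical order. A minimal sketch of mapping them to the names and order used in the question (Tot_Sales, Acq_Prop, Proc_Prop, Avg_Weight). The pivot frame below is a stand-in built from the toy output's values, and the names new_names and report are illustrative assumptions:

```python
import pandas as pd

# Stand-in for the pivot_table result, using the values from the toy output
pivot = pd.DataFrame(
    {'Acquisition': [60.0, 100.0],
     'Processing': [40.0, 100.0],
     'Vol_Sales': [2269, 268],
     'Weight': [6.6, 7.0]},
    index=pd.Index(['asia', 'europe'], name='region'),
)

# Map each source column to the report name used in the question
new_names = {
    'Vol_Sales': 'Tot_Sales',
    'Acquisition': 'Acq_Prop',
    'Processing': 'Proc_Prop',
    'Weight': 'Avg_Weight',
}

# Rename, then select the columns in the desired order
report = pivot.rename(columns=new_names)[
    ['Tot_Sales', 'Acq_Prop', 'Proc_Prop', 'Avg_Weight']
].round(2)
print(report)
```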
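Rename and rearrange the result columns as needed.
I would like to know why passing the aggregation operations returns KeyError: 'Auto'. My table data has 'Auto' and 'Manual' as values. Vol and Weight work fine; only Acq and Proc give me that KeyError: 'Auto'.
Look at how I refer to 'auto' in this aggregation function:
acFunc = lambda x: (x.value_counts()['auto'] / x.shape[0]) * 100
If your values are spelled 'Auto', you need to capitalize the key, otherwise the key will not exist.
I did use 'Auto', capitalized, rather than 'auto', because my values are Auto:
Total_Func = np.sum
Acq_Func = lambda x: (x.value_counts()['Auto'] / x.shape[0]) * 100
Proc_Func = lambda x: (x.value_counts()['Auto'] / x.shape[0]) * 100
Weight_Func = np.mean
I wrote this and got a KeyError when I ran pd.pivot_table on the filtered sub-table with index=['REGION', 'SOURCE code'] and aggfunc={'RETRIEVAL_METHOD': Acq_Func, 'PROCESS_METHOD': Proc_Func, 'Weights': Weights_Func}. That is everything I added.
It may be that your values are not coming through as clean strings. Take the Acquisition column, run .value_counts() on it, and look at the index you get. Each index entry should be a unique value that appears in your column, along with the number of times it occurs. The index entry closest to auto should be the one to use as the key.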
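The check described in the last comment can be sketched as follows. The sample values (a trailing space, a leading space, mixed case) are hypothetical, chosen only to show how a label that looks like 'Auto' on screen can still miss the key. Printing each key with repr makes stray whitespace visible:

```python
import pandas as pd

# Hypothetical column whose labels are not clean strings
acq = pd.Series(['Auto ', 'Manual', 'auto', 'Auto', ' Manual'],
                name='Acquisition')

# Inspect the exact keys that value_counts() produces
counts = acq.value_counts()
print([repr(key) for key in counts.index])

# Normalize whitespace and case before comparing, so that every
# spelling of the label is counted as the same value
clean = acq.str.strip().str.lower()
acq_prop = (clean == 'auto').mean() * 100
print(acq_prop)
```

If the printed keys differ from the literal used in the lambda, either clean the column first as above or use the exact key that value_counts() reports.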