Multiple calculations in a pivot table with Python

I want to build a pivot table with multiple calculations. For example, my table initially looks like this:

+---------+----------+----------+---------+------------+-----------+
| Region  | Country  | Vol_Sales| Weight  | Acquisition| Processing|
+---------+----------+----------+---------+------------+-----------+
| Asia    | Japan    |  400     |  6      | Auto       | Manual    |        
+---------+----------+----------+---------+------------+-----------+
| Asia    | Singapore|  700     |  7      | Auto       | Auto      |
+---------+----------+----------+---------+------------+-----------+
| Europe  | UK       |  600     |  8      | Manual     | Auto      |
+---------+----------+----------+---------+------------+-----------+
| .       | .        |          |         |            |           |
+---------+----------+----------+---------+------------+-----------+
| .       | .        |          |         |            |           |
+---------+----------+----------+---------+------------+-----------+
| .       | .        |          |         |            |           |
+---------+----------+----------+---------+------------+-----------+
| Africa  | Egypt    | 700      |  7      | Auto       | Auto      |
+---------+----------+----------+---------+------------+-----------+




I would like the result of the pivot to look like this:




+---------+----------+----------+---------+------------+-----------+
|Region   |Tot_Sales |Acq_Prop  |Proc_Prop|Avg_Weight  |           |
+---------+----------+----------+---------+------------+-----------+
| Asia    | 80,000   | 80.6     | 70.2    | 7.2        |           |
+---------+----------+----------+---------+------------+-----------+

Tot_Sales = Sum of Vol_Sales, grouped by Region
Acq_Prop = Acquisition(Auto) / Total(Auto+Manual) * 100
Proc_Prop = Processing(Auto) / Total(Auto+Manual) * 100
Avg_Weight = Average of Weight, grouped by Region

What I have done so far is:

Sales_Report = Sales_Report.assign(counter = 1)
Report = pd.pivot_table(Sales_Report, index = ['REGION'], columns = ['Acquisition', 'Processing'],  values = ['counter'], aggfunc = 'sum', fill_value = 0, margins = True, margins_name = 'Total')

Then I rearrange the table:

# create a list of the new column names in the right order
new_cols = ['{1} {2}'.format(*tup) for tup in Report.columns]

# assign it to the table
Report.columns = new_cols

# re-sort the columns (sort_index returns a new object, so assign it back)
Report = Report.sort_index(axis='columns')
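To illustrate the flattening step, here is a minimal sketch on made-up data. The `sales` frame and its rows are assumptions for illustration only, and the margins row and column are left out to keep the example short. Slicing each column tuple with `tup[1:]` drops the leading `'counter'` level and leaves names such as `'Auto Manual'`.

```python
import pandas as pd

# Hypothetical data, not the real Sales_Report
sales = pd.DataFrame({
    'Region': ['Asia', 'Asia', 'Asia', 'Europe', 'Europe', 'Africa'],
    'Acquisition': ['Auto', 'Auto', 'Manual', 'Manual', 'Manual', 'Auto'],
    'Processing': ['Manual', 'Auto', 'Auto', 'Auto', 'Manual', 'Auto'],
}).assign(counter=1)

# Count rows per (Acquisition, Processing) combination within each region
report = pd.pivot_table(sales, index='Region',
                        columns=['Acquisition', 'Processing'],
                        values=['counter'], aggfunc='sum', fill_value=0)

# Flatten ('counter', 'Auto', 'Manual') into 'Auto Manual'
report.columns = [' '.join(tup[1:]) for tup in report.columns]

# Row totals, taken before any calculated column is added
total = report.sum(axis=1)
report['Acq_Prop'] = ((report['Auto Auto'] + report['Auto Manual']) / total * 100).round(2)
report['Proc_Prop'] = ((report['Auto Auto'] + report['Manual Auto']) / total * 100).round(2)
print(report)
```

Note that a combination that never occurs in the data produces no column at all, so on real data it may be safer to check which of the four names exist before adding them up.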

Finally, I compute the calculated columns:

Report['Acq_Prop'] = round((Report['Auto Auto'] + Report['Auto Manual']) / (Report['Auto Auto'] + Report['Auto Manual'] + Report['Manual Auto'] + Report['Manual Manual'])* 100, 2)
Report['Proc_Prop'] = round((Report['Auto Auto'] + Report['Manual Auto']) / (Report['Auto Auto'] + Report['Auto Manual'] + Report['Manual Auto'] + Report['Manual Manual'])* 100, 2)

I am struggling to add the total sales and the average weight.

Does this work?

Define an aggregation operation for each column:

volFunc = np.sum
acFunc = lambda x: (x.value_counts()['auto']/x.shape[0])*100
procFunc = lambda x:(x.value_counts()['auto']/x.shape[0])*100
weightFunc = np.mean

Pass the aggregation operations as a dictionary keyed by column:

df.pivot_table(
    index='region',
    aggfunc={'Vol_Sales':volFunc,
             'Weight':weightFunc,
             'Acquisition':acFunc,
             'Processing':procFunc}
)

A toy example I made:

df= pd.DataFrame({'region':['asia','asia','europe','asia','asia','asia'],
                  'Weight':np.random.randint(5,10,6),
                  'Vol_Sales':np.random.randint(100,800,6),
                  'Acquisition':['man','man','auto','auto','auto','auto'],
                  'Processing':['man','man','auto','auto','auto','man']
                 })

Output of the toy example:

    Acquisition Processing  Vol_Sales   Weight
region              
asia    60.0    40.0        2269        6.6
europe  100.0   100.0       268         7.0

Rename and rearrange the resulting columns as needed.
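As a possible variation, the renaming can be folded into the aggregation itself with named aggregation in `groupby(...).agg(...)`. The sketch below uses only the four rows visible in the question's table, so its numbers will not match the 80,000 shown there. The helper name `pct_auto` is made up for illustration. Comparing with `== 'Auto'` and taking the mean also avoids a `KeyError` when a region contains no `'Auto'` rows.

```python
import pandas as pd

# The four rows visible in the question's input table
df = pd.DataFrame({
    'Region': ['Asia', 'Asia', 'Europe', 'Africa'],
    'Vol_Sales': [400, 700, 600, 700],
    'Weight': [6, 7, 8, 7],
    'Acquisition': ['Auto', 'Auto', 'Manual', 'Auto'],
    'Processing': ['Manual', 'Auto', 'Auto', 'Auto'],
})

def pct_auto(s):
    """Percentage of entries equal to 'Auto' (0.0 if there are none)."""
    return (s == 'Auto').mean() * 100

# Output column name = (source column, aggregation)
summary = df.groupby('Region').agg(
    Tot_Sales=('Vol_Sales', 'sum'),
    Acq_Prop=('Acquisition', pct_auto),
    Proc_Prop=('Processing', pct_auto),
    Avg_Weight=('Weight', 'mean'),
).round(2)
print(summary)
```

The keyword order in `agg` sets the column order, so no separate rearranging step is needed.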

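I wonder why, when I pass the aggregation operations, it returns KeyError: 'Auto'. My table data has 'Auto' and 'Manual' as values, and that is what I am passing in. Vol and Weight work fine; only Acq and Proc give me that KeyError: 'Auto'.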

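acFunc = lambda x: (x.value_counts()['auto']/x.shape[0])*100

See how I use 'auto' as the key in this aggregation function. If the value in your data is 'Auto', you need to capitalize the key as well, otherwise the key will not exist.

I did use 'Auto' with a capital letter instead of 'auto', because my input is Auto.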

Total_Func = np.sum
Acq_Func = lambda x: (x.value_counts()['Auto']/x.shape[0])*100
Proc_Func = lambda x: (x.value_counts()['Auto']/x.shape[0])*100
Weight_Func = np.mean

I wrote this, and got a KeyError when I ran:

Table = pd.pivot_table(Filtered_Subtable, index=['REGION', 'SOURCE code'], aggfunc={'RETRIEVAL_METHOD': Acq_Func, 'PROCESS_METHOD': Proc_Func, 'Weights': Weights_Func})

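That is what I added.

It may be that your values are not coming in as clean strings. Take the Acquisition column and run .value_counts() on it, then look at the index you get back. Each index entry should be one unique value that appears in your column, together with the number of times it occurs. The index entry closest to auto is the one you want to use as the key.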
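The check described above can be sketched as follows. The `acq` series is made-up data standing in for the real column, and the stray spaces and mixed casing are assumptions about what might be wrong, not something confirmed by the question. A second possible cause is also shown: indexing `value_counts()` with `['Auto']` raises `KeyError` for any group that has no 'Auto' rows at all, whereas `.get('Auto', 0)` returns 0.

```python
import pandas as pd

# Hypothetical messy values standing in for the real Acquisition column
acq = pd.Series(['Auto', 'auto ', ' Manual', 'AUTO', 'Manual'])

# Inspect the exact labels; repr() makes stray whitespace visible
print([repr(label) for label in acq.value_counts().index])

# Normalize whitespace and case before counting
clean = acq.str.strip().str.lower()
counts = clean.value_counts()

# .get() with a default avoids KeyError when the label is absent
pct_auto = counts.get('auto', 0) / len(clean) * 100
print(pct_auto)

# A group with no 'Auto' rows: ['Auto'] would raise KeyError, .get() returns 0
only_manual = pd.Series(['Manual', 'Manual'])
missing = only_manual.value_counts().get('Auto', 0)
print(missing)
```

If the printed labels already read exactly 'Auto' and 'Manual', the missing-group case is the more likely culprit, since the lambda is applied once per index group.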