Summing column values with a multi-index pivot table in Python
I have data like this:
Employed Coverage Education Amount
No Basic Bachelor 541.8029122
No Extended Bachelor 312.6400955
No Premium Bachelor 427.9560121
No Basic Bachelor 91.17931022
No Basic Bachelor 533.6890081
Yes Basic Bachelor 683.484326
Yes Basic College 586.2670885
No Premium Master 725.0412884
Yes Basic Bachelor 948.3628611
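I want to sum Amount using a multi-index pivot table, as shown below. This is what I have been following, but I cannot get the correct result.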
I need your help.

Here is one way to do it:
import io
import pandas as pd
s = '''\
Employed Coverage Education Amount
No Basic Bachelor 541.8029122
No Extended Bachelor 312.6400955
No Premium Bachelor 427.9560121
No Basic Bachelor 91.17931022
No Basic Bachelor 533.6890081
Yes Basic Bachelor 683.484326
Yes Basic College 586.2670885
No Premium Master 725.0412884
Yes Basic Bachelor 948.3628611'''
# Recreate the dataframe
df = pd.read_csv(io.StringIO(s), sep=r'\s+')
The actual code:
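# Make Coverage categorical so every category is listed under each Employed value
df['Coverage'] = df['Coverage'].astype('category')
# observed=False keeps unobserved categories; passed explicitly rather than relying on the default
pd.pivot_table(df, index='Education', columns=['Employed', 'Coverage'],
               values='Amount', aggfunc='sum', fill_value=0, observed=False)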
# Employed            No                                  Yes
# Coverage         Basic    Extended     Premium        Basic Extended Premium
# Education
# Bachelor   1166.671231  312.640096  427.956012  1631.847187      0.0     0.0
# College       0.000000    0.000000    0.000000   586.267088      0.0     0.0
# Master        0.000000    0.000000  725.041288     0.000000      0.0     0.0
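As a cross-check on the totals, the same sums can be sketched with a plain groupby followed by unstack. This is only an illustrative alternative, not part of the approach above; the names `data`, `totals`, and `table` are made up for the example, and because Coverage is left as a plain string column here, only the combinations that occur in the data become columns.

```python
import io
import pandas as pd

data = '''\
Employed Coverage Education Amount
No Basic Bachelor 541.8029122
No Extended Bachelor 312.6400955
No Premium Bachelor 427.9560121
No Basic Bachelor 91.17931022
No Basic Bachelor 533.6890081
Yes Basic Bachelor 683.484326
Yes Basic College 586.2670885
No Premium Master 725.0412884
Yes Basic Bachelor 948.3628611'''

df = pd.read_csv(io.StringIO(data), sep=r'\s+')

# Sum Amount for each (Education, Employed, Coverage) combination
totals = df.groupby(['Education', 'Employed', 'Coverage'])['Amount'].sum()

# Move Employed and Coverage into the columns; missing combinations become 0
table = totals.unstack(['Employed', 'Coverage'], fill_value=0)
print(table)
```

The grand total of the table should match the sum of the Amount column, which is a quick sanity check that no rows were lost.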
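Notes:
- Converting Coverage to a categorical ensures that every category is reported for that series, even combinations that never occur in the data.
- The default aggregation of a pivot table is mean, so sum must be specified explicitly.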
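The two notes can be illustrated side by side. The following is a minimal sketch on the same sample data; the variable names (`keys`, `mean_table`, `sum_plain`, `sum_cat`) are invented for the example, and `observed=False` is passed explicitly on the assumption that the installed pandas version accepts that parameter in `pivot_table`.

```python
import io
import pandas as pd

data = '''\
Employed Coverage Education Amount
No Basic Bachelor 541.8029122
No Extended Bachelor 312.6400955
No Premium Bachelor 427.9560121
No Basic Bachelor 91.17931022
No Basic Bachelor 533.6890081
Yes Basic Bachelor 683.484326
Yes Basic College 586.2670885
No Premium Master 725.0412884
Yes Basic Bachelor 948.3628611'''

df = pd.read_csv(io.StringIO(data), sep=r'\s+')

# Shared layout arguments for all three tables
keys = dict(index='Education', columns=['Employed', 'Coverage'], values='Amount')

# No aggfunc given: each cell holds the mean of the matching rows
mean_table = pd.pivot_table(df, **keys)

# Explicit sum, Coverage still a plain string column:
# only the (Employed, Coverage) pairs present in the data become columns
sum_plain = pd.pivot_table(df, aggfunc='sum', fill_value=0, **keys)

# Explicit sum with Coverage as a categorical:
# every Coverage category is listed under both Employed values
df_cat = df.assign(Coverage=df['Coverage'].astype('category'))
sum_cat = pd.pivot_table(df_cat, aggfunc='sum', fill_value=0,
                         observed=False, **keys)

print(mean_table.shape, sum_plain.shape, sum_cat.shape)
```

Comparing the shapes and a few cells of the three tables shows how the aggregation function and the categorical conversion each change the result.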