Summing column values with a multi-index pivot table in Python

I have data like this:

Employed    Coverage    Education   Amount
No          Basic       Bachelor    541.8029122
No          Extended    Bachelor    312.6400955
No          Premium     Bachelor    427.9560121
No          Basic       Bachelor    91.17931022
No          Basic       Bachelor    533.6890081
Yes         Basic       Bachelor    683.484326
Yes         Basic       College     586.2670885
No          Premium     Master      725.0412884
Yes         Basic       Bachelor    948.3628611

I want to sum Amount using a multi-index pivot table. What I have been following so far does not give the correct result.

I need your help.

Here is one approach.

import io

import pandas as pd

s = '''\
Employed    Coverage    Education   Amount
No          Basic       Bachelor    541.8029122
No          Extended    Bachelor    312.6400955
No          Premium     Bachelor    427.9560121
No          Basic       Bachelor    91.17931022
No          Basic       Bachelor    533.6890081
Yes         Basic       Bachelor    683.484326
Yes         Basic       College     586.2670885
No          Premium     Master      725.0412884
Yes         Basic       Bachelor    948.3628611'''

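# Recreate the DataFrame from the whitespace-separated text
# (raw string so that \s is not treated as a string escape)
df = pd.read_csv(io.StringIO(s), sep=r'\s+')

The actual code: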

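# Make Coverage categorical so that every category is reported
df['Coverage'] = df['Coverage'].astype('category')

# observed=False keeps unobserved categories; it is passed explicitly so the
# result does not depend on the default of the installed pandas version
pd.pivot_table(df, index='Education', columns=['Employed', 'Coverage'],
               values='Amount', aggfunc='sum', fill_value=0, observed=False)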

# Employed            No                                  Yes                 
# Coverage         Basic    Extended     Premium        Basic Extended Premium
# Education                                                                   
# Bachelor   1166.671231  312.640096  427.956012  1631.847187      0.0     0.0
# College       0.000000    0.000000    0.000000   586.267088      0.0     0.0
# Master        0.000000    0.000000  725.041288     0.000000      0.0     0.0
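
Notes:

  • Converting Coverage to a categorical ensures that every category of that series is reported, including combinations that never occur in the data (such as Yes / Extended).
  • The default aggregation of pivot_table is mean, so sum must be specified explicitly.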
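
To make the two notes concrete, here is a minimal sketch that builds the same table three ways: without the categorical conversion, with the default aggregation, and with both adjustments. The names keys, plain, mean_tbl and full are only illustrative and are not part of the original answer; observed=False is passed explicitly on the assumption that the installed pandas version accepts that parameter.

```python
import io

import pandas as pd

s = '''\
Employed    Coverage    Education   Amount
No          Basic       Bachelor    541.8029122
No          Extended    Bachelor    312.6400955
No          Premium     Bachelor    427.9560121
No          Basic       Bachelor    91.17931022
No          Basic       Bachelor    533.6890081
Yes         Basic       Bachelor    683.484326
Yes         Basic       College     586.2670885
No          Premium     Master      725.0412884
Yes         Basic       Bachelor    948.3628611'''

df = pd.read_csv(io.StringIO(s), sep=r'\s+')

# Shared pivot layout (illustrative helper, not from the original answer)
keys = dict(index='Education', columns=['Employed', 'Coverage'], values='Amount')

# 1) Plain string column: only combinations present in the data become columns
plain = pd.pivot_table(df, aggfunc='sum', fill_value=0, **keys)

# 2) No aggfunc given: pivot_table falls back to its default, the mean
mean_tbl = pd.pivot_table(df, fill_value=0, **keys)

# 3) Categorical Coverage plus an explicit sum: every category is reported
df_cat = df.assign(Coverage=df['Coverage'].astype('category'))
full = pd.pivot_table(df_cat, aggfunc='sum', fill_value=0,
                      observed=False, **keys)

print(plain.columns.tolist())
print(full.columns.tolist())
print(plain.loc['Bachelor', ('No', 'Basic')],
      mean_tbl.loc['Bachelor', ('No', 'Basic')])
```

Comparing the two column lists shows which Employed / Coverage pairs are added by the categorical conversion, and the last line contrasts the sum with the mean for the same cell.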