Python multi-index pivot table subtotals

python pandas


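I'm trying to create a simple pivot table with Excel-style subtotals, but I can't find a way to do it with Pandas. I tried the solution Wes suggested in another subtotal-related question, but it didn't give the expected result. Here are the steps to reproduce it:

Create the sample data: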

import numpy as np
import pandas as pd

sample_data = {'customer': ['A', 'A', 'A', 'B', 'B', 'B', 'A', 'A', 'A', 'B', 'B', 'B'],
               'product': ['astro', 'ball', 'car', 'astro', 'ball', 'car', 'astro', 'ball', 'car', 'astro', 'ball', 'car'],
               'week': [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
               'qty': [10, 15, 20, 40, 20, 34, 300, 20, 304, 23, 45, 23]}

df = pd.DataFrame(sample_data)

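Create the pivot table with margins (it has only the grand total, not the subtotals per customer A and B).

Then I tried the approach Wes McKinney mentioned in another thread, using the stack function: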
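The call for this step did not survive in the text. A minimal sketch, assuming it matches the `pivot_table` call that appears later in the thread (same index, columns and values), would be:

```python
import pandas as pd

sample_data = {'customer': ['A', 'A', 'A', 'B', 'B', 'B', 'A', 'A', 'A', 'B', 'B', 'B'],
               'product': ['astro', 'ball', 'car', 'astro', 'ball', 'car',
                           'astro', 'ball', 'car', 'astro', 'ball', 'car'],
               'week': [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
               'qty': [10, 15, 20, 40, 20, 34, 300, 20, 304, 23, 45, 23]}
df = pd.DataFrame(sample_data)

# Rows: customer and product. Columns: week.
# margins=True appends a single "All" row and a single "All" column,
# i.e. grand totals only, with no subtotal row per customer.
piv = df.pivot_table(index=['customer', 'product'],
                     columns='week',
                     values='qty',
                     margins=True,
                     aggfunc='sum')
print(piv)
```

The result has six detail rows plus one "All" row, which is why the per-customer subtotals have to be built separately.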

piv2 = df.pivot_table(index='customer',columns=['week','product'],values='qty',margins=True,aggfunc=np.sum)

piv2.stack('product')

The result has the layout I want, but the rows labeled "All" do not contain the sums:

week                  1      2    All
customer product
A                   NaN    NaN  669.0
         astro     10.0  300.0    NaN
         ball      15.0   20.0    NaN
         car       20.0  304.0    NaN
B                   NaN    NaN  185.0
         astro     40.0   23.0    NaN
         ball      20.0   45.0    NaN
         car       34.0   23.0    NaN
All                 NaN    NaN  854.0
         astro     50.0  323.0    NaN
         ball      35.0   65.0    NaN
         car       54.0  327.0    NaN

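How can I make it work the way it does in Excel, with all subtotals and totals filled in? What am I missing?

That said, I could filter inside a for loop and concat the pieces afterwards, but I hope there is a more direct solution. Thank you.

You can do it in one step, but because the index is sorted alphabetically, you have to be strategic about the index names. Uppercase letters sort before lowercase ones, so a lowercase 'total' lands after the product names and an uppercase 'Total' lands after customers A and B: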

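piv = df.pivot_table(index=['customer', 'product'],
                     columns='week',
                     values='qty',
                     margins=True,
                     margins_name='Total',
                     aggfunc='sum')

# Subtotal per customer: drop the grand-total row, sum within each customer,
# label those rows 'total' on the product level, then append and sort.
# groupby(level=0).sum() replaces .sum(level=0), which was removed in pandas 2.0.
(pd.concat([piv,
            piv.query('customer != "Total"')
               .groupby(level=0).sum()
               .assign(product='total')
               .set_index('product', append=True)])
   .sort_index())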

Output:

week                1    2  Total
customer product                 
A        astro     10  300    310
         ball      15   20     35
         car       20  304    324
         total     45  624    669
B        astro     40   23     63
         ball      20   45     65
         car       34   23     57
         total     94   91    185
Total             139  715    854
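The steps in this answer can be wrapped in a small function. The sketch below is only an illustration: the helper name `add_subtotals` and its parameters are hypothetical and not part of the answer, and it assumes the grand-total label sorts after every outer label and the subtotal label sorts after every inner label.

```python
import pandas as pd

def add_subtotals(df, outer, inner, columns, values,
                  subtotal_name='total', grand_name='Total'):
    """Pivot df and append one subtotal row per outer group (hypothetical helper)."""
    piv = df.pivot_table(index=[outer, inner], columns=columns, values=values,
                         margins=True, margins_name=grand_name, aggfunc='sum')
    # Keep only the detail rows, i.e. everything except the grand-total row.
    detail = piv[piv.index.get_level_values(0) != grand_name]
    # Sum the detail rows within each outer group.
    sub = detail.groupby(level=0).sum()
    # Give the subtotal rows a second index level so they line up with piv.
    sub.index = pd.MultiIndex.from_arrays(
        [sub.index, [subtotal_name] * len(sub)], names=piv.index.names)
    # sort_index places each subtotal after its group because of the label names.
    return pd.concat([piv, sub]).sort_index()

sample_data = {'customer': ['A', 'A', 'A', 'B', 'B', 'B', 'A', 'A', 'A', 'B', 'B', 'B'],
               'product': ['astro', 'ball', 'car', 'astro', 'ball', 'car',
                           'astro', 'ball', 'car', 'astro', 'ball', 'car'],
               'week': [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
               'qty': [10, 15, 20, 40, 20, 34, 300, 20, 304, 23, 45, 23]}
df = pd.DataFrame(sample_data)

result = add_subtotals(df, 'customer', 'product', 'week', 'qty')
print(result)

# The ordering relies on plain string comparison: uppercase before lowercase.
print(sorted(['total', 'astro', 'Total', 'car']))  # ['Total', 'astro', 'car', 'total']
```

If a product name started with a letter after "t", or a customer name sorted after "Total", the labels would need to be chosen differently or the rows reordered explicitly.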

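@Scott Boston's answer is perfect and elegant. For reference, if you pivot by customer only and combine the two tables with pd.concat(), you get the following result: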

piv = df.pivot_table(index=['customer', 'product'], columns='week', values='qty', margins=True, aggfunc='sum')
piv3 = df.pivot_table(index=['customer'], columns='week', values='qty', margins=True, aggfunc='sum')
piv4 = pd.concat([piv, piv3], axis=0)

piv4
week          1    2  All
(A, astro)   10  300  310
(A, ball)    15   20   35
(A, car)     20  304  324
(B, astro)   40   23   63
(B, ball)    20   45   65
(B, car)     34   23   57
(All, )     139  715  854
A            45  624  669
B            94   91  185
All         139  715  854
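The tuple labels in this output appear to arise because `piv` has a two-level index while `piv3` has a single-level one, so the concatenated index is flat. One possible way to keep the hierarchy is to give the per-customer table a matching second level before concatenating. This is a sketch only: the inner label 'total' and the margins name 'Total' are assumptions borrowed from the first answer, chosen because the default 'All' would sort between 'A' and 'B'.

```python
import pandas as pd

sample_data = {'customer': ['A', 'A', 'A', 'B', 'B', 'B', 'A', 'A', 'A', 'B', 'B', 'B'],
               'product': ['astro', 'ball', 'car', 'astro', 'ball', 'car',
                           'astro', 'ball', 'car', 'astro', 'ball', 'car'],
               'week': [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
               'qty': [10, 15, 20, 40, 20, 34, 300, 20, 304, 23, 45, 23]}
df = pd.DataFrame(sample_data)

# Detail table: customer and product rows, plus the grand-total row and column.
piv = df.pivot_table(index=['customer', 'product'], columns='week', values='qty',
                     margins=True, margins_name='Total', aggfunc='sum')

# Per-customer table; drop its grand-total row because piv already has one.
piv3 = (df.pivot_table(index='customer', columns='week', values='qty',
                       margins=True, margins_name='Total', aggfunc='sum')
          .drop(index='Total'))

# Add a second index level so both tables share the same two-level index.
piv3.index = pd.MultiIndex.from_arrays(
    [piv3.index, ['total'] * len(piv3)], names=piv.index.names)

piv4 = pd.concat([piv, piv3]).sort_index()
print(piv4)
```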

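Hi Scott, simple and elegant solution, and it works well on my real DataFrame. Can you recommend some material that would help me understand in more depth the steps you took to solve the problem? Thank you very much, Clayton.

I learned these steps from answering questions on Stack Overflow. However, www.dunderdata.com has some great training material.

This is another possibility that I am glad to have just learned. Thank you very much.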
