Python: pivot table subtotals for every column
Is it possible to use pivot_table in pandas to produce the desired output (shown below) for this dataset, or for similar ones? I am trying something like:
pd.pivot_table(df, index=['region'], columns=['area', 'distributor', 'salesrep'],
               aggfunc='sum', margins=True).stack(['area', 'distributor', 'salesrep'])
but I only get subtotals for each region. If I move area from columns to index, I only get subtotals for each area.
Dataset:
region   area         distributor    salesrep  sales  invoice_count
Central  Butterworth  HIN MARKETING  TLS       500    25
Central  Butterworth  HIN MARKETING  TLS       500    25
Central  Butterworth  HIN MARKETING  OSE       500    25
Central  Butterworth  HIN MARKETING  OSE       500    25
Central  Butterworth  KWANG HENGG    TCS       500    25
Central  Butterworth  KWANG HENGG    TCS       500    25
Central  Butterworth  KWANG HENG     LBH       500    25
Central  Butterworth  KWANG HENG     LBH       500    25
Central  Ipoh         SGH EDERAN     CHAN      500    25
Central  Ipoh         SGH EDERAN     CHAN      500    25
Central  Ipoh         SGH EDERAN     KAMACHI   500    25
Central  Ipoh         SGH EDERAN     KAMACHI   500    25
Central  Ipoh         CORE SYN       LILIAN    500    25
Central  Ipoh         CORE SYN       LILIAN    500    25
Central  Ipoh         CORE SYN       TEOH      500    25
Central  Ipoh         CORE SYN       TEOH      500    25
East     JB           LEI WAH        NF05      500    25
East     JB           LEI WAH        NF05      500    25
East     JB           LEI WAH        NF06      500    25
East     JB           LEI WAH        NF06      500    25
East     JB           WONDER F&B     SEREN     500    25
East     JB           WONDER F&B     SEREN     500    25
East     JB           WONDER F&B     MONC      500    25
East     JB           WONDER F&B     MONC      500    25
East     PJ           PENGEDAR       NORM      500    25
East     PJ           PENGEDAR       NORM      500    25
East     PJ           PENGEDAR       SIMON     500    25
East     PJ           PENGEDAR       SIMON     500    25
East     PJ           HEBAT          OGI       500    25
East     PJ           HEBAT          OGI       500    25
East     PJ           HEBAT          MIGI      500    25
East     PJ           HEBAT          MIGI      500    25
Desired output:
region       area           distributor        salesrep             invoice_count  sales
Grand Total                                                         800            16000
Central      Central Total                                          400            8000
Central      Butterworth    Butterworth Total                       200            4000
Central      Butterworth    HIN MARKETING      HIN MARKETING Total  100            2000
Central      Butterworth    HIN MARKETING      OSE                  50             1000
Central      Butterworth    HIN MARKETING      TLS                  50             1000
Central      Butterworth    KWANG HENG         KWANG HENG Total     100            2000
Central      Butterworth    KWANG HENG         LBH                  50             1000
Central      Butterworth    KWANG HENG         TCS                  50             1000
Central      Ipoh           Ipoh Total                              200            4000
Central      Ipoh           CORE SYN           CORE SYN Total       100            2000
Central      Ipoh           CORE SYN           LILIAN               50             1000
Central      Ipoh           CORE SYN           TEOH                 50             1000
Central      Ipoh           SGH EDERAN         SGH EDERAN Total     100            2000
Central      Ipoh           SGH EDERAN         CHAN                 50             1000
Central      Ipoh           SGH EDERAN         KAMACHI              50             1000
East         East Total                                             400            8000
East         JB             JB Total                                200            4000
East         JB             LEI WAH            LEI WAH Total        100            2000
East         JB             LEI WAH            NF05                 50             1000
East         JB             LEI WAH            NF06                 50             1000
East         JB             WONDER F&B         WONDER F&B Total     100            2000
East         JB             WONDER F&B         MONC                 50             1000
East         JB             WONDER F&B         SEREN                50             1000
East         PJ             PJ Total                                200            4000
East         PJ             HEBAT              HEBAT Total          100            2000
East         PJ             HEBAT              MIGI                 50             1000
East         PJ             HEBAT              OGI                  50             1000
East         PJ             PENGEDAR           PENGEDAR Total       100            2000
East         PJ             PENGEDAR           NORM                 50             1000
East         PJ             PENGEDAR           SIMON                50             1000
I don't know how to get the subtotals inside the table itself, but if you run
df.pivot_table(index=['region', 'area', 'distributor', 'salesrep'],
               aggfunc='sum', margins=True)
you get
                                               invoice_count  sales
region   area         distributor    salesrep
Central  Butterworth  HIN MARKETING  OSE       50             1000
                                     TLS       50             1000
                      KWANG HENG     LBH       50             1000
                      KWANG HENGG    TCS       50             1000
         Ipoh         CORE SYN       LILIAN    50             1000
                                     TEOH      50             1000
                      SGH EDERAN     CHAN      50             1000
                                     KAMACHI   50             1000
East     JB           LEI WAH        NF05      50             1000
                                     NF06      50             1000
                      WONDER F&B     MONC      50             1000
                                     SEREN     50             1000
         PJ           HEBAT          MIGI      50             1000
                                     OGI       50             1000
                      PENGEDAR       NORM      50             1000
                                     SIMON     50             1000
All                                            800            16000
If you want totals based on, for example, region and area, you can run
df.pivot_table(index=['region', 'area'], values=['invoice_count', 'sales'],
               aggfunc='sum', margins=True)
which results in
                      invoice_count  sales
region   area
Central  Butterworth  200            4000
         Ipoh         200            4000
East     JB           200            4000
         PJ           200            4000
All                   800            16000
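For reference, current pandas releases spell the grouping keywords `index` and `columns`; the older `rows` and `cols` names are no longer accepted. A minimal, self-contained sketch follows, using a reduced four-row sample that is made up here for illustration and is not the full dataset from the question. Passing `values` explicitly is an assumption made here to keep text columns such as `distributor` out of the aggregation.

```python
import pandas as pd

# Reduced illustrative sample; NOT the full dataset from the question.
df = pd.DataFrame({
    "region": ["Central", "Central", "East", "East"],
    "area": ["Butterworth", "Ipoh", "JB", "PJ"],
    "distributor": ["HIN MARKETING", "CORE SYN", "LEI WAH", "HEBAT"],
    "sales": [500, 500, 500, 500],
    "invoice_count": [25, 25, 25, 25],
})

# One row per (region, area) plus a final "All" row from margins=True.
by_area = df.pivot_table(
    index=["region", "area"],
    values=["invoice_count", "sales"],
    aggfunc="sum",
    margins=True,
)
print(by_area)
```

The last row of `by_area` is the grand total. Per-level subtotals are still missing, which is the limitation described above.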
Instead of pivot_table, we can use groupby:
import numpy as np
import pandas as pd

def label(ser):
    return '{s} Total'.format(s=ser)

filename = 'data.txt'
df = pd.read_table(filename, delimiter='\t')
# Sum only the numeric columns so that text columns are not concatenated
values = ['invoice_count', 'sales']

total = pd.DataFrame({'region': ['Grand Total'],
                      'invoice_count': df['invoice_count'].sum(),
                      'sales': df['sales'].sum()})
total['total_rank'] = 1

region_total = df.groupby(['region'], as_index=False)[values].sum()
region_total['area'] = region_total['region'].apply(label)
region_total['region_rank'] = 1

area_total = df.groupby(['region', 'area'], as_index=False)[values].sum()
area_total['distributor'] = area_total['area'].apply(label)
area_total['area_rank'] = 1

dist_total = df.groupby(
    ['region', 'area', 'distributor'], as_index=False)[values].sum()
dist_total['salesrep'] = dist_total['distributor'].apply(label)

rep_total = df.groupby(
    ['region', 'area', 'distributor', 'salesrep'], as_index=False)[values].sum()

# UNION the DataFrames into one DataFrame
result = pd.concat([total, region_total, area_total, dist_total, rep_total])

# Replace NaNs with empty strings
result.fillna({'region': '', 'area': '', 'distributor': '', 'salesrep': ''},
              inplace=True)

# Reorder the rows (np.lexsort treats the LAST key as the primary sort key)
sorter = np.lexsort((
    result['distributor'].rank(),
    result['area_rank'].rank(),
    result['area'].rank(),
    result['region_rank'].rank(),
    result['region'].rank(),
    result['total_rank'].rank()))
result = result.take(sorter)
result = result.reindex(
    columns=['region', 'area', 'distributor', 'salesrep', 'invoice_count', 'sales'])
print(result.to_string(index=False))
which yields
region       area           distributor        salesrep             invoice_count  sales
Grand Total                                                         800            16000
Central      Central Total                                          400            8000
Central      Butterworth    Butterworth Total                       200            4000
Central      Butterworth    HIN MARKETING      HIN MARKETING Total  100            2000
Central      Butterworth    HIN MARKETING      OSE                  50             1000
Central      Butterworth    HIN MARKETING      TLS                  50             1000
Central      Butterworth    KWANG HENG         KWANG HENG Total     100            2000
Central      Butterworth    KWANG HENG         LBH                  50             1000
Central      Butterworth    KWANG HENG         TCS                  50             1000
Central      Ipoh           Ipoh Total                              200            4000
Central      Ipoh           CORE SYN           CORE SYN Total       100            2000
Central      Ipoh           CORE SYN           LILIAN               50             1000
Central      Ipoh           CORE SYN           TEOH                 50             1000
Central      Ipoh           SGH EDERAN         SGH EDERAN Total     100            2000
Central      Ipoh           SGH EDERAN         CHAN                 50             1000
Central      Ipoh           SGH EDERAN         KAMACHI              50             1000
East         East Total                                             400            8000
East         JB             JB Total                                200            4000
East         JB             LEI WAH            LEI WAH Total        100            2000
East         JB             LEI WAH            NF05                 50             1000
East         JB             LEI WAH            NF06                 50             1000
East         JB             WONDER F&B         WONDER F&B Total     100            2000
East         JB             WONDER F&B         MONC                 50             1000
East         JB             WONDER F&B         SEREN                50             1000
East         PJ             PJ Total                                200            4000
East         PJ             HEBAT              HEBAT Total          100            2000
East         PJ             HEBAT              MIGI                 50             1000
East         PJ             HEBAT              OGI                  50             1000
East         PJ             PENGEDAR           PENGEDAR Total       100            2000
East         PJ             PENGEDAR           NORM                 50             1000
East         PJ             PENGEDAR           SIMON                50             1000
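The same idea can also be written as a loop over the grouping levels, which avoids repeating one block per level. The following is only a sketch, not the code from the answer above: `with_subtotals`, its parameters, and the temporary `_sort_*` columns are hypothetical names, and it assumes every level column holds strings. Instead of the rank-based `np.lexsort`, it sorts on group keys padded with empty strings, so each subtotal row lands just ahead of the rows it summarises. The demo frame is a reduced sample, not the full dataset.

```python
import pandas as pd


def with_subtotals(df, levels, values, grand_label="Grand Total"):
    """Stack detail rows with a subtotal row for every prefix of `levels`."""
    sort_cols = [f"_sort_{i}" for i in range(len(levels))]
    pieces = []
    for depth in range(len(levels) + 1):
        keys = levels[:depth]
        if depth == 0:
            # Grand total: a single row, labelled in the first level column.
            part = pd.DataFrame([df[values].sum()])
            part[levels[0]] = grand_label
        else:
            part = df.groupby(keys, as_index=False)[values].sum()
            if depth < len(levels):
                # Put "<name> Total" in the next, otherwise empty, column.
                part[levels[depth]] = part[keys[-1]] + " Total"
        # Sort keys: the real group keys, padded with "" so that a subtotal
        # row sorts before the more detailed rows beneath it.
        for i, col in enumerate(sort_cols):
            part[col] = part[levels[i]] if i < depth else ""
        pieces.append(part)
    out = pd.concat(pieces, ignore_index=True)
    out[levels] = out[levels].fillna("")
    out = out.sort_values(sort_cols).drop(columns=sort_cols)
    return out[levels + values].reset_index(drop=True)


# Reduced illustrative sample; NOT the full dataset from the question.
demo = pd.DataFrame({
    "region": ["Central", "Central", "Central", "East"],
    "area": ["Butterworth", "Butterworth", "Ipoh", "JB"],
    "invoice_count": [25, 25, 25, 25],
    "sales": [500, 500, 500, 500],
})
result = with_subtotals(demo, ["region", "area"], ["invoice_count", "sales"])
print(result.to_string(index=False))
```

For the question's data the call would presumably be `with_subtotals(df, ['region', 'area', 'distributor', 'salesrep'], ['invoice_count', 'sales'])`.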
Thanks.