Python Pandas pivot table - generating additional columns by dividing merged columns
Tags: python, pandas, pivot, pivot-table, subset
I am trying to run the following function:
import numpy as np
import pandas as pd
from scipy import stats

def make_europe_view(data):
    data['% Rev'] = data.GrossRevenue_GBP / data.GrossRevenue_GBP.sum()
    # trimmed mean, cutting 10% from each tail
    tmean = lambda x: stats.trim_mean(x, 0.1)
    pivot = pd.pivot_table(data[(data['New_category_ID'] != 0) & (data['YYYY'] == 2016)],
                           index='New_category',
                           values=['GrossRevenue_GBP', 'MOVC_GBP', 'PM_GBP', '% Rev'],
                           aggfunc={'MOVC_GBP': tmean, 'PM_GBP': tmean,
                                    'GrossRevenue_GBP': [np.sum, tmean], '% Rev': np.sum})
    # the ValueError is most likely raised by this assignment
    pivot['% PM'] = pivot['PM_GBP']/pivot[('GrossRevenue_GBP')]['<lambda>']
    #pivot['% MOVC'] = pivot['MOVC_GBP']/Tmean_GR
    pivot['Country'] = 'EU'
    pivot['product_cat'] = pivot.index
    #pivot = pivot[['product_cat', '% Rev', 'GrossRevenue_GBP', 'MOVC_GBP', 'PM_GBP', '% PM', '% MOVC', 'Country']]
    return pivot
I would really appreciate your help.
Column names of pivot when running list():
[('GrossRevenue_GBP', '<lambda>'), ('GrossRevenue_GBP', 'sum'), ('% Rev', 'sum'), ('MOVC_GBP', '<lambda>'), ('PM_GBP', '<lambda>'), ('Country', ''), ('product_cat', '')]
You can use tuples to select values from the MultiIndex in the columns:
import pandas as pd

tups = [('GrossRevenue_GBP', '<lambda>'), ('GrossRevenue_GBP', 'sum'), ('% Rev', 'sum'), ('MOVC_GBP', '<lambda>'), ('PM_GBP', '<lambda>'), ('Country', ''), ('product_cat', '')]
idx = list('ab')
cols = pd.MultiIndex.from_tuples(tups)
pivot = pd.DataFrame([[7,4,5,8,4,5,1],
                      [1,5,7,3,9,6,7]], columns=cols, index=idx)
print(pivot)
  GrossRevenue_GBP     % Rev MOVC_GBP   PM_GBP Country product_cat
          <lambda> sum   sum <lambda> <lambda>
a                7   4     5        8        4       5           1
b                1   5     7        3        9       6           7
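Comments:
What is the sample data (pivot) after pd.pivot_table(…)?
@jezrael - I have added it above: the output of pivot when written to an Excel file (numbers sanitized).
Sorry, now I have time to answer. Please check it.
This is a much more elegant way, thank you very much for your help!
Glad I could help! Have a nice weekend!
Strange... when running the code I get the following error: AttributeError: 'numpy.ndarray' object has no attribute 'str'.
I have solved the problem - I had to move a piece of code before the mapping, and that fixed it. Thank you for your patience.
I know, I am online now and I am trying to answer your comment.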
The error raised by the function in the question:
ValueError: Wrong number of items passed 25, placement implies 1
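One plausible reading of this ValueError, sketched on a small made-up frame (the values and the three-column layout are assumptions, not the asker's data): selecting only the top level of a MultiIndex column returns a DataFrame, and dividing a DataFrame by a Series aligns the Series index against the DataFrame columns. The result then has many columns, which cannot be assigned to the single column '% PM'.

```python
import pandas as pd

# Hypothetical miniature of the pivot: two rows, MultiIndex columns.
tups = [('GrossRevenue_GBP', '<lambda>'), ('GrossRevenue_GBP', 'sum'),
        ('PM_GBP', '<lambda>')]
pivot = pd.DataFrame([[7, 4, 4], [1, 5, 9]],
                     columns=pd.MultiIndex.from_tuples(tups),
                     index=list('ab'))

# Selecting only the top level keeps the second level, so this is a DataFrame.
numerator = pivot['PM_GBP']
# Chained selection goes down to one column, so this is a Series.
denominator = pivot['GrossRevenue_GBP']['<lambda>']

# DataFrame / Series matches the Series index (a, b) against the DataFrame
# columns ('<lambda>'), so the result carries the union of both as columns.
ratio = numerator / denominator
print(type(numerator).__name__, type(denominator).__name__)
print(ratio.shape)
```

Here the ratio has one column per row label plus '<lambda>', all NaN. If the real pivot had 24 categories, the same mechanism would give 25 columns, which would match the message; that count is a guess.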
With the sample pivot, select each column by its full tuple and divide:
pivot['% PM'] = pivot[('PM_GBP','<lambda>')]/pivot[('GrossRevenue_GBP','<lambda>')]
print(pivot)
  GrossRevenue_GBP     % Rev MOVC_GBP   PM_GBP Country product_cat      % PM
          <lambda> sum   sum <lambda> <lambda>
a                7   4     5        8        4       5           1  0.571429
b                1   5     7        3        9       6           7  9.000000
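The same tuple selection should also cover the '% MOVC' column that is commented out in the question. A minimal sketch on the sample frame, assuming '% MOVC' is meant to be the trimmed-mean MOVC divided by the trimmed-mean gross revenue (the name gr_tmean is introduced here for illustration only):

```python
import pandas as pd

tups = [('GrossRevenue_GBP', '<lambda>'), ('GrossRevenue_GBP', 'sum'),
        ('% Rev', 'sum'), ('MOVC_GBP', '<lambda>'), ('PM_GBP', '<lambda>'),
        ('Country', ''), ('product_cat', '')]
pivot = pd.DataFrame([[7, 4, 5, 8, 4, 5, 1],
                      [1, 5, 7, 3, 9, 6, 7]],
                     columns=pd.MultiIndex.from_tuples(tups),
                     index=list('ab'))

# A full (level 0, level 1) tuple selects exactly one column, i.e. a Series.
gr_tmean = pivot[('GrossRevenue_GBP', '<lambda>')]

# Series / Series aligns on the row index, so each result is a single column.
pivot['% PM'] = pivot[('PM_GBP', '<lambda>')] / gr_tmean
pivot['% MOVC'] = pivot[('MOVC_GBP', '<lambda>')] / gr_tmean
print(pivot)
```

Storing the denominator once keeps the two ratio lines short and makes it obvious that both use the same column.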
Alternatively, rename the '<lambda>' level, flatten the MultiIndex and divide ordinary columns:
#rename columns by dict
pivot = pivot.rename(columns={'<lambda>':'tmean'})
#remove multiindex
pivot.columns = pivot.columns.map('_'.join).str.strip('_')
#simply divide
pivot['% PM'] = pivot['PM_GBP_tmean']/pivot['GrossRevenue_GBP_tmean']
print(pivot)
   GrossRevenue_GBP_tmean  GrossRevenue_GBP_sum  % Rev_sum  MOVC_GBP_tmean  \
a                       7                     4          5               8
b                       1                     5          7               3

   PM_GBP_tmean  Country  product_cat      % PM
a             4        5            1  0.571429
b             9        6            7  9.000000
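The AttributeError reported in the comments ('numpy.ndarray' object has no attribute 'str') suggests that in the commenter's pandas version the result of map was a NumPy array rather than an Index. This is a guess based only on the message. A sketch of a flattening step that does not rely on .str, using a plain list comprehension on the sample frame:

```python
import pandas as pd

tups = [('GrossRevenue_GBP', '<lambda>'), ('GrossRevenue_GBP', 'sum'),
        ('% Rev', 'sum'), ('MOVC_GBP', '<lambda>'), ('PM_GBP', '<lambda>'),
        ('Country', ''), ('product_cat', '')]
pivot = pd.DataFrame([[7, 4, 5, 8, 4, 5, 1],
                      [1, 5, 7, 3, 9, 6, 7]],
                     columns=pd.MultiIndex.from_tuples(tups),
                     index=list('ab'))

# Give the anonymous aggregation a readable label on the second level.
pivot = pivot.rename(columns={'<lambda>': 'tmean'})

# Each column label is a tuple of strings; join the parts and drop the
# trailing underscore left behind by an empty second level.
pivot.columns = ['_'.join(col).strip('_') for col in pivot.columns]

# With flat column names the division is an ordinary Series / Series.
pivot['% PM'] = pivot['PM_GBP_tmean'] / pivot['GrossRevenue_GBP_tmean']
print(list(pivot.columns))
```

Note that strip('_') also removes leading underscores, so this is only safe when no original column name starts or ends with one.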