Applying different aggregation functions when using pivot_table (Python, Python 3.x, pandas, DataFrame)


I have this sample:

import pandas as pd
import numpy as np
dic = {'name':
       ['j','c','q','j','c','q','j','c','q'],
       'foo or bar':['foo','bar','bar','bar','foo','foo','bar','foo','foo'], 
       'amount':[10,20,30, 20,30,40, 200,300,400]}
x = pd.DataFrame(dic)
x
pd.pivot_table(x, 
               values='amount', 
               index='name', 
               columns='foo or bar', 
               aggfunc=[np.mean, np.sum])
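It returns a table with the mean and the sum of amount for both bar and foo.

I only want two of those columns: the mean for bar and the sum for foo. Why can't I specify tuples in the aggfunc parameter, like this?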

pd.pivot_table(x, 
               values='amount', 
               index='name', 
               columns='foo or bar', 
               aggfunc=[(np.mean, 'bar'), (np.sum, 'foo')])

Is using .ix, as done here, the only option?

I don't think you can pass tuples to the aggfunc parameter, but you can do this:

In [259]: p = pd.pivot_table(x,
   .....:                values='amount',
   .....:                index='name',
   .....:                columns='foo or bar',
   .....:                aggfunc=[np.mean, np.sum])

In [260]: p
Out[260]:
           mean       sum
foo or bar  bar  foo  bar  foo
name
c            20  165   20  330
j           110   10  220   10
q            30  220   30  440

In [261]: p.columns = ['{0[0]}_{0[1]}'.format(col) if col[1] else col[0] for col in p.columns.tolist()]

In [262]: p.columns
Out[262]: Index(['mean_bar', 'mean_foo', 'sum_bar', 'sum_foo'], dtype='object')

In [264]: p[['mean_bar','sum_foo']]
Out[264]:
      mean_bar  sum_foo
name
c           20      330
j          110       10
q           30      440
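
As a variation on the session above, the two columns can also be picked straight from the two-level column index, before any renaming. This is only a minimal sketch: it assumes a pandas version that accepts the string names 'mean' and 'sum' in aggfunc (used here in place of np.mean and np.sum), and the names wanted and flat are made up for illustration.

```python
import pandas as pd

dic = {'name': ['j', 'c', 'q', 'j', 'c', 'q', 'j', 'c', 'q'],
       'foo or bar': ['foo', 'bar', 'bar', 'bar', 'foo', 'foo', 'bar', 'foo', 'foo'],
       'amount': [10, 20, 30, 20, 30, 40, 200, 300, 400]}
x = pd.DataFrame(dic)

# Same pivot as above, with string aggregation names (an assumption of this sketch)
p = pd.pivot_table(x,
                   values='amount',
                   index='name',
                   columns='foo or bar',
                   aggfunc=['mean', 'sum'])

# Each column label is an (aggregation, category) tuple,
# so a list of tuples selects exactly the wanted columns
wanted = p[[('mean', 'bar'), ('sum', 'foo')]]

# Optionally flatten the two-level labels into single strings
flat = wanted.copy()
flat.columns = ['_'.join(col) for col in flat.columns]
print(flat)
```

Selecting first and flattening afterwards keeps the rename limited to the columns that are actually kept.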

To do this the way the answer you linked to does, you need to create the appropriate columns first. You can do that as follows:

x['foo'] = x.loc[x['foo or bar'] == 'foo', 'amount']
x['bar'] = x.loc[x['foo or bar'] == 'bar', 'amount']

In [81]: x
Out[81]: 
   amount foo or bar name    foo    bar
0      10        foo    j   10.0    NaN
1      20        bar    c    NaN   20.0
2      30        bar    q    NaN   30.0
3      20        bar    j    NaN   20.0
4      30        foo    c   30.0    NaN
5      40        foo    q   40.0    NaN
6     200        bar    j    NaN  200.0
7     300        foo    c  300.0    NaN
8     400        foo    q  400.0    NaN
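
The NaN entries appear because each assignment only supplies values for the matching rows, and pandas aligns on the row index. The same helper columns could probably be built with Series.where, which makes the masking explicit. The following is a sketch under that assumption, and the loop variable cat is just an illustrative name:

```python
import pandas as pd

dic = {'name': ['j', 'c', 'q', 'j', 'c', 'q', 'j', 'c', 'q'],
       'foo or bar': ['foo', 'bar', 'bar', 'bar', 'foo', 'foo', 'bar', 'foo', 'foo'],
       'amount': [10, 20, 30, 20, 30, 40, 200, 300, 400]}
x = pd.DataFrame(dic)

# One helper column per category: where() keeps 'amount' on rows whose
# category matches and puts NaN everywhere else
for cat in x['foo or bar'].unique():
    x[cat] = x['amount'].where(x['foo or bar'] == cat)

print(x)
```

This scales to more than two categories without writing one assignment per category.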
Then you can use the following:

In [82]: x.pivot_table(values=['foo','bar'], index='name', aggfunc={'bar':np.mean, 'foo':np.sum})
Out[82]: 
        bar    foo
name              
c      20.0  330.0
j     110.0   10.0
q      30.0  440.0
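
If the helper columns feel like too much setup, a different route (not the pivot_table approach above, but a groupby-based alternative) is to filter each category, aggregate it with its own function, and join the results. This is a rough sketch; the mapping aggs and the names parts and result are hypothetical.

```python
import pandas as pd

dic = {'name': ['j', 'c', 'q', 'j', 'c', 'q', 'j', 'c', 'q'],
       'foo or bar': ['foo', 'bar', 'bar', 'bar', 'foo', 'foo', 'bar', 'foo', 'foo'],
       'amount': [10, 20, 30, 20, 30, 40, 200, 300, 400]}
x = pd.DataFrame(dic)

# Hypothetical mapping: which aggregation to apply to which category
aggs = {'bar': 'mean', 'foo': 'sum'}

# For each category, keep only its rows, aggregate 'amount' per name,
# and name the resulting Series after the category
parts = [
    x.loc[x['foo or bar'] == cat].groupby('name')['amount'].agg(func).rename(cat)
    for cat, func in aggs.items()
]

# Align the per-category results side by side on the 'name' index
result = pd.concat(parts, axis=1)
print(result)
```

On the sample data this should give the same numbers as the pivot_table call above, without adding columns to x.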