Python 分组与加权平均

Python 分组与加权平均,python,pandas,Python,Pandas,我有一个数据框: import pandas as pd import numpy as np df=pd.DataFrame.from_items([('STAND_ID',[1,1,2,3,3,3]),('Species',['Conifer','Broadleaves','Conifer','Broadleaves','Conifer','Conifer']), ('Height',[20,19,13,24,25,18]),('S

我有一个数据框:

import pandas as pd
import numpy as np

df=pd.DataFrame.from_items([('STAND_ID',[1,1,2,3,3,3]),('Species',['Conifer','Broadleaves','Conifer','Broadleaves','Conifer','Conifer']),
                             ('Height',[20,19,13,24,25,18]),('Stems',[1500,2000,1000,1200,1700,1000]),('Volume',[200,100,300,50,100,10])])

   STAND_ID      Species  Height  Stems  Volume
0         1      Conifer      20   1500     200
1         1  Broadleaves      19   2000     100
2         2      Conifer      13   1000     300
3         3  Broadleaves      24   1200      50
4         3      Conifer      25   1700     100
5         3      Conifer      18   1000      10
我想根据林分ID和物种进行分组,对高度和茎进行加权平均,体积作为重量和未堆叠

所以我试着:

newdf=df.groupby(['STAND_ID','Species']).agg({'Height':lambda x: np.average(x['Height'],weights=x['Volume']),
                                              'Stems':lambda x: np.average(x['Stems'],weights=x['Volume'])}).unstack()
这给了我一个错误:

builtins.KeyError:“高度”


如何修复此问题?

您的错误是因为无法使用
agg
执行多个系列/列操作。Agg以一个系列/列作为时间。让我们使用
apply
pd.concat

g = df.groupby(['STAND_ID','Species'])
newdf = pd.concat([g.apply(lambda x: np.average(x['Height'],weights=x['Volume'])), 
                   g.apply(lambda x: np.average(x['Stems'],weights=x['Volume']))], 
                   axis=1, keys=['Height','Stems']).unstack()
编辑更好的解决方案: 输出:

              Height                  Stems             
Species  Broadleaves    Conifer Broadleaves      Conifer
STAND_ID                                                
1               19.0  20.000000      2000.0  1500.000000
2                NaN  13.000000         NaN  1000.000000
3               24.0  24.363636      1200.0  1636.363636

这里似乎也有答案:
              Height                  Stems             
Species  Broadleaves    Conifer Broadleaves      Conifer
STAND_ID                                                
1               19.0  20.000000      2000.0  1500.000000
2                NaN  13.000000         NaN  1000.000000
3               24.0  24.363636      1200.0  1636.363636