Python: adding an average column to a MultiIndex DataFrame


I have a DataFrame

df

first        bar                 baz           
second       one       two       one       two 
A       0.487880 -0.487661 -1.030176  0.100813 
B       0.267913  1.918923  0.132791  0.178503
C       1.550526 -0.312235 -1.177689 -0.081596

I want to add an average column and then move the averages to the front:

df['Average'] = df.mean(level='second', axis='columns')  #ERROR HERE
cols = df.columns.tolist()
df = df[[cols[-1]] + cols[:-1]]

I get an error:

ValueError: Wrong number of items passed 2, placement implies 1

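Perhaps I could add each column to the average one at a time, like

df['Average', 'One'] = …

but that seems silly, especially since my real-life index is more complicated.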


I'm not sure what your target output is. Something like this?

df2 = df.mean(level='second', axis='columns')
df2.columns = pd.MultiIndex.from_tuples([('mean', col) for col in df2])
>>> df2
       mean          
        one       two
A -0.271148 -0.193424
B  0.200352  1.048713
C  0.186419 -0.196915

>>> pd.concat([df2, df], axis=1)
       mean                 bar                 baz          
        one       two       one       two       one       two
A -0.271148 -0.193424  0.487880 -0.487661 -1.030176  0.100813
B  0.200352  1.048713  0.267913  1.918923  0.132791  0.178503
C  0.186419 -0.196915  1.550526 -0.312235 -1.177689 -0.081596
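On recent pandas releases the `level` argument of `DataFrame.mean` is no longer available (it was deprecated in 1.3 and removed in 2.0), so the snippet above may fail before it ever reaches the concat step. Below is a minimal sketch of an equivalent, assuming the sample frame from the question is rebuilt by hand. The transposed groupby is a substitute chosen here, not the call used in the answer.

```python
import pandas as pd

# Rebuild the sample frame from the question (values copied from the post).
columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(
    [
        [0.487880, -0.487661, -1.030176, 0.100813],
        [0.267913, 1.918923, 0.132791, 0.178503],
        [1.550526, -0.312235, -1.177689, -0.081596],
    ],
    index=["A", "B", "C"],
    columns=columns,
)

# Transpose so the column labels become row labels, group them by the
# 'second' level, average each group, and transpose back.
df2 = df.T.groupby(level="second").mean().T

# Put the averages under a new top-level label, then place them in front.
df2.columns = pd.MultiIndex.from_product(
    [["mean"], df2.columns], names=df.columns.names
)
result = pd.concat([df2, df], axis=1)
print(result)
```

Because the averaged block is listed first in `pd.concat`, no separate column reordering is needed afterwards.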

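The error occurs because the mean operation returns a DataFrame (with two columns in this case). You then try to assign that result to a single column of the original DataFrame.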
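The shape mismatch is easy to check directly. This is a small sketch on a made-up frame (the `np.arange` values are purely illustrative), again using a transposed groupby as a stand-in for the `level=` argument. It also shows the column-by-column assignment the question mentions, which works because each target key receives exactly one column.

```python
import numpy as np
import pandas as pd

# Illustrative frame with the same column layout as in the question.
columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.arange(12.0).reshape(3, 4), index=list("ABC"), columns=columns)

# Averaging per label of 'second' yields one column per label: two columns here.
means = df.T.groupby(level="second").mean().T
print(means.shape)          # (3, 2)
print(list(means.columns))  # ['one', 'two']

# One target key per averaged column, so the number of items always matches.
for label in means.columns:
    df[("Average", label)] = means[label]
print(df.shape)             # (3, 6)
```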

Options: pandas.concat, or stack / unstack.

Not necessarily efficient, but tidy:

df.stack().assign(Average=df.mean(level='second', axis='columns').stack()).unstack()

first        bar                 baz             Average          
second       one       two       one       two       one       two
A       0.255301  0.286846  1.027024 -0.060594  0.641162  0.113126
B      -0.608509 -2.291201  0.675753 -0.416156  0.033622 -1.353679
C       2.714254 -1.330621 -0.099545  0.616833  1.307354 -0.356894
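Why this works: stacking moves the 'second' level into the row index, leaving one column per 'first' label, so an average across those columns is exactly the per-'second' average. Here is a sketch of the same idea on the question's frame, assuming a plain row-wise mean on the stacked frame in place of the `level=` argument (which newer pandas no longer accepts).

```python
import pandas as pd

# Sample frame from the question (values copied from the post).
columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(
    [
        [0.487880, -0.487661, -1.030176, 0.100813],
        [0.267913, 1.918923, 0.132791, 0.178503],
        [1.550526, -0.312235, -1.177689, -0.081596],
    ],
    index=["A", "B", "C"],
    columns=columns,
)

# After stacking, rows are (label, second) pairs and columns are 'bar', 'baz'.
stacked = df.stack()

# The row-wise mean averages 'bar' and 'baz' for each (label, second) pair.
# unstack() then moves 'second' back into the columns.
result = stacked.assign(Average=stacked.mean(axis=1)).unstack()
print(result)
```

The new block ends up last; it can be moved to the front by selecting the columns in the desired order.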

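Could you provide the code that generates this DataFrame?

I'm not sure I understand the error 100%. Each of the other level='first' entries has two columns; I thought I was inserting at that level.

You are not inserting. level='first' means you are grouping the items of that level (e.g. 'bar' and 'baz') and, in this case, taking their mean. As I tried to demonstrate above, the result is simply a DataFrame, which uses level='second' to avoid confusion with the example in the comments.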

df.join(pd.concat([df.mean(level='second', axis='columns')], axis=1, keys=['Average']))

first        bar                 baz             Average          
second       one       two       one       two       one       two
A       0.255301  0.286846  1.027024 -0.060594  0.641162  0.113126
B      -0.608509 -2.291201  0.675753 -0.416156  0.033622 -1.353679
C       2.714254 -1.330621 -0.099545  0.616833  1.307354 -0.356894
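The join puts the Average block last, while the question asked for it in front. A possible follow-up is sketched below, under the assumption that the averages are computed with a transposed groupby (a substitute for `df.mean(level=...)` on newer pandas); `front` and `rest` are illustrative names.

```python
import pandas as pd

# Sample frame from the question (values copied from the post).
columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(
    [
        [0.487880, -0.487661, -1.030176, 0.100813],
        [0.267913, 1.918923, 0.132791, 0.178503],
        [1.550526, -0.312235, -1.177689, -0.081596],
    ],
    index=["A", "B", "C"],
    columns=columns,
)

# Per-'second' averages, wrapped under a new top-level key via keys=.
means = df.T.groupby(level="second").mean().T
averaged = pd.concat([means], axis=1, keys=["Average"])

# Join on the row index, then list the Average columns first.
joined = df.join(averaged)
front = [col for col in joined.columns if col[0] == "Average"]
rest = [col for col in joined.columns if col[0] != "Average"]
result = joined[front + rest]
print(result)
```

Building the order from the column tuples keeps this independent of how many labels the real index has.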
