Python pandas MultiIndex selection

I'm having trouble understanding MultiIndex selection in pandas.

                    0  1  2  3
first second third            
C     one    mean   3  4  2  7
             std    4  1  7  7
      two    mean   3  1  4  7
             std    5  6  7  0
      three  mean   7  0  2  5
             std    7  3  7  1
H     one    mean   2  4  3  3
             std    5  5  3  5
      two    mean   5  7  0  6
             std    0  1  0  2
      three  mean   5  2  5  1
             std    9  0  4  6
V     one    mean   3  7  3  9
             std    8  7  9  3
      two    mean   1  9  9  0
             std    1  1  5  1
      three  mean   3  1  0  6
             std    6  2  7  4

I need to create new rows:

- 'CH' : ['CH',:,'mean'] => ['C',:,'mean'] - ['H',:,'mean']
- 'CH' : ['CH',:,'std'] => (['C',:,'std']**2 + ['H',:,'std']**2)**.5

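When I try to select the rows, I get several kinds of errors, for example: UnsortedIndexError: 'MultiIndex Slicing requires the index to be fully lexsorted tuple len (3), lexsort depth (1)'

How should I do this?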

import pandas as pd
import numpy as np
iterables = [['C', 'H', 'V'],
          ['one','two','three'],
          ['mean','std']]
midx = pd.MultiIndex.from_product(iterables, names=['first', 'second','third'])
chv = pd.DataFrame(np.random.randint(0,high=10,size=(18,4)), index=midx)
print(chv)
idx = pd.IndexSlice
# one of my failing attempts at selecting the 'C' / 'mean' rows
chv.loc[:,idx['C',:,'mean']]

You can filter first, then rename the first level and apply the arithmetic, and finally concat everything together:

# sort the index first to avoid UnsortedIndexError
chv = chv.sort_index()

idx = pd.IndexSlice
c1 = chv.loc[idx['C',:,'mean'], :].rename({'C':'CH'}, level=0)
h1 = chv.loc[idx['H',:,'mean'], :].rename({'H':'CH'}, level=0)
ch1 = c1 - h1

c2 = chv.loc[idx['C',:,'std'], :].rename({'C':'CH'}, level=0)**2
h2 = chv.loc[idx['H',:,'std'], :].rename({'H':'CH'}, level=0)**2
ch2 = (c2 + h2)**.5

df = pd.concat([chv, ch1, ch2]).sort_index()
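
For reference, the steps above can be put together as a self-contained sketch. The seeded generator, the `pick` helper and the names `ch_mean`, `ch_std` and `result` are my own assumptions for illustration and are not part of the original answer. The sketch also checks `is_monotonic_increasing` before and after `sort_index()`, since an unsorted index is what appears to trigger the `UnsortedIndexError` in the question.

```python
import numpy as np
import pandas as pd

# Seeded generator so the sketch is reproducible (the seed is an arbitrary choice).
rng = np.random.default_rng(0)

iterables = [['C', 'H', 'V'], ['one', 'two', 'three'], ['mean', 'std']]
midx = pd.MultiIndex.from_product(iterables, names=['first', 'second', 'third'])
chv = pd.DataFrame(rng.integers(0, 10, size=(18, 4)), index=midx)

# from_product keeps the order 'one', 'two', 'three', which is not
# lexicographic, so the index starts out unsorted.
sorted_before = chv.index.is_monotonic_increasing
chv = chv.sort_index()
sorted_after = chv.index.is_monotonic_increasing

idx = pd.IndexSlice


def pick(first, third):
    """Hypothetical helper: rows for one (first, third) pair, relabelled 'CH' on level 0."""
    return chv.loc[idx[first, :, third], :].rename({first: 'CH'}, level=0)


# Both operands carry the same ('CH', second, third) labels, so the
# arithmetic aligns row by row.
ch_mean = pick('C', 'mean') - pick('H', 'mean')
ch_std = (pick('C', 'std') ** 2 + pick('H', 'std') ** 2) ** 0.5

result = pd.concat([chv, ch_mean, ch_std]).sort_index()
print(result.loc['CH'])
```

Renaming both operands to 'CH' before the arithmetic is what lets pandas align them; without it, the 'C' and 'H' labels would not match and the result would be all NaN.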

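Great solution! In my real case, though, it did not work as is: I had to change the call to include columns=, i.e. rename(columns={'C':'CH'}, level=0). In the debugger, by contrast, it worked fine.

@Guido - Perhaps your real DataFrame has a MultiIndex in the columns. In my solution,

.rename({'C':'CH'}, level=0)

is the same as

.rename(index={'C':'CH'}, level=0)

so for a MultiIndex in the columns you do need columns=, as in your comment.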
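
A small sketch of the point made in this comment, using a made-up four-row frame (`rows_df`, `cols_df` and the column name 'x' are illustrative assumptions, not from the question): a positional mapper passed to `rename` applies to the row index, while a MultiIndex on the columns needs `columns=`.

```python
import pandas as pd

idx_rows = pd.MultiIndex.from_product([['C', 'H'], ['mean', 'std']])
rows_df = pd.DataFrame({'x': [1, 2, 3, 4]}, index=idx_rows)

# A positional mapper targets the row index, so these two calls are equivalent.
a = rows_df.rename({'C': 'CH'}, level=0)
b = rows_df.rename(index={'C': 'CH'}, level=0)
same = a.index.equals(b.index)

# With the MultiIndex on the columns (here simply the transposed frame),
# the mapper has to be passed as columns=.
cols_df = rows_df.T
c = cols_df.rename(columns={'C': 'CH'}, level=0)
col_labels = list(c.columns.get_level_values(0))
print(same, col_labels)
```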

print (df)
                           0         1         2         3
first second third                                        
C     one    mean   7.000000  5.000000  8.000000  3.000000
             std    0.000000  4.000000  4.000000  4.000000
      three  mean   4.000000  2.000000  1.000000  6.000000
             std    8.000000  7.000000  3.000000  3.000000
      two    mean   1.000000  8.000000  2.000000  5.000000
             std    2.000000  2.000000  4.000000  2.000000
CH    one    mean   1.000000  2.000000  1.000000  2.000000
             std    4.000000  7.211103  4.000000  7.211103
      three  mean   1.000000  0.000000 -4.000000  2.000000
             std    8.062258  7.071068  4.242641  3.000000
      two    mean  -1.000000  6.000000 -2.000000  3.000000
             std    9.219544  7.280110  4.123106  2.000000
H     one    mean   6.000000  3.000000  7.000000  1.000000
             std    4.000000  6.000000  0.000000  6.000000
      three  mean   3.000000  2.000000  5.000000  4.000000
             std    1.000000  1.000000  3.000000  0.000000
      two    mean   2.000000  2.000000  4.000000  2.000000
             std    9.000000  7.000000  1.000000  0.000000
V     one    mean   9.000000  5.000000  0.000000  5.000000
             std    7.000000  9.000000  1.000000  1.000000
      three  mean   3.000000  0.000000  3.000000  4.000000
             std    1.000000  4.000000  9.000000  2.000000
      two    mean   3.000000  6.000000  3.000000  2.000000
             std    1.000000  3.000000  1.000000  4.000000