Python: selecting specific lower-level columns in a MultiIndex DataFrame

python pandas

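I don't know whether this has already been answered, but I couldn't find a similar question.

I have a MultiIndex DataFrame with two levels of columns, for example: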

import numpy as np
import pandas as pd

# Two parallel arrays define the two column levels.
arrays = [np.array(['bar', 'bar', 'bar', 'foo', 'foo', 'foo', 'qux', 'qux', 'qux']),
          np.array(['one', 'two', 'three', 'one', 'two', 'three', 'one', 'two', 'three'])]

df = pd.DataFrame(np.random.randn(3, 9), columns=arrays)
print(df)

        bar                           foo                           qux  \
        one       two     three       one       two     three       one   
0  1.255724 -0.692387 -1.485324  2.265736  0.494645  1.973369 -0.326260   
1 -0.903874  0.695460 -0.950076  0.181590 -2.345611  1.288061  0.980166   
2 -0.294882  1.034745  1.423288 -0.895625 -0.847338  0.470444  0.373579   


        two     three  
0  0.136427 -0.136479  
1  0.702732 -1.894376  
2  0.506240 -0.456519

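For each first-level column, I want to select a specific set of second-level columns.

For example, I would like to get something like this: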

        bar                 foo       qux          
        one       two       two       one     three
0  1.255724 -0.692387  0.494645 -0.326260 -0.136479
1 -0.903874  0.695460 -2.345611  0.980166 -1.894376
2 -0.294882  1.034745 -0.847338  0.373579 -0.456519

I have seen this, but it is not what I want to achieve.

Right now I am doing it this way:

level0 = ['bar', 'foo', 'qux']
level1 = [['one', 'two'], ['two'], ['one', 'three']]

# Select the wanted sub-columns for each first-level label, then join the pieces.
df_list = []
for i, value in enumerate(level0):
    df_list.append(df.loc[:, (value, level1[i])])
new_df = pd.concat(df_list, axis=1)
print(new_df)

But this doesn't seem like the best solution to me.

Is there a better (more idiomatic pandas) way to do this?

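Instead of concatenating the data, you can build the list of column labels first and then select them all in one indexing step: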

cols = pd.concat([pd.DataFrame({'level_0':x, 'level_1':y}) 
                  for x,y in zip(level0,level1)]
                ).values

df[cols]

Output:

        bar                 foo       qux          
        one       two       two       one     three
0  0.729061 -0.876547  0.312557  0.736568  0.250469
1  0.619194  0.451023  0.803252 -1.636403 -0.854607
2  0.254690 -1.054859 -1.223274  0.398411 -1.448396
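
The selection above depends on pandas reading each row of the 2-D `cols` array as a column tuple. Below is a minimal sketch of the same idea that converts the rows to explicit tuples first. The seeded random frame and the names `pairs` and `wanted` are assumptions made for illustration, so the values will differ from the output shown above.

```python
import numpy as np
import pandas as pd

# Rebuild a frame with the same two-level column layout as in the question.
columns = pd.MultiIndex.from_product([["bar", "foo", "qux"], ["one", "two", "three"]])
rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
df = pd.DataFrame(rng.standard_normal((3, 9)), columns=columns)

level0 = ["bar", "foo", "qux"]
level1 = [["one", "two"], ["two"], ["one", "three"]]

# One small frame per first-level label; the scalar label is broadcast
# against the list of second-level labels.
pairs = pd.concat(
    [pd.DataFrame({"level_0": x, "level_1": y}) for x, y in zip(level0, level1)]
)

# Turn each row into a plain (level_0, level_1) tuple.
wanted = list(pairs.itertuples(index=False, name=None))

selected = df[wanted]
print(selected.columns.tolist())
```

Passing a list of tuples states the intent directly and keeps the columns in the order requested.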

Alternatively, build the column tuples with a list comprehension and pass them to .loc:

from itertools import product

df.loc[:, [(l0, l1) for x, y in zip(level0, level1) for l0, l1 in product([x], y)]]
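
If this kind of selection is needed in several places, the comprehension approach can be wrapped in a small helper. The sketch below is only one possible shape: the function name `select_sublevels` and the dict-based `wanted` argument are hypothetical and not part of pandas or of the answers above.

```python
import numpy as np
import pandas as pd


def select_sublevels(frame, wanted):
    """Return the columns of `frame` named by `wanted`.

    `wanted` maps each first-level label to the list of second-level
    labels to keep under it (hypothetical interface, for illustration).
    """
    pairs = [(top, sub) for top, subs in wanted.items() for sub in subs]
    return frame.loc[:, pairs]


# Example frame with the same column layout as in the question.
columns = pd.MultiIndex.from_product([["bar", "foo", "qux"], ["one", "two", "three"]])
df = pd.DataFrame(np.arange(27).reshape(3, 9), columns=columns)

result = select_sublevels(
    df, {"bar": ["one", "two"], "foo": ["two"], "qux": ["one", "three"]}
)
print(result)
```

A dict keeps each first-level label next to its sub-columns, which avoids having to keep two parallel lists in sync.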