Python: selecting MultiIndex columns by a list of labels from a given level


If I create a DataFrame with MultiIndex columns like this:

import numpy as np
import pandas as pd

iterables = [['bar', 'baz', 'foo', 'qux'], ['one', 'two']]
index = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(3, 8), index=['A', 'B', 'C'], columns=index)


first        bar                 baz                 foo                 qux  \
second       one       two       one       two       one       two       one   
A      -0.119687 -0.518318  0.113920 -1.028505  1.106375 -1.020139 -0.039300   
B       0.123480 -2.091120  0.464597 -0.147211 -0.489895 -1.090659 -0.592679   
C      -1.174376  0.282011 -0.197658 -0.030751  0.117374  1.591109  0.796908   

first             
second       two  
A      -0.938209  
B      -0.851483  
C       0.442621

I want to select columns by their label in the first level only, using a list:

select_cols = ['bar', 'qux']

so that the result is:

first        bar                  qux  
second       one       two        one        two
A      -0.119687 -0.518318  -0.039300  -0.938209    
B       0.123480 -2.091120  -0.592679  -0.851483    
C      -1.174376  0.282011   0.796908   0.442621

How can I do this? (Thanks in advance.)

You can use loc to select the columns:

df.loc[:, ["bar", "qux"]]

#  first       bar                    qux
# second       one        two         one         two
#      A  1.245525  -1.469999   -0.399174    0.017094
#      B -0.242284   0.835131   -0.400847   -0.344612
#      C -1.067006  -1.880113   -0.516234   -0.410847
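The frame is built from random numbers, so the values printed in this answer differ from those in the question. Below is a minimal sketch of the same selection with a seeded generator (the seed `0` is an arbitrary choice, not part of the original post). It also shows an equivalent boolean mask built from the values of level `first`, which may be handy when the level is chosen by name.

```python
import numpy as np
import pandas as pd

iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]]
columns = pd.MultiIndex.from_product(iterables, names=["first", "second"])

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
df = pd.DataFrame(rng.standard_normal((3, 8)), index=["A", "B", "C"], columns=columns)

select_cols = ["bar", "qux"]

# Label-based selection on the outermost column level.
by_loc = df.loc[:, select_cols]

# Equivalent boolean mask built from the values of level "first".
mask = df.columns.get_level_values("first").isin(select_cols)
by_mask = df.loc[:, mask]

print(by_loc.columns.tolist())
```

The mask form returns columns in their original order, while the list form follows the order of the list, so the two agree here only because `select_cols` is already in column order.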

Plain column selection also works:

df[['bar', 'qux']]

# first        bar                 qux          
# second       one       two       one       two
# A       0.651522  0.480115 -2.924574  0.616674
# B      -0.395988  0.001643  0.358048  0.022727
# C      -0.317829  1.400970 -0.773148  1.549135
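Plain `[]` selection with a list of strings matches labels in the outermost level. For a list of labels from an inner level, one option is `pd.IndexSlice` together with `loc`. This is a sketch assuming the same column layout as in the question; `select_second` is a hypothetical name introduced here for illustration.

```python
import numpy as np
import pandas as pd

iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]]
columns = pd.MultiIndex.from_product(iterables, names=["first", "second"])
df = pd.DataFrame(np.random.randn(3, 8), index=["A", "B", "C"], columns=columns)

select_second = ["two"]  # hypothetical list of labels from level "second"

idx = pd.IndexSlice

# idx[:, ...] keeps every label of level "first" and filters level "second".
inner = df.loc[:, idx[:, select_second]]

# Filter both levels at once: "first" in {bar, qux}, "second" in {two}.
both = df.loc[:, idx[["bar", "qux"], select_second]]

print(inner.columns.tolist())
print(both.columns.tolist())
```

Slicing this way expects the column MultiIndex to be lexsorted, which holds here because `from_product` was given sorted inputs.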

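When I found this Q/A, I thought I might see a solution for printing the column names. Having figured it out, I thought I would add it to the answers. The following prints the column labels at a given level: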

df.columns.get_level_values(0)

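=> ['bar', 'bar', 'qux', 'qux']   (on the selected frame: one label per column; append .unique() to get ['bar', 'qux'])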

-E
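As a small illustration of the call above: `get_level_values` returns one label per column, so labels repeat when a first-level group has several sub-columns. The sketch below, which assumes the frame from the question, contrasts the raw result with the distinct labels obtained through `unique()`.

```python
import numpy as np
import pandas as pd

iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]]
columns = pd.MultiIndex.from_product(iterables, names=["first", "second"])
df = pd.DataFrame(np.random.randn(3, 8), index=["A", "B", "C"], columns=columns)

selected = df[["bar", "qux"]]

# One label per column, so each first-level label appears twice.
per_column = selected.columns.get_level_values(0).tolist()

# Distinct labels, in order of first appearance.
distinct = selected.columns.get_level_values(0).unique().tolist()

print(per_column)  # ['bar', 'bar', 'qux', 'qux']
print(distinct)    # ['bar', 'qux']
```

Note that `selected.columns.levels[0]` can still list labels that are no longer in use after the selection; `remove_unused_levels()` trims them if that matters.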

Comments:

I get the error KeyError: "[['bar', 'qux']] not in all of the [columns]".

@iparjono Any further details or a demonstration of why it does not work? It works for me on v0.18.1.

@iparjono This might be a version issue. I am also on 0.18.1 and it works fine.