Python: selecting multiple columns with xs() fails

python pandas

I have the following multi-indexed time series data:

first                001                                               \
second              open     high      low    close jdiff_vol   value   
date     time                                                           
20150721 90100   2082.18  2082.18  2082.18  2082.18     11970   99466   
         90200   2082.72  2083.01  2082.18  2083.01      4886   40108   
         90300   2083.68  2084.20  2083.68  2083.98      6966   48847   
         90400   2083.63  2084.21  2083.63  2084.00      6817   48020   
         90500   2084.03  2084.71  2083.91  2084.32     10193   58399   
20150721 90100   2084.14  2084.22  2083.59  2083.65      7860   39128   
         90200   2084.08  2084.08  2083.47  2083.50      7171   39147   
         90300   2083.25  2083.65  2083.08  2083.60      4549   34373   
         90400   2084.06  2084.06  2083.66  2083.80      6980   38088   
         90500   2083.61  2084.04  2083.27  2083.89      5292   33466

The following code works:

opens = data.xs('open', level='second', axis=1, drop_level=True)

However, selecting multiple columns with the code below fails:

opens = data.xs(('open','close'), level='second', axis=1, drop_level=True)

How can I modify it to select multiple columns?

I could not get xs to work for this, but there is an alternative that requires the columns to be sorted first.
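The tool named in the sentence above did not survive in the text. One option that fits the description is label slicing with pd.IndexSlice on a sorted column index. This is an assumption, not something the source confirms, and the frame below is a hypothetical stand-in for the question's data:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame: two instruments, four fields.
columns = pd.MultiIndex.from_product(
    [["001", "002"], ["open", "high", "low", "close"]],
    names=["first", "second"],
)
data = pd.DataFrame(np.arange(16.0).reshape(2, 8), columns=columns)

# Label-based slicing on a MultiIndex generally expects a sorted axis,
# so sort the columns before slicing.
data = data.sort_index(axis=1)

# Keep every label on the first level, but only 'open' and 'close' on the second.
idx = pd.IndexSlice
subset = data.loc[:, idx[:, ["open", "close"]]]
print(subset)
```

Unlike xs with drop_level=True, this keeps both column levels, so it stays visible which column is the open and which is the close.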

So far, the pandas xs() function cannot take two column keys from the same level. It only accepts keys from two different levels:

opens = data.xs(('001', 'close'), level=('first', 'second'), axis=1, drop_level=True)

However, that is not what you want. Another solution is to call xs() twice and concatenate the results:

df_xs = pd.concat([df.xs('open', level='second', axis=1, drop_level=True), df.xs('close', level='second', axis=1, drop_level=True)])

Here is a complete example. First, create a DataFrame:

import pandas as pd
import numpy as np

arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(6, 6), index=index[:6], columns=index[:6])

print(df)

first              bar                 baz                 foo          
second             one       two       one       two       one       two
first second                                                            
bar   one     0.699065 -0.283550  0.072595 -0.699627  0.879832 -1.787520
      two    -1.172970  1.381607  1.941370  0.577451 -0.182819  0.215879
baz   one     0.669402 -0.018534  0.775114  1.277079  0.404116 -2.450712
      two     0.066530 -0.509366  1.249981  2.426217  0.409881 -0.178713
foo   one     1.098217  0.399427 -1.423057 -1.261542  1.668202  0.187629
      two     0.827283  0.974239 -1.944796  0.266321  0.700679 -0.371074

You can then combine xs() with concat. Because pd.concat defaults to axis=0, the two selections are stacked row-wise:

df_xs = pd.concat([df.xs('one', level='second', axis=1, drop_level=True), df.xs('two', level='second', axis=1, drop_level=True)])
print(df_xs)

first              bar       baz       foo
first second                              
bar   one     0.699065  0.072595  0.879832
      two    -1.172970  1.941370 -0.182819
baz   one     0.669402  0.775114  0.404116
      two     0.066530  1.249981  0.409881
foo   one     1.098217 -1.423057  1.668202
      two     0.827283 -1.944796  0.700679
bar   one    -0.283550 -0.699627 -1.787520
      two     1.381607  0.577451  0.215879
baz   one    -0.018534  1.277079 -2.450712
      two    -0.509366  2.426217 -0.178713
foo   one     0.399427 -1.261542  0.187629
      two     0.974239  0.266321 -0.371074
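In the output above the rows from 'one' and 'two' share the same column labels, so the second-level label is lost. If the selections should sit side by side instead, a minimal sketch is to pass axis=1 and keys to pd.concat. The small frame and the name wanted are illustrative assumptions, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Small illustrative frame; the second level has one label we do not want.
columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two", "three"]], names=["first", "second"]
)
df = pd.DataFrame(np.arange(12).reshape(2, 6), columns=columns)

wanted = ["one", "two"]
pieces = [df.xs(key, level="second", axis=1, drop_level=True) for key in wanted]

# axis=1 places the pieces side by side; keys restores the level that xs dropped.
df_xs = pd.concat(pieces, axis=1, keys=wanted, names=["second"])

# Put the levels back in (first, second) order and group columns by first level.
df_xs = df_xs.swaplevel(axis=1).sort_index(axis=1)
print(df_xs)
```

The keys argument is what keeps the two selections distinguishable after concatenation.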

For example:

df = pd.DataFrame(
    [[1,2,3,4,5,6,7,8]],
    columns=pd.MultiIndex.from_product([['A','B'], ['a', 'b', 'c', 'd']])
)

Out:
A               B
a   b   c   d   a   b   c   d
1   2   3   4   5   6   7   8

We want to select columns a and b:

Out:
A       B
a   b   a   b
1   2   5   6

Solution 1: positive selection (same idea as jezrael's). Build a boolean mask that marks the wanted columns and select with it:

select = df.columns.get_level_values(1).isin(['a', 'b'])
df.loc[:, select]
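Applied to data shaped like the question's, the same mask technique might look like the sketch below. The frame is a made-up stand-in, and the level is addressed by its name 'second' rather than by position:

```python
import pandas as pd

# Made-up stand-in for the question's columns (instrument code, field name).
columns = pd.MultiIndex.from_product(
    [["001", "002"], ["open", "high", "low", "close"]],
    names=["first", "second"],
)
data = pd.DataFrame([[1, 2, 3, 4, 5, 6, 7, 8]], columns=columns)

# True wherever the 'second' level label is one of the wanted fields.
mask = data.columns.get_level_values("second").isin(["open", "close"])

# A boolean mask keeps the original column order and both index levels.
prices = data.loc[:, mask]
print(prices)
```

No sorting is needed here, because the mask is positional rather than a label slice.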

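Solution 2: negative selection. Instead of selecting the columns of interest, it can be more convenient to remove the unwanted ones with drop, which can delete several columns in one call.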

To select a and b, drop c and d:

df.drop(['c', 'd'], level=1, axis=1)
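When the unwanted labels are not known in advance, they can be derived from the wanted ones. The sketch below does that on a made-up stand-in for the question's frame and compares the result with the mask-based selection. The names wanted, unwanted, dropped and masked are illustrative:

```python
import pandas as pd

# Made-up stand-in for the question's columns (instrument code, field name).
columns = pd.MultiIndex.from_product(
    [["001", "002"], ["open", "high", "low", "close"]],
    names=["first", "second"],
)
data = pd.DataFrame([[1, 2, 3, 4, 5, 6, 7, 8]], columns=columns)

wanted = ["open", "close"]
second = data.columns.get_level_values("second")

# Every second-level label that is not wanted gets dropped.
unwanted = [label for label in second.unique() if label not in wanted]

dropped = data.drop(columns=unwanted, level="second")
masked = data.loc[:, second.isin(wanted)]

# Both routes should leave the same columns in the same order.
print(dropped.columns.tolist() == masked.columns.tolist())
```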

You seem to be missing a comma:

opens = data.xs(('open', 'close'), level='second', axis=1, drop_level=True)

Is this a typo?

@EdChum Thanks for your comment. I fixed the typo.
