Python: KeyError when slicing with a MultiIndex

python python-3.x pandas dataframe

Although I was able to work around this problem, I would like to understand why the error occurs. The DataFrame:

import pandas as pd
import itertools

sl_df=pd.DataFrame(
    data=list(range(18)), 
    index=pd.MultiIndex.from_tuples(
        list(itertools.product(
            ['A','B','C'],
            ['I','II','III'],
            ['x','y']))),
    columns=['one'])

A simple slice that works:

sl_df.loc[pd.IndexSlice['A',:,'x']]

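Output: the rows (A, I, x), (A, II, x) and (A, III, x), with values 0, 2 and 4 (printed in full at the end of this post).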

The part that raises the error:

sl_df.loc[pd.IndexSlice[:,'II']]

Output: KeyError: u'the label [II] is not in the [columns]' (full traceback at the end of this post).

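Question: why do I have to specify ':' on axis 1 only when ':' is used on the first level of the MultiIndex? Would you agree it is a bit odd that this works on the other levels but not on the first level (see the simple slice above, which works)?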

Because the MultiIndex on such a DataFrame looks like this:

[(A, I, x), (A, I, y), ..., (C, III, x), (C, III, y)]
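A minimal sketch to confirm that structure. It rebuilds the question's DataFrame, using pd.MultiIndex.from_product as shorthand for the from_tuples/itertools.product combination (that substitution is an assumption of this sketch, not something from the original post):

```python
import pandas as pd

# Same three-level index as in the question, built with from_product.
index = pd.MultiIndex.from_product(
    [['A', 'B', 'C'], ['I', 'II', 'III'], ['x', 'y']])
sl_df = pd.DataFrame({'one': list(range(18))}, index=index)

print(sl_df.index.nlevels)    # number of levels in the row index
print(list(sl_df.index[:2]))  # first two row labels, each a 3-tuple
print(sl_df.shape)            # (rows, columns)
```

Every row label is a 3-tuple, while the DataFrame itself has only two axes. That mismatch is what makes a two-element key open to two readings.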

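Judging from the current version of the pandas documentation, indexing with slicers appears to require specifying both axes in the .loc method. The rationale is that if both axes are not specified, it can be ambiguous which axis is being selected on.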

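I don't know how the pandas internals work, but in your particular case, when you write sl_df.loc[pd.IndexSlice[:,'II']], the : is dispatched to the row axis (i.e. select all rows) and 'II' is sent to the columns, hence the error:

KeyError: u'the label [II] is not in the [columns]'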

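If you are implying that the index has 3 levels: pd.IndexSlice['A','I'] works, but pd.IndexSlice[:,'I'] does not. The issue, as I understand it, is that the latter does work if we write df.loc[pd.IndexSlice[:,'I'],:]. Explicitly asking for everything on axis 1 (columns) fixes the problem.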
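The behaviour described above can be reproduced with a short sketch. It assumes the same DataFrame as in the question (rebuilt here with pd.MultiIndex.from_product for brevity); the variable names raised and fixed are illustrative only:

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [['A', 'B', 'C'], ['I', 'II', 'III'], ['x', 'y']])
sl_df = pd.DataFrame({'one': list(range(18))}, index=index)

# Two-element key on a two-axis object: .loc reads it as
# (row indexer, column indexer), so 'II' is looked up among the columns.
try:
    sl_df.loc[pd.IndexSlice[:, 'II']]
    raised = False
except KeyError:
    raised = True

# With an explicit column indexer, the whole slicer is the row key.
fixed = sl_df.loc[pd.IndexSlice[:, 'II'], :]

print(raised)
print(fixed['one'].tolist())
```

The only difference between the two calls is the trailing ", :", which removes the ambiguity about which axis 'II' belongs to.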

         one
A I   x    0
  II  x    2
  III x    4

sl_df.loc[pd.IndexSlice[:,'II']]

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-6-4bfd2d65fd21> in <module>()
----> 1 sl_df.loc[pd.IndexSlice[:,'II']]

...\pandas\core\indexing.pyc in __getitem__(self, key)
   1470             except (KeyError, IndexError):
   1471                 pass
-> 1472             return self._getitem_tuple(key)
   1473         else:
   1474             # we by definition only have the 0th axis

...\pandas\core\indexing.pyc in _getitem_tuple(self, tup)
    868     def _getitem_tuple(self, tup):
    869         try:
--> 870             return self._getitem_lowerdim(tup)
    871         except IndexingError:
    872             pass

...\pandas\core\indexing.pyc in _getitem_lowerdim(self, tup)
    977         # we may have a nested tuples indexer here
    978         if self._is_nested_tuple_indexer(tup):
--> 979             return self._getitem_nested_tuple(tup)
    980
    981         # we maybe be using a tuple to represent multiple dimensions here

...\pandas\core\indexing.pyc in _getitem_nested_tuple(self, tup)
   1056
   1057             current_ndim = obj.ndim
-> 1058             obj = getattr(obj, self.name)._getitem_axis(key, axis=axis)
   1059             axis += 1
   1060

...\pandas\core\indexing.pyc in _getitem_axis(self, key, axis)
   1909
   1910         # fall thru to straight lookup
-> 1911         self._validate_key(key, axis)
   1912         return self._get_label(key, axis=axis)
   1913

...\pandas\core\indexing.pyc in _validate_key(self, key, axis)
   1796                 raise
   1797             except:
-> 1798                 error()
   1799
   1800     def _is_scalar_access(self, key):

...\pandas\core\indexing.pyc in error()
   1783                 raise KeyError(u"the label [{key}] is not in the [{axis}]"
   1784                                .format(key=key,
-> 1785                                        axis=self.obj._get_axis_name(axis)))
   1786
   1787             try:

KeyError: u'the label [II] is not in the [columns]'

sl_df.loc[pd.IndexSlice[:,'II'],:]

        one
A II x    2
     y    3
B II x    8
     y    9
C II x   14
     y   15
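For comparison, here is a sketch of two other ways to make the row axis explicit. Neither comes from the original post: DataFrame.xs with level= and drop_level=False, and calling .loc with axis=0, are offered only as possible alternatives for the same selection.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [['A', 'B', 'C'], ['I', 'II', 'III'], ['x', 'y']])
sl_df = pd.DataFrame({'one': list(range(18))}, index=index)

# Form used in the post: slicer for the rows, ':' for the columns.
explicit = sl_df.loc[pd.IndexSlice[:, 'II'], :]

# xs selects on one index level; drop_level=False keeps all three levels.
via_xs = sl_df.xs('II', level=1, drop_level=False)

# .loc(axis=0) declares that the whole key applies to the row axis.
via_axis = sl_df.loc(axis=0)[:, 'II']

for result in (explicit, via_xs, via_axis):
    print(list(result.index), result['one'].tolist())
```

All three calls name the row axis in some way, so 'II' is never tried as a column label.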