Python: extending a MultiIndex with N new levels for each index?


I often run into a situation where I have a pandas MultiIndex with levels like this:

import pandas as pd

ix = pd.MultiIndex.from_tuples(((1, 2),
                                (1, 3),
                                (2, 2),
                                (2, 5)), names=['hi', 'there'])
a = pd.DataFrame([0]*4, index=ix, columns=['foo'])


The resulting frame has this structure:

print(a)
          foo
hi there   
1  2      0
   3      0
2  2      0
   5      0

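However, I would like to extend this index with, say, 3 new entries for each existing index entry (the example below uses two). That is, I want to add another index level so that the final result looks like this: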

                  foo
hi there newix     
1  2     1        0
         2        0
   3     1        0
         2        0
2  2     1        0
         2        0
   5     1        0
         2        0

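I can't see an obvious way to do this with something like from_product. I suppose I could build the tuples by hand by iterating over the first two levels, but that seems cumbersome. Is there a more elegant way to do this than what I have in mind?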

Edit: ideally it would be something other than, say:

newixs = []
for ix in a.index:          # each ix is a (hi, there) tuple
    for i in range(5):      # one new entry per value of i
        nix = list(ix) + [i]
        newixs.append(nix)

This works (using from_tuples to create the pandas MultiIndex), but it feels a bit hacky to me :p
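For reference, the loop above can be condensed into a comprehension and passed straight to from_tuples. This is only a sketch of the manual approach the question describes. The names n_new, new_tuples and new_index are illustrative and not from the original post, and the new entries are numbered from 1 to match the desired output shown earlier.

```python
import pandas as pd

ix = pd.MultiIndex.from_tuples([(1, 2), (1, 3), (2, 2), (2, 5)],
                               names=['hi', 'there'])
a = pd.DataFrame([0] * 4, index=ix, columns=['foo'])

n_new = 2  # assumption: number of new entries per existing row

# Append each new value to every existing (hi, there) key.
new_tuples = [tuple(key) + (i,)
              for key in a.index
              for i in range(1, n_new + 1)]

# Reuse the existing level names and add one for the new level.
new_index = pd.MultiIndex.from_tuples(
    new_tuples, names=list(a.index.names) + ['newix'])
```

This only builds the index. The data still has to be attached to it, which is what the answers below deal with.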

I would first use concat to create a bigger DataFrame:

In [11]: res = pd.concat([a, a])

In [12]: res
Out[12]: 
          foo
hi there     
1  2        0
   3        0
2  2        0
   5        0
1  2        0
   3        0
2  2        0
   5        0

I think the easiest way to add the new index level is to add a new column and then call set_index:

In [13]: res['newix'] = np.repeat([1, 2], len(a))

In [14]: res
Out[14]: 
          foo  newix
hi there            
1  2        0      1
   3        0      1
2  2        0      1
   5        0      1
1  2        0      2
   3        0      2
2  2        0      2
   5        0      2

In [15]: res.set_index('newix', append=True)
Out[15]: 
                foo
hi there newix     
1  2     1        0
   3     1        0
2  2     1        0
   5     1        0
1  2     2        0
   3     2        0
2  2     2        0
   5     2        0

This is basically what you want (you can call sort_index() on the result if needed).
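The same idea can probably be written in one step by letting concat create the new level through its keys argument, which avoids the temporary column. The following is a sketch under that assumption, not part of the original answer. n_new and keys are illustrative names, and the level order is fixed afterwards with reorder_levels because concat puts the key level first.

```python
import pandas as pd

ix = pd.MultiIndex.from_tuples([(1, 2), (1, 3), (2, 2), (2, 5)],
                               names=['hi', 'there'])
a = pd.DataFrame([0] * 4, index=ix, columns=['foo'])

n_new = 2                           # assumption: copies per existing row
keys = list(range(1, n_new + 1))    # values of the new level: 1, 2

# Each copy of `a` is labelled with one key; the key becomes the
# outermost index level, so we name all three levels explicitly.
res = pd.concat([a] * n_new, keys=keys,
                names=['newix'] + list(a.index.names))

# Move the new level to the end and group rows by (hi, there).
res = res.reorder_levels(['hi', 'there', 'newix']).sort_index()
```

Changing n_new to 3 would give three new entries per (hi, there) pair, as asked in the question.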

You can simply use reindex on the target index ix3 (with implicit broadcasting):

ix3 = pd.MultiIndex.from_tuples(
    [(1, 2, 1), (1, 2, 2),
     (1, 3, 1), (1, 3, 2),
     (2, 2, 1), (2, 2, 2),
     (2, 5, 1), (2, 5, 2)],
    names=['hi', 'there', 'newix'])

a.reindex(ix3)    
                   foo
hi  there   newix   
1   2       1      0
            2      0
    3       1      0
            2      0
2   2       1      0
            2      0
    5       1      0
            2      0
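If the implicit broadcasting is not available or behaves differently in a given pandas version (I have not checked this across versions), the same result can be obtained explicitly: look up each target row by its first two levels only, then relabel the result with the full three-level index. This is a sketch with an illustrative variable name (expanded), not the original answerer's code.

```python
import pandas as pd

ix = pd.MultiIndex.from_tuples([(1, 2), (1, 3), (2, 2), (2, 5)],
                               names=['hi', 'there'])
a = pd.DataFrame([0] * 4, index=ix, columns=['foo'])

ix3 = pd.MultiIndex.from_tuples(
    [(1, 2, 1), (1, 2, 2),
     (1, 3, 1), (1, 3, 2),
     (2, 2, 1), (2, 2, 2),
     (2, 5, 1), (2, 5, 2)],
    names=['hi', 'there', 'newix'])

# Drop the new level so the target matches the two levels of `a`;
# repeated (hi, there) keys simply select the same source row again.
expanded = a.reindex(ix3.droplevel('newix'))

# Relabel the rows with the full three-level index.
expanded.index = ix3
```

This relies on the index of a being unique, which holds for the example data.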