Python: Adding a new index to a specific level of a MultiIndex DataFrame

Here is an example of what I'm trying to do:

import io
import numpy as np  # needed for np.concatenate further down
import pandas as pd
data = io.StringIO('''Fruit,Color,Count,Price
Apple,Red,3,$1.29
Apple,Green,9,$0.99
Pear,Red,25,$2.59
Pear,Green,26,$2.79
Lime,Green,99,$0.39
''')
df_unindexed = pd.read_csv(data)
df = df_unindexed.set_index(['Fruit', 'Color'])
Output:

Out[5]: 
             Count  Price
Fruit Color              
Apple Red        3  $1.29
      Green      9  $0.99
Pear  Red       25  $2.59
      Green     26  $2.79
Lime  Green     99  $0.39
Now let's say I want to number the keys in the 'Color' level within each fruit:

L = []
for i in pd.unique(df.index.get_level_values(0)):
    L.append(range(df.xs(i).shape[0]))

list(np.concatenate(L))
I then add the resulting list, [0, 1, 0, 1, 0], as a new column:

df['Bob'] = list(np.concatenate(L))
Resulting in:

             Count  Price  Bob
Fruit Color                   
Apple Red        3  $1.29    0
      Green      9  $0.99    1
Pear  Red       25  $2.59    0
      Green     26  $2.79    1
Lime  Green     99  $0.39    0
My question:

How can I make the Bob column an index at the same level as Color? This is what I want:

                 Count  Price
Fruit Color Bob                   
Apple Red    0    3     $1.29
      Green  1    9     $0.99
Pear  Red    0   25     $2.59
      Green  1   26     $2.79
Lime  Green  0   99     $0.39

IIUC, use the append parameter of set_index:

df.set_index('Bob',append=True,inplace=True)
>>> df
                 Count  Price
Fruit Color Bob              
Apple Red   0        3  $1.29
      Green 1        9  $0.99
Pear  Red   0       25  $2.59
      Green 1       26  $2.79
Lime  Green 0       99  $0.39
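
As a minimal, self-contained sketch of what append does (the csv_text, appended and replaced names are only for illustration, and the non-inplace form is used here so both results can be compared):

```python
import io
import pandas as pd

csv_text = """Fruit,Color,Count,Price
Apple,Red,3,$1.29
Apple,Green,9,$0.99
Pear,Red,25,$2.59
Pear,Green,26,$2.79
Lime,Green,99,$0.39
"""
df = pd.read_csv(io.StringIO(csv_text)).set_index(["Fruit", "Color"])
df["Bob"] = [0, 1, 0, 1, 0]

# append=True keeps the existing levels and adds "Bob" as the innermost level.
appended = df.set_index("Bob", append=True)

# Without append, "Bob" replaces Fruit/Color as the index.
replaced = df.set_index("Bob")

print(list(appended.index.names))
print(list(replaced.index.names))
```

In both cases Bob stops being a regular column, since set_index drops the column it consumes by default.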

Are you looking for cumcount? If so, you can drop the loop and vectorize the solution:

df = df.set_index(df.groupby(level=0).cumcount(), append=True)
print(df)
               Count  Price
Fruit Color                
Apple Red   0      3  $1.29
      Green 1      9  $0.99
Pear  Red   0     25  $2.59
      Green 1     26  $2.79
Lime  Green 0     99  $0.39
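
To check that cumcount really reproduces the loop from the question, here is a small sketch that computes both on the same data (csv_text, looped and counted are illustrative names, and np.arange stands in for range):

```python
import io
import numpy as np
import pandas as pd

csv_text = """Fruit,Color,Count,Price
Apple,Red,3,$1.29
Apple,Green,9,$0.99
Pear,Red,25,$2.59
Pear,Green,26,$2.79
Lime,Green,99,$0.39
"""
df = pd.read_csv(io.StringIO(csv_text)).set_index(["Fruit", "Color"])

# Loop-based numbering, as in the question: one 0..n-1 run per fruit.
fruits = df.index.get_level_values(0).unique()
parts = [np.arange(df.xs(fruit).shape[0]) for fruit in fruits]
looped = np.concatenate(parts).tolist()

# Vectorized numbering: position of each row within its level-0 group.
counted = df.groupby(level=0).cumcount()

print(looped)
print(counted.tolist())
```

One difference worth keeping in mind: the loop concatenates the runs group by group, so it only lines up with the rows when each fruit's rows are contiguous, whereas cumcount numbers every row where it sits.
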
Or, if you want to do it in one go:

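data.seek(0)  # rewind the buffer; the first read_csv call consumed it
df_unindexed = pd.read_csv(data)
# group the flat frame itself, so the counts line up with its rows
df = df_unindexed.set_index(['Fruit', 'Color', df_unindexed.groupby('Fruit').cumcount()])
print(df)
               Count  Price
Fruit Color                
Apple Red   0      3  $1.29
      Green 1      9  $0.99
Pear  Red   0     25  $2.59
      Green 1     26  $2.79
Lime  Green 0     99  $0.39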
To rename the index levels, use rename_axis:

df = df.rename_axis(['Fruit', 'Color', 'Bob'])
print(df)
                 Count  Price
Fruit Color Bob              
Apple Red   0        3  $1.29
      Green 1        9  $0.99
Pear  Red   0       25  $2.59
      Green 1       26  $2.79
Lime  Green 0       99  $0.39
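
Once the level has a name, it can be addressed by that name. The sketch below is one possible way to do this, not part of the answer above: it names the counter up front by calling rename("Bob") on the cumcount Series (set_index picks up a Series' name as the level name), then reads the level back. The names flat, bob, first_rows and back are illustrative.

```python
import io
import pandas as pd

csv_text = """Fruit,Color,Count,Price
Apple,Red,3,$1.29
Apple,Green,9,$0.99
Pear,Red,25,$2.59
Pear,Green,26,$2.79
Lime,Green,99,$0.39
"""
flat = pd.read_csv(io.StringIO(csv_text))

# Name the counter before it becomes an index level.
bob = flat.groupby("Fruit").cumcount().rename("Bob")
df = flat.set_index(["Fruit", "Color", bob])

# Read the level's values by name.
print(df.index.get_level_values("Bob").tolist())

# Select the rows where Bob == 0; the Bob level is dropped from the result.
first_rows = df.xs(0, level="Bob")
print(first_rows)

# Turn the level back into an ordinary column.
back = df.reset_index("Bob")
print(back)
```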

Damn! I was hoping this would be harder than that. Thx

But how do you access the cumcount index now? Or how do you give it a name like "Bob"?