Setting values in a MultiIndex Series (Python, pandas)


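I am trying to add one Series to another Series under a new MultiIndex value. I cannot find a way to do this in pandas without complicated hacks.

My original Series: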

one  1    0.522764
     3    0.362663
     7    0.963108
two  2    0.717855
     4    0.004645
     5    0.077471

The data I want to concatenate in, under the key three:

2    0.8
7    0.9
8    0.7

Desired output:

one    1    0.522764
       3    0.362663
       7    0.963108
two    2    0.717855
       4    0.004645
       5    0.077471
three  2    0.800000
       7    0.900000
       8    0.700000

I can't figure out an elegant way to do this in pandas. All I've been able to come up with is the following hack:

# imports
import numpy as np
import pandas as pd

# to replicate the Series (a plain local list; do not assign it onto the numpy module)
arrays = [['one', 'one', 'one', 'two', 'two', 'two'], [1, 3, 7, 2, 4, 5]]
my_series = pd.Series([np.random.random() for i in range(6)],
               index=pd.MultiIndex.from_tuples(list(zip(*arrays))))

# the new data I need to add:
new_data = pd.Series({1: .9, 2: .7, 3: .8})

Here is how I currently solve the problem:

# rename the index so that I can refer to it later
new_data.index.name = 'level_1'

# turn it into a temporary DataFrame so that I can add a new column
temp = pd.DataFrame(new_data)

# create a new column holding the desired label for the first index level
temp['level_0'] = 'three'

# reset index, set the new two-level index, turn into a Series again
temp = temp.reset_index().set_index(['level_0', 'level_1'])[0]

# append it to the larger Series
# (originally my_series.append(temp); Series.append was removed in pandas 2.0)
my_series = pd.concat([my_series, temp])

This produces the desired output.

Question: is there a simple, elegant way to do this in pandas?

You can try pd.concat:

# add a constant column '_' and use it as the outer index level
u = (new_data.to_frame()
             .assign(_='three')
             .set_index(['_', new_data.index])[0])
pd.concat([my_series, u])

one    1    0.618472
       3    0.026207
       7    0.766849
two    2    0.651633
       4    0.282038
       5    0.160714
three  1    0.900000
       2    0.700000
       3    0.800000
dtype: float64

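If you create an equivalent MultiIndex for the new data up front, you can join the two Series directly with pd.concat, with no need to convert to a DataFrame, as follows: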

new_series = pd.Series([0.8,0.9,0.7],
              index=pd.MultiIndex.from_tuples([('three',x) for x in range(1,4)])
            )
pd.concat([my_series,new_series]) #note OP changed name of orig series from df to my_series
#==============================================================================
# one    1    0.236158
#        3    0.699102
#        7    0.421937
# two    2    0.887081
#        4    0.520304
#        5    0.211461
# three  1    0.800000
#        2    0.900000
#        3    0.700000
# dtype: float64
#==============================================================================

type(pd.concat([my_series,new_series])) # pandas.core.series.Series

Option 1
pd.concat is a handy way to prepend an index or column level by using the keys argument. Combine that with a second pd.concat and the job is done.

pd.concat([my_series, pd.concat([new_data], keys=['Three'])])

one    1    0.943246
       3    0.412200
       7    0.379641
two    2    0.883960
       4    0.182983
       5    0.773227
Three  1    0.900000
       2    0.700000
       3    0.800000
dtype: float64
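
To see what the inner pd.concat contributes on its own, here is a minimal sketch. It assumes the same new_data as in the question; the name wrapped is introduced only for illustration.

```python
import pandas as pd

new_data = pd.Series({1: 0.9, 2: 0.7, 3: 0.8})

# Wrapping the Series in a one-element list and passing keys=['three']
# prepends 'three' as a new outermost index level.
wrapped = pd.concat([new_data], keys=['three'])

print(wrapped.index.nlevels)
print(wrapped.index.tolist())
```

The result already carries a two-level index, so the outer pd.concat only has to stack it under my_series.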

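Option 2
Alternatively, we can construct a new Series while inserting an additional array into the index argument, then use pd.concat again to combine. Note that I could have used pd.MultiIndex.from_arrays, but the syntax is simpler if you just pass the arrays directly to the index argument.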

pd.concat([
    my_series,
    pd.Series(new_data.values, [['Three'] * new_data.size, new_data.index])
])

one    1    0.943246
       3    0.412200
       7    0.379641
two    2    0.883960
       4    0.182983
       5    0.773227
Three  1    0.900000
       2    0.700000
       3    0.800000
dtype: float64
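
As a quick check that the shorthand really is equivalent, the sketch below compares the index built from a list of arrays with an explicit pd.MultiIndex.from_arrays call. The names outer, s and explicit are illustrative only.

```python
import pandas as pd

new_data = pd.Series({1: 0.9, 2: 0.7, 3: 0.8})

# One outer label per row of new_data
outer = ['three'] * new_data.size

# Shorthand: a list of arrays passed as the index becomes a MultiIndex
s = pd.Series(new_data.values, index=[outer, new_data.index])

# Explicit construction of the same index
explicit = pd.MultiIndex.from_arrays([outer, new_data.index])

print(isinstance(s.index, pd.MultiIndex))
print(s.index.equals(explicit))
```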

Option 3
Another way to rebuild the Series with a MultiIndex. This one uses pd.MultiIndex.from_product.

pd.concat([
    my_series,
    pd.Series(new_data.values, pd.MultiIndex.from_product([['Three'], new_data.index]))
])

one    1    0.943246
       3    0.412200
       7    0.379641
two    2    0.883960
       4    0.182983
       5    0.773227
Three  1    0.900000
       2    0.700000
       3    0.800000
dtype: float64
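
If this comes up repeatedly, the from_product approach could be wrapped in a small function. The sketch below is only one possible packaging: add_group is a hypothetical helper name, not a pandas function, and the fixed values in my_series stand in for the random ones from the question.

```python
import pandas as pd


def add_group(series, key, new_values):
    """Hypothetical helper: append new_values to series under the outer label key."""
    index = pd.MultiIndex.from_product([[key], new_values.index])
    return pd.concat([series, pd.Series(new_values.values, index=index)])


arrays = [['one', 'one', 'one', 'two', 'two', 'two'], [1, 3, 7, 2, 4, 5]]
my_series = pd.Series([0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
                      index=pd.MultiIndex.from_arrays(arrays))
new_data = pd.Series({1: 0.9, 2: 0.7, 3: 0.8})

combined = add_group(my_series, 'three', new_data)

# Selecting the outer label returns just the newly added block
print(combined.loc['three'])
```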

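Your notation is a bit confusing, because df is not a DataFrame… Related: ?

@C8H10N4O2 Good catch, fixed. Did our answers help?

@C8H10N4O2 df was misleading.

@C8H10N4O2 Good luck. I am looking for a better solution than this myself, but I can't seem to find one.

Looks good, although the OP may want something that doesn't require declaring a MultiIndex first. I'm not sure that's possible.

Oh! I knew the keys argument would be useful. That's clever.

I believe I should get my silver pandas badge within 24 hours. Thanks, everyone, for the support and for the pep talk a long time ago. I won't forget it.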