Python: slicing a MultiIndex DataFrame by date


Suppose I have the following MultiIndex DataFrame:

import numpy as np
import pandas as pd

# Two index levels: a label ('bar'/'foo') and a date
arrays = [np.array(['bar', 'bar', 'bar', 'bar', 'foo', 'foo', 'foo', 'foo']),
          pd.to_datetime(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04',
                          '2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04'])]
df = pd.DataFrame(np.zeros((8, 4)), index=arrays)

                 0    1    2    3
bar 2020-01-01  0.0  0.0  0.0  0.0
    2020-01-02  0.0  0.0  0.0  0.0
    2020-01-03  0.0  0.0  0.0  0.0
    2020-01-04  0.0  0.0  0.0  0.0
foo 2020-01-01  0.0  0.0  0.0  0.0
    2020-01-02  0.0  0.0  0.0  0.0
    2020-01-03  0.0  0.0  0.0  0.0
    2020-01-04  0.0  0.0  0.0  0.0

How can I select only the part of this DataFrame where the first index level is 'bar' and the date is later than 2020-01-02, so that I can add 1 to that part?

To be clearer, the expected output is:

                 0    1    2    3
bar 2020-01-01  0.0  0.0  0.0  0.0
    2020-01-02  0.0  0.0  0.0  0.0
    2020-01-03  1.0  1.0  1.0  1.0
    2020-01-04  1.0  1.0  1.0  1.0
foo 2020-01-01  0.0  0.0  0.0  0.0
    2020-01-02  0.0  0.0  0.0  0.0
    2020-01-03  0.0  0.0  0.0  0.0
    2020-01-04  0.0  0.0  0.0  0.0

I have sliced it by the first index level:

df.loc['bar']

However, I could not apply the condition on the dates.

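You can compare each index level separately (via Index.get_level_values) and then set the matching rows to 1 with DataFrame.loc, where : selects all columns: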

# Boolean masks, one per index level
m1 = df.index.get_level_values(0) == 'bar'
m2 = df.index.get_level_values(1) > '2020-01-02'

# Set every column of the rows matching both masks to 1
df.loc[m1 & m2, :] = 1
print(df)

                 0    1    2    3
bar 2020-01-01  0.0  0.0  0.0  0.0
    2020-01-02  0.0  0.0  0.0  0.0
    2020-01-03  1.0  1.0  1.0  1.0
    2020-01-04  1.0  1.0  1.0  1.0
foo 2020-01-01  0.0  0.0  0.0  0.0
    2020-01-02  0.0  0.0  0.0  0.0
    2020-01-03  0.0  0.0  0.0  0.0
    2020-01-04  0.0  0.0  0.0  0.0
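
The question asks to add 1, while the snippet above assigns 1; on an all-zero frame the two give the same result. As a minimal sketch of the difference, the same masks can be combined with += instead. The name df_add and the starting value 5.0 are illustrative assumptions, not part of the original question.

```python
import numpy as np
import pandas as pd

arrays = [
    np.array(['bar'] * 4 + ['foo'] * 4),
    pd.to_datetime(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04'] * 2),
]
# Hypothetical non-zero starting values, so that adding and assigning differ
df_add = pd.DataFrame(np.full((8, 4), 5.0), index=arrays)

# Same per-level boolean masks as in the answer above
m1 = df_add.index.get_level_values(0) == 'bar'
m2 = df_add.index.get_level_values(1) > pd.Timestamp('2020-01-02')

# Increment the selected rows instead of overwriting them
df_add.loc[m1 & m2, :] += 1
print(df_add)
```

Only the two 'bar' rows dated after 2020-01-02 are incremented; every other row keeps its starting value.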

Alternatively, name the index levels and select the rows with DataFrame.query:

# give the index levels names
df.index = df.index.set_names(["names", "dates"])

# get the index entries that match the condition
index = df.query('names == "bar" and dates > "2020-01-02"').index

# assign 1 to the matching rows
# pd.IndexSlice makes slicing a MultiIndex easier, though here it is arguably overkill
idx = pd.IndexSlice
df.loc[idx[index], :] = 1


                    0    1    2    3
names dates
bar   2020-01-01  0.0  0.0  0.0  0.0
      2020-01-02  0.0  0.0  0.0  0.0
      2020-01-03  1.0  1.0  1.0  1.0
      2020-01-04  1.0  1.0  1.0  1.0
foo   2020-01-01  0.0  0.0  0.0  0.0
      2020-01-02  0.0  0.0  0.0  0.0
      2020-01-03  0.0  0.0  0.0  0.0
      2020-01-04  0.0  0.0  0.0  0.0
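
Since pd.IndexSlice is intended for label-based slicing, a possible variant skips the query step and slices both levels directly. This is a sketch under two assumptions: the index is sorted (label slicing on a MultiIndex requires this), and the data is daily, so the strict condition "after 2020-01-02" can be written as an inclusive slice starting at 2020-01-03. The name df_slice is illustrative.

```python
import numpy as np
import pandas as pd

arrays = [
    np.array(['bar'] * 4 + ['foo'] * 4),
    pd.to_datetime(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04'] * 2),
]
df_slice = pd.DataFrame(np.zeros((8, 4)), index=arrays)

# Label slicing on a MultiIndex needs a sorted index; this frame already is,
# so sort_index() is only a safeguard here
df_slice = df_slice.sort_index()

idx = pd.IndexSlice
# First level 'bar', second level from 2020-01-03 onward (slice bounds are inclusive)
df_slice.loc[idx['bar', pd.Timestamp('2020-01-03'):], :] += 1
print(df_slice)
```

With irregular timestamps the inclusive lower bound would no longer be equivalent to a strict "greater than", and the boolean-mask approach is the safer choice.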