Python (pandas): different ways to combine two DataFrames

python pandas datetime dataframe

I would like to know whether there is a better way to combine two DataFrames than the one below.

import numpy as np
import pandas as pd

# create random data sets
N = 50
# freq='H' is hourly; pandas >= 2.2 prefers the lowercase alias 'h'
df = pd.DataFrame({'date': pd.date_range('2000-1-1', periods=N, freq='H'),
                   'value': np.random.random(N)})

index = pd.DatetimeIndex(df['date'])
peak_time = df.iloc[index.indexer_between_time('7:00','9:00')]
lunch_time = df.iloc[index.indexer_between_time('12:00','14:00')]

comb_data = pd.concat([peak_time, lunch_time], ignore_index=True)

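Is there a way to combine the two ranges with a logical operator?

I need this to create a new column in df called "isPeak", set to 1 when the time falls within 7:00~9:00 or 12:00~14:00, and to 0 otherwise.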

For me, np.union1d works.

Alternatives include a pure pandas solution that converts the indexer arrays to an Index:

idx = (pd.Index(index.indexer_between_time('7:00','9:00'))
         .union(pd.Index(index.indexer_between_time('12:00','14:00'))))

comb_data = df.iloc[idx]
print (comb_data)
                  date     value
7  2000-01-01 07:00:00  0.760627
8  2000-01-01 08:00:00  0.236474
9  2000-01-01 09:00:00  0.626146
12 2000-01-01 12:00:00  0.625335
13 2000-01-01 13:00:00  0.793105
14 2000-01-01 14:00:00  0.706873
31 2000-01-02 07:00:00  0.113688
32 2000-01-02 08:00:00  0.035565
33 2000-01-02 09:00:00  0.230603
36 2000-01-02 12:00:00  0.423155
37 2000-01-02 13:00:00  0.947584
38 2000-01-02 14:00:00  0.226181
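For the "isPeak" column asked about in the question, one possible sketch is to turn each set of positions into a boolean mask and combine the masks with the logical OR operator `|`. The helper name `in_window` is hypothetical and not part of pandas; the data setup mirrors the question, except that the frequency is passed as a `pd.Timedelta` to avoid depending on a particular frequency alias.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the question: 50 hourly timestamps.
N = 50
df = pd.DataFrame({
    'date': pd.date_range('2000-01-01', periods=N, freq=pd.Timedelta(hours=1)),
    'value': np.random.random(N),
})
index = pd.DatetimeIndex(df['date'])


def in_window(index, start, end):
    """Hypothetical helper: boolean mask, True where the time of day
    lies between start and end (both endpoints included by default)."""
    mask = np.zeros(len(index), dtype=bool)
    mask[index.indexer_between_time(start, end)] = True
    return mask


# Combine both time windows with a logical OR, then store as 0/1.
is_peak = in_window(index, '7:00', '9:00') | in_window(index, '12:00', '14:00')
df['isPeak'] = is_peak.astype(int)

# The combined rows can be selected with the same mask.
comb_data = df[is_peak]
```

Because the mask has one entry per row, it keeps the original row order and can be reused both for the new column and for filtering.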

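If we cannot use numpy, is what I did above okay?

If you are working with pandas, change np.union1d to pd.np.union1d if np is a problem, because pandas is built on numpy :) (pd.np was removed in pandas 2.0, so on current versions import numpy directly.) And to your question: yes, that is a correct solution.

@SeoiMin - added a pure pandas version.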

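Another option is np.r_, which concatenates the two position arrays in the order given, so the rows come out grouped by time window rather than sorted:

idx = np.r_[index.indexer_between_time('7:00','9:00'),
            index.indexer_between_time('12:00','14:00')]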

comb_data = df.iloc[idx]
print (comb_data)
                  date     value
7  2000-01-01 07:00:00  0.760627
8  2000-01-01 08:00:00  0.236474
9  2000-01-01 09:00:00  0.626146
31 2000-01-02 07:00:00  0.113688
32 2000-01-02 08:00:00  0.035565
33 2000-01-02 09:00:00  0.230603
12 2000-01-01 12:00:00  0.625335
13 2000-01-01 13:00:00  0.793105
14 2000-01-01 14:00:00  0.706873
36 2000-01-02 12:00:00  0.423155
37 2000-01-02 13:00:00  0.947584
38 2000-01-02 14:00:00  0.226181
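To make the difference in row order concrete, here is a small sketch comparing `np.union1d` with `np.r_` on the same positions. The variable names `peak`, `lunch`, `idx_sorted` and `idx_concat` are chosen for illustration only; the setup again assumes 50 hourly rows starting at 2000-01-01.

```python
import numpy as np
import pandas as pd

N = 50
df = pd.DataFrame({
    'date': pd.date_range('2000-01-01', periods=N, freq=pd.Timedelta(hours=1)),
    'value': np.random.random(N),
})
index = pd.DatetimeIndex(df['date'])

# Integer positions of the rows inside each time window.
peak = index.indexer_between_time('7:00', '9:00')
lunch = index.indexer_between_time('12:00', '14:00')

# np.union1d returns the sorted, de-duplicated union of both arrays.
idx_sorted = np.union1d(peak, lunch)

# np.r_ simply concatenates: all peak positions, then all lunch positions.
idx_concat = np.r_[peak, lunch]

comb_sorted = df.iloc[idx_sorted]   # rows in chronological order
comb_concat = df.iloc[idx_concat]   # rows grouped by window
```

Both selections contain the same rows; only the ordering differs, so `np.union1d` (or the `Index.union` variant) is the more natural choice when chronological order matters.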
