Python: how to filter a DataFrame on a time range

python python-3.x pandas datetime

I have a file containing the following table:

    Name        AvailableDate            totalRemaining
0   X3321       2018-03-14 13:00:00      200
1   X3321       2018-03-14 14:00:00      200
2   X3321       2018-03-14 15:00:00      200
3   X3321       2018-03-14 16:00:00      200
4   X3321       2018-03-14 17:00:00      193

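I want to return a DataFrame containing all records that fall within a specific time-of-day range, regardless of the actual date.
I tried to follow an example I found, but when I run the following: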

    ## setup
    import pandas as pd
    import numpy as np

    ### Step 2
    ### Check available slots
    file2 = r'C:\Users\user\Desktop\Files\data.xlsx'
    slots = pd.read_excel(file2, na_values='')

    ## filter the preferred ones
    slots['nextAvailableDate'] = pd.to_datetime(slots['nextAvailableDate'])
    slots['times'] = pd.to_datetime(slots['nextAvailableDate'])
    slots = slots[slots['times'].between('21:00:00', '02:00:00')]

This returns an empty DataFrame, and so does this variant:

    slots = slots[slots['times'].dt.strftime('%H:%M:%S').between('21:00:00', '02:00:00')]

Is there a way to do this correctly without creating a separate time column? How should I approach this problem?
My goal is:

    Name        AvailableDate            totalRemaining
0   X3321       2018-03-14 21:00:00      200
1   X3321       2018-03-14 22:00:00      200
2   X3321       2018-03-14 23:00:00      200
3   X3321       2018-03-14 00:00:00      200
4   X3321       2018-03-14 01:00:00      193

for every day that appears in the dataset.
I think you need between_time, which works on a DatetimeIndex created by set_index. To turn the index back into a column use reset_index, and to keep the same column order add reindex:

    print(slots)
    #      Name        AvailableDate  totalRemaining
    # 0   X3321  2018-03-14 21:00:00             200
    # 1   X3321  2018-03-14 20:00:00             200
    # 2   X3321  2018-03-14 22:00:00             200
    # 3   X3321  2018-03-14 23:00:00             200
    # 4   X3321  2018-03-14 00:00:00             200
    # 5   X3321  2018-03-14 01:00:00             193
    # 6   X3321  2018-03-14 13:00:00             200
    # 7   X3321  2018-03-14 14:00:00             200
    # 8   X3321  2018-03-14 15:00:00             200
    # 9   X3321  2018-03-14 16:00:00             200
    # 10  X3321  2018-03-14 17:00:00             193

    slots['AvailableDate'] = pd.to_datetime(slots['AvailableDate'])

    # between_time needs a DatetimeIndex; a start later than the end wraps past midnight
    df = (slots.set_index('AvailableDate')
               .between_time('21:00:00', '02:00:00')
               .reset_index()
               .reindex(columns=slots.columns))  # restore the original column order

    print(df)
    #     Name        AvailableDate  totalRemaining
    # 0  X3321  2018-03-14 21:00:00             200
    # 1  X3321  2018-03-14 22:00:00             200
    # 2  X3321  2018-03-14 23:00:00             200
    # 3  X3321  2018-03-14 00:00:00             200
    # 4  X3321  2018-03-14 01:00:00             193
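Since the question asks for a way that avoids extra columns, here is a minimal sketch of a variant that leaves both the index and the column order untouched. It uses DatetimeIndex.indexer_between_time, which returns the integer positions of the matching rows. The sample data and the name `night` are made up for illustration and are not the asker's file.

```python
import pandas as pd

# Hypothetical sample data spanning two days (not the asker's file)
slots = pd.DataFrame({
    'Name': ['X3321'] * 6,
    'AvailableDate': pd.to_datetime([
        '2018-03-14 20:00:00', '2018-03-14 21:00:00', '2018-03-14 23:00:00',
        '2018-03-15 01:00:00', '2018-03-15 02:00:00', '2018-03-15 13:00:00',
    ]),
    'totalRemaining': [200, 200, 200, 200, 193, 193],
})

# Build a DatetimeIndex from the column and ask for the positions of the rows
# whose time of day lies in 21:00-02:00 (a start later than the end wraps past midnight)
positions = pd.DatetimeIndex(slots['AvailableDate']).indexer_between_time('21:00', '02:00')

# Select by position; original index labels and column order are preserved
night = slots.iloc[positions]
print(night)
```

Because the rows are picked by position, no set_index / reset_index / reindex round trip is needed.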

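You can use pd.Series.between with datetime.time objects. between tests start <= value <= end, so for a window that crosses midnight, such as 21:00 to 02:00, combine two comparisons instead:

    from datetime import datetime

    start = datetime.strptime('21:00:00', '%H:%M:%S').time()
    end = datetime.strptime('02:00:00', '%H:%M:%S').time()

    # start is later than end, so keep times at/after start OR at/before end
    t = slots['times'].dt.time
    slots = slots[(t >= start) | (t <= end)]

Example usage with a window that stays within one day, where between works directly: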

    from datetime import datetime
    import pandas as pd

    slots = pd.DataFrame({'times': ['2018-03-08 05:00:00', '2018-03-08 07:00:00',
                                    '2018-03-08 01:00:00', '2018-03-08 20:00:00',
                                    '2018-03-08 22:00:00', '2018-03-08 23:00:00']})
    slots['times'] = pd.to_datetime(slots['times'])

    start = datetime.strptime('21:00:00', '%H:%M:%S').time()
    end = datetime.strptime('23:30:00', '%H:%M:%S').time()

    slots = slots[slots['times'].dt.time.between(start, end)]

    #                 times
    # 4 2018-03-08 22:00:00
    # 5 2018-03-08 23:00:00
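The two cases (a window within one day, and a window that crosses midnight) can be wrapped in one small helper. This is only a sketch: `filter_time_window` is a hypothetical name, not a pandas function, and it assumes the column already has a datetime dtype.

```python
from datetime import time
import pandas as pd

def filter_time_window(df, column, start, end):
    """Keep rows of df whose time of day in `column` lies in [start, end].

    Hypothetical helper: if start is later than end, the window is taken
    to cross midnight.
    """
    t = df[column].dt.time
    if start <= end:
        # Ordinary window: a single inclusive range test is enough
        mask = t.between(start, end)
    else:
        # Window crosses midnight: late enough OR early enough
        mask = (t >= start) | (t <= end)
    return df[mask]

# Made-up sample data for illustration
sample = pd.DataFrame({'times': pd.to_datetime([
    '2018-03-08 05:00:00', '2018-03-08 01:00:00', '2018-03-08 20:00:00',
    '2018-03-08 22:00:00', '2018-03-08 23:00:00',
])})

same_day = filter_time_window(sample, 'times', time(21, 0), time(23, 30))
overnight = filter_time_window(sample, 'times', time(21, 0), time(2, 0))
print(overnight)
```

This also shows why the attempts in the question came back empty: no value can be both at or after 21:00:00 and at or before 02:00:00, whether compared as times or as strings.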

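Great, thanks. I was getting NameError: name 'df' is not defined because I had not passed the right DataFrame to columns=df.columns. Everything works now. It is a pity I cannot accept both answers; your strptime approach will be very useful in my future projects. Thank you very much.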