Python: remove dates that don't belong to any date range
So, I have a two-column dataframe with datetime and value columns, and I want to remove all rows that don't fall within at least one date range.
For example, suppose my valid date ranges are expressed as tuples:
valid_date_ranges = [
(2017-01-01 00:00:00.00, 2017-01-03 15:00:00.00),
(2017-01-04 03:25:00.00, 2017-01-06 22:56:00.00),
...
]
I have a dataframe like this:
datetime value
2017-01-01 00:00:00.00 1234
2017-01-01 00:01:00.00 13241526
2017-01-01 10:02:00.00 356356
2017-01-01 10:03:00.00 17435
2017-01-01 10:04:00.00 5234515
2017-01-01 10:05:00.00 52452435
...
2017-01-03 14:59:00.00 156256
2017-01-03 15:00:00.00 665654
2017-01-03 15:01:00.00 890656 *
2017-01-03 15:02:00.00 698765 *
2017-01-03 15:03:00.00 6574 *
...
2017-01-04 03:23:00.00 6541632 *
2017-01-04 03:24:00.00 1234 *
2017-01-04 03:25:00.00 4657347
2017-01-04 03:26:00.00 765
2017-01-04 03:27:00.00 870089
...
I want to remove the rows marked with a star at the end, because they don't fall within any date range.
Here is one approach:
# sample df and ranges to exclude, per OP
datetime value
"2017-01-01 00:00:00.00" 1234
"2017-01-01 00:01:00.00" 13241526
"2017-01-01 10:02:00.00" 356356
"2017-01-01 10:03:00.00" 17435
"2017-01-01 10:04:00.00" 5234515
"2017-01-01 10:05:00.00" 52452435
"2017-01-03 14:59:00.00" 156256
"2017-01-03 15:00:00.00" 665654
"2017-01-03 15:01:00.00" 890656
"2017-01-03 15:02:00.00" 698765
"2017-01-03 15:03:00.00" 6574
"2017-01-04 03:23:00.00" 6541632
"2017-01-04 03:24:00.00" 1234
"2017-01-04 03:25:00.00" 4657347
"2017-01-04 03:26:00.00" 765
"2017-01-04 03:27:00.00" 870089
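import pandas as pd

# copy the sample table above to the clipboard, then read it with datetime as the index
df = pd.read_clipboard(parse_dates=True, index_col='datetime')
valid_date_ranges = [("2017-01-01 00:00:00.00", "2017-01-03 15:00:00.00"),
                     ("2017-01-04 03:25:00.00", "2017-01-06 22:56:00.00")]
# expand each (start, end) pair into a minute-frequency DatetimeIndex
dranges = [pd.date_range(start, end, freq='min') for start, end in valid_date_ranges]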
Now drop the rows that are not found in a date range, and collect the remaining rows in a new filtered dataframe:
filtered = pd.DataFrame()
for drange in dranges:
    filtered = pd.concat([filtered, df.drop(df.index[~df.index.isin(drange)])])
print(filtered)
value
datetime
2017-01-01 00:00:00 1234
2017-01-01 00:01:00 13241526
2017-01-01 10:02:00 356356
2017-01-01 10:03:00 17435
2017-01-01 10:04:00 5234515
2017-01-01 10:05:00 52452435
2017-01-03 14:59:00 156256
2017-01-03 15:00:00 665654
2017-01-04 03:25:00 4657347
2017-01-04 03:26:00 765
2017-01-04 03:27:00 870089
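Note that matching against `pd.date_range(..., freq='min')` only keeps timestamps that land exactly on a minute boundary. As a possible alternative, here is a minimal sketch that compares the index against each range's endpoints directly, so it should work for any timestamp resolution. The sample rows are a hypothetical subset of the question's data, and the ranges are treated as inclusive on both ends (an assumption based on the question's starred rows).

```python
import numpy as np
import pandas as pd

# Hypothetical subset of the question's rows, indexed by datetime.
df = pd.DataFrame(
    {"value": [1234, 665654, 890656, 1234, 4657347]},
    index=pd.to_datetime([
        "2017-01-01 00:00:00",
        "2017-01-03 15:00:00",
        "2017-01-03 15:01:00",
        "2017-01-04 03:24:00",
        "2017-01-04 03:25:00",
    ]),
)
df.index.name = "datetime"

valid_date_ranges = [
    ("2017-01-01 00:00:00", "2017-01-03 15:00:00"),
    ("2017-01-04 03:25:00", "2017-01-06 22:56:00"),
]

# Start with an all-False mask and OR in each inclusive range.
mask = np.zeros(len(df), dtype=bool)
for start, end in valid_date_ranges:
    mask |= (df.index >= pd.Timestamp(start)) & (df.index <= pd.Timestamp(end))

# Keep only the rows that fall inside at least one range.
filtered = df[mask]
print(filtered)
```

Because the mask is built once and applied once, rows are kept in their original order and a row covered by two overlapping ranges is not duplicated.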
Here is another approach.
Make the dataframe
Mark the rows with *
Output:
value star
datetime
2017-01-01 00:00:00.00 1234 *
2017-01-03 00:01:00.00 1324 *
2017-01-03 15:00:00.00 1526 *
2017-01-04 01:03:00.00 356356 NaN
2017-01-04 02:03:00.00 17435 *
2017-01-04 03:25:00.00 5234 *
2017-01-06 22:56:00.00 515 *
2017-01-06 23:56:00.00 52452435 NaN
Drop the rows with stars
Output:
value star
datetime
2017-01-04 01:03:00.00 356356 NaN
2017-01-06 23:56:00.00 52452435 NaN
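The code for this answer is missing from the text above. The following is only a sketch of a mark-then-drop approach, not the answerer's original code. It assumes the question's convention that a star marks rows outside every valid range, and it uses the question's two ranges; the star assignment in the output above appears to follow a different rule, so this does not attempt to reproduce it. The `star` column name comes from the output, while `in_range` and `kept` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Timestamps and values taken from the output shown above.
df = pd.DataFrame(
    {"value": [1234, 1324, 1526, 356356, 17435, 5234, 515, 52452435]},
    index=pd.to_datetime([
        "2017-01-01 00:00:00",
        "2017-01-03 00:01:00",
        "2017-01-03 15:00:00",
        "2017-01-04 01:03:00",
        "2017-01-04 02:03:00",
        "2017-01-04 03:25:00",
        "2017-01-06 22:56:00",
        "2017-01-06 23:56:00",
    ]),
)
df.index.name = "datetime"

# Valid ranges from the question, treated as inclusive on both ends (assumption).
valid_date_ranges = [
    ("2017-01-01 00:00:00", "2017-01-03 15:00:00"),
    ("2017-01-04 03:25:00", "2017-01-06 22:56:00"),
]

# True where a row falls inside at least one valid range.
in_range = np.zeros(len(df), dtype=bool)
for start, end in valid_date_ranges:
    in_range |= (df.index >= pd.Timestamp(start)) & (df.index <= pd.Timestamp(end))

# Mark rows outside every range with "*"; rows inside a range are left missing.
df["star"] = pd.Series("*", index=df.index).where(~in_range)

# Drop the starred rows, then discard the helper column.
kept = df[df["star"].isna()].drop(columns="star")
print(kept)
```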
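Based on the exclusion ranges you specified, it looks like 2017-01-04 03:24:00.00 1234 should also be a starred row.
@andrew_reece Edited, thanks. I want to keep the rows that fall within the date ranges, not remove them.
Sorry, that was a typo; note that the output is correct.
I figured as much.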