Python: split txt file data into 24-hour chunks based on timestamps
Tags: python, python-3.x, datetime
I have a txt file in the following format:
Event A 15MAR18 103000 15MAR18 103758
Event A 16MAR18 120518 16MAR18 121308
Event B 16MAR18 121203 16MAR18 124543
Event B 16MAR18 134443 16MAR18 141823
Event B 16MAR18 151733 16MAR18 155103
Event B 17MAR18 165013 17MAR18 172343
Event B 17MAR18 182253 17MAR18 185623
Event B 17MAR18 195533 17MAR18 202903
Event A 17MAR18 203738 17MAR18 204028
Event B 18MAR18 212813 18MAR18 220143
Event A 18MAR18 221058 18MAR18 222338
Event B 18MAR18 230103 18MAR18 233423
Event A 19MAR18 234728 19MAR18 000048
Event B 20MAR18 003343 20MAR18 010703
Event A 20MAR18 012508 20MAR18 013418
Event B 21MAR18 020623 21MAR18 023943
Event B 21MAR18 033903 21MAR18 041223
Event B 21MAR18 051143 21MAR18 054503
Event B 21MAR18 064433 21MAR18 071743
Event A 22MAR18 074058 22MAR18 075008
Event B 22MAR18 081713 22MAR18 085023
Event A 23MAR18 091438 23MAR18 092738
Event B 23MAR18 094953 23MAR18 102303
Event A 23MAR18 105148 23MAR18 110418
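I am trying to split the file into groups based on a 24-hour time delta, using the middle column.
For example, the first line, with 15MAR18 103000, would be a separate list of its own.
The second line would then start a different list, because the timedelta is greater than 24 hours. That list would run from 16MAR18 120518 to 16MAR18 151733, and so on.
My attempt is as follows: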
from datetime import datetime, timedelta

List_Segment_1 = []
with open('file.txt', 'r') as input_file:
    input_file = input_file.readlines()
# NOTE: the [15:29] slice has to match the position of the start timestamp in the real file
startTime = datetime.strptime(input_file[0][15:29], '%d%b%y %H%M%S')
endTime = startTime + timedelta(hours=24)
for line in input_file:
    dates = datetime.strptime(line[15:29], '%d%b%y %H%M%S')
    if startTime < dates < endTime:
        List_Segment_1.append(line)
The output I am after looks like this:
Event A 15MAR18 103000 15MAR18 103758 Segment1
Event A 16MAR18 120518 16MAR18 121308 Segment2
Event B 16MAR18 121203 16MAR18 124543 Segment2
Event B 16MAR18 134443 16MAR18 141823 Segment2
Event B 16MAR18 151733 16MAR18 155103 Segment2
Event B 17MAR18 165013 17MAR18 172343 Segment3
Event B 17MAR18 182253 17MAR18 185623 Segment3
Event B 17MAR18 195533 17MAR18 202903 Segment3
Event A 17MAR18 203738 17MAR18 204028 Segment3
Event B 18MAR18 212813 18MAR18 220143 Segment4
Event A 18MAR18 221058 18MAR18 222338 Segment4
Event B 18MAR18 230103 18MAR18 233423 Segment4
Event A 19MAR18 234728 19MAR18 000048 Segment5
Event B 20MAR18 003343 20MAR18 010703 Segment5
Event A 20MAR18 012508 20MAR18 013418 Segment5
Event B 21MAR18 020623 21MAR18 023943 Segment6
Event B 21MAR18 033903 21MAR18 041223 Segment6
Event B 21MAR18 051143 21MAR18 054503 Segment6
Event B 21MAR18 064433 21MAR18 071743 Segment6
Event A 22MAR18 074058 22MAR18 075008 Segment6
Event B 22MAR18 081713 22MAR18 085023 Segment7
Event A 23MAR18 091438 23MAR18 092738 Segment8
Event B 23MAR18 094953 23MAR18 102303 Segment8
Event A 23MAR18 105148 23MAR18 110418 Segment8
Assuming the days are written as 01-31 (not 1-31), I wrote a solution based on string slicing. You could also combine datetime with this logic.
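from pprint import pprint

with open('file.txt', 'r') as input_file:
    input_file = input_file.readlines()

previous_day = 15  # day of the first line of the file
segments = []
day_data = []
for line in input_file:
    # NOTE: the [14:16] slice has to match the position of the day digits in the real file
    current_day = int(line[14:16])
    if current_day > previous_day:
        # new day: store the finished list before starting a new one
        segments.append(day_data)
        day_data = []
        previous_day = current_day  # otherwise every later line counts as a "new day"
    day_data.append(str(line))
segments.append(day_data)  # keep the last day as well
pprint(segments)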
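As a rough sketch of the "combine datetime with this logic" idea: comparing parsed dates instead of integer day numbers keeps the per-day grouping working across a month boundary, where the day number drops from 31 back to 01. The helper names `start_date` and `group_by_day` and the sample lines are made up for illustration. The sketch assumes the date is the third whitespace-separated field, the file is sorted by date, and month abbreviations are parsed under an English locale. Note that this groups by calendar day, which is not always the same as a rolling 24-hour window.

```python
from datetime import datetime
from itertools import groupby


def start_date(line):
    """Parse the start date (third whitespace-separated field, e.g. '15MAR18')."""
    fields = line.split()
    return datetime.strptime(fields[2], '%d%b%y').date()


def group_by_day(lines):
    """Return one list of lines per calendar day.

    groupby only merges adjacent items, so the lines must already be
    sorted by date.
    """
    return [[line.rstrip('\n') for line in group]
            for _, group in groupby(lines, key=start_date)]


# Hypothetical lines that cross a month boundary
sample = [
    'Event B 30MAR18 230103 30MAR18 233423',
    'Event A 31MAR18 234728 01APR18 000048',
    'Event B 01APR18 003343 01APR18 010703',
    'Event A 01APR18 012508 01APR18 013418',
]
for day in group_by_day(sample):
    print(day)
```

With these four lines the sketch yields three groups (30 March, 31 March, 1 April), whereas a plain `current_day > previous_day` test would not start a new group when the day goes from 31 to 1.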
Rather old-fashioned code, but it works. The output is a dictionary.
import datetime

mydict = {}
l_num = 1
with open('file.txt', 'r') as input_file:
    input_file = input_file.readlines()
for i in range(len(input_file)):
    if i == 0:
        mydict['Segment ' + str(l_num)] = [input_file[i]]
    else:
        # NOTE: split(' ')[1] has to yield the '15MAR18 103000' field;
        # adjust the separator to the column delimiter of the real file
        prevDate = datetime.datetime.strptime(input_file[i-1].split(' ')[1], '%d%b%y %H%M%S')
        Date = datetime.datetime.strptime(input_file[i].split(' ')[1], '%d%b%y %H%M%S')
        if Date - prevDate > datetime.timedelta(hours=24):
            # more than 24 hours since the previous line: start a new segment
            l_num += 1
            mydict['Segment ' + str(l_num)] = []
        mydict['Segment ' + str(l_num)].append(input_file[i])
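One thing worth keeping in mind with this approach: it starts a new segment when the gap to the previous line exceeds 24 hours, which is not the same rule as a 24-hour window counted from the first line of each segment. The two rules happen to agree on the sample data in the question, but they can diverge. The following is only an illustrative sketch; the function names and the timestamps (spaced 20 hours apart) are invented for the comparison.

```python
from datetime import datetime, timedelta

DAY = timedelta(hours=24)


def labels_by_gap(times):
    """New segment when the gap to the previous timestamp exceeds 24 hours."""
    labels, segment = [], 1
    for i, t in enumerate(times):
        if i > 0 and t - times[i - 1] > DAY:
            segment += 1
        labels.append(segment)
    return labels


def labels_by_window(times):
    """New segment when a timestamp falls outside the 24-hour window
    opened by the first timestamp of the current segment."""
    labels, segment = [], 0
    window_end = None
    for t in times:
        if window_end is None or t >= window_end:
            segment += 1
            window_end = t + DAY
        labels.append(segment)
    return labels


# Hypothetical, sorted timestamps spaced 20 hours apart
times = [datetime(2018, 3, 15, 10, 30) + timedelta(hours=20 * k) for k in range(4)]
print(labels_by_gap(times))     # [1, 1, 1, 1]
print(labels_by_window(times))  # [1, 1, 2, 2]
```

Which rule is the right one depends on what "24-hour chunk" is supposed to mean for the data at hand.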
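Just noticed: I'm using Python 2, so I'm not sure whether it works properly in Python 3. I hope it does, though.
Here is a simple implementation for your problem; you should modify it as needed: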
from datetime import datetime, timedelta

with open('file.txt', 'r') as input_file:
    lines = input_file.readlines()

# NOTE: the [14:28] slice has to match the position of the start timestamp in the real file
base_time = datetime.strptime(lines[0][14:28], '%d%b%y %H%M%S')
end_time = base_time + timedelta(hours=24)
segment = 1
for line in lines:
    date = datetime.strptime(line[14:28], '%d%b%y %H%M%S')
    if base_time <= date < end_time:
        pass  # still inside the current 24-hour window
    else:
        # outside the window: this line opens the next segment
        segment += 1
        base_time = date
        end_time = date + timedelta(hours=24)
    print(line.strip() + '\tSegment {}'.format(segment))
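Since the question asks for separate lists, the same windowing idea can also be sketched as a function that returns a list of lists. This is only one possible variant, not the answerer's code: `parse_start` and `split_into_segments` are hypothetical names, the start timestamp is assumed to be the third and fourth whitespace-separated fields (so no fixed slice offsets are needed), the input is assumed to be sorted by date, and month abbreviations are assumed to parse under an English locale.

```python
from datetime import datetime, timedelta


def parse_start(line):
    """Return the start timestamp of a record, e.g. '15MAR18 103000'."""
    fields = line.split()
    return datetime.strptime(fields[2] + ' ' + fields[3], '%d%b%y %H%M%S')


def split_into_segments(lines, window=timedelta(hours=24)):
    """Group date-sorted lines into lists.

    A new list starts whenever a line falls outside the window opened
    by the first line of the current list.
    """
    segments = []
    window_end = None
    for line in lines:
        start = parse_start(line)
        if window_end is None or start >= window_end:
            segments.append([])
            window_end = start + window
        segments[-1].append(line.rstrip('\n'))
    return segments


# A few lines taken from the question's file
sample = [
    'Event A 15MAR18 103000 15MAR18 103758',
    'Event A 16MAR18 120518 16MAR18 121308',
    'Event B 16MAR18 121203 16MAR18 124543',
    'Event B 17MAR18 165013 17MAR18 172343',
]
for number, segment in enumerate(split_into_segments(sample), start=1):
    for line in segment:
        print('{}\tSegment {}'.format(line, number))
```

Returning the lists and numbering them afterwards with `enumerate` keeps the grouping separate from the output format, so the segments can be printed, written to separate files, or processed further.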
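Comments:
You are missing a , in the startTime… line. Also, are you using Python 2 or 3?
Edited, thanks.
Is the file sorted by date?
Yes. The file is sorted by date.
Can you add an example of what the output should look like?
Thanks a lot. I adjusted it as needed and it works fine. Thank you very much!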
Output produced by the 24-hour window implementation:
Event A 15MAR18 103000 15MAR18 103758 Segment 1
Event A 16MAR18 120518 16MAR18 121308 Segment 2
Event B 16MAR18 121203 16MAR18 124543 Segment 2
Event B 16MAR18 134443 16MAR18 141823 Segment 2
Event B 16MAR18 151733 16MAR18 155103 Segment 2
Event B 17MAR18 165013 17MAR18 172343 Segment 3
Event B 17MAR18 182253 17MAR18 185623 Segment 3
Event B 17MAR18 195533 17MAR18 202903 Segment 3
Event A 17MAR18 203738 17MAR18 204028 Segment 3
Event B 18MAR18 212813 18MAR18 220143 Segment 4
Event A 18MAR18 221058 18MAR18 222338 Segment 4
Event B 18MAR18 230103 18MAR18 233423 Segment 4
Event A 19MAR18 234728 19MAR18 000048 Segment 5
Event B 20MAR18 003343 20MAR18 010703 Segment 5
Event A 20MAR18 012508 20MAR18 013418 Segment 5
Event B 21MAR18 020623 21MAR18 023943 Segment 6
Event B 21MAR18 033903 21MAR18 041223 Segment 6
Event B 21MAR18 051143 21MAR18 054503 Segment 6
Event B 21MAR18 064433 21MAR18 071743 Segment 6
Event A 22MAR18 074058 22MAR18 075008 Segment 7
Event B 22MAR18 081713 22MAR18 085023 Segment 7
Event A 23MAR18 091438 23MAR18 092738 Segment 8
Event B 23MAR18 094953 23MAR18 102303 Segment 8
Event A 23MAR18 105148 23MAR18 110418 Segment 8