How can I detect whether dates are consecutive in Python?
I have an Access table with a "date" field. Each record has a random date. I built a script that appends all the records to a list and then converts the list to a set to keep only the unique values:
dateList = []
# cursor search through each record and append all records in the date
# field to a python list
for row in rows:
    dateList.append(row.getValue("DATE_OBSERVATION").strftime('%m-%d-%Y'))
# Filter unique values to a set
newList = list(set(dateList))
On my test table this returns:
['07-06-2010','06-24-2010','07-05-2010','06-25-2010']
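Now that I have the unique values of the DATE_OBSERVATION field, I want to detect whether:
the dates are a single date, i.e. only one unique date is returned because every record has that date
the dates are a date range, i.e. all the dates fall within one consecutive range
the dates are multiple dates that do not form a consecutive range
Any suggestions would be much appreciated!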
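Mike
Rather than rolling your own consecutive function, you can simply convert the date objects to integers with the datetime object's .toordinal() method. For a set of consecutive dates, the difference between the maximum and minimum ordinal is one less than the length of the set: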
from datetime import datetime
date_strs = ['07-06-2010', '06-24-2010', '07-05-2010', '06-25-2010']
# date_strs = ['02-29-2012', '02-28-2012', '03-01-2012']
# date_strs = ['01-01-2000']
dates = [datetime.strptime(d, "%m-%d-%Y") for d in date_strs]
date_ints = set([d.toordinal() for d in dates])
if len(date_ints) == 1:
    print("unique")
elif max(date_ints) - min(date_ints) == len(date_ints) - 1:
    print("consecutive")
else:
    print("not consecutive")
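The max/min test works because n distinct integers that span exactly n - 1 leave no room for a gap. A minimal sketch of the same idea wrapped in a function, so the three sample inputs above can be checked side by side; the name `classify_dates` and its `fmt` parameter are hypothetical, not part of the original answer:

```python
from datetime import datetime


def classify_dates(date_strs, fmt="%m-%d-%Y"):
    """Classify date strings as 'unique', 'consecutive' or 'not consecutive'."""
    # A set drops duplicate days, so repeated dates do not affect the result.
    ordinals = {datetime.strptime(s, fmt).toordinal() for s in date_strs}
    if not ordinals:
        raise ValueError("no dates given")
    if len(ordinals) == 1:
        return "unique"
    # n distinct integers spanning exactly n - 1 must fill every slot.
    if max(ordinals) - min(ordinals) == len(ordinals) - 1:
        return "consecutive"
    return "not consecutive"


print(classify_dates(['07-06-2010', '06-24-2010', '07-05-2010', '06-25-2010']))
print(classify_dates(['02-29-2012', '02-28-2012', '03-01-2012']))
print(classify_dates(['01-01-2000']))
```

The second input crosses a leap day, which the ordinal conversion handles without any special casing.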
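Use the database to select the unique dates in ascending order. If the query returns a single date, that is your first case. Otherwise, determine whether the dates are consecutive: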
import datetime
def consecutive(a, b, step=datetime.timedelta(days=1)):
    return (a + step) == b
Code layout:
dates = <query database>
if all(consecutive(dates[i], dates[i+1]) for i in range(len(dates) - 1)):
    if len(dates) == 1: # unique
        # 1st case: all records have the same date
        pass
    else:
        # the dates are a range of dates
        pass
else:
    # non-consecutive dates
    pass
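To make the layout above concrete, here is a rough end-to-end sketch. It assumes an in-memory SQLite table as a stand-in for the Access table; the table name `observations`, the ISO-formatted text column and the `classify` helper are all hypothetical. The query does the de-duplication and sorting, so the Python side only compares neighbours:

```python
import sqlite3
from datetime import date, timedelta


def consecutive(a, b, step=timedelta(days=1)):
    return a + step == b


def classify(dates):
    """dates: distinct date objects in ascending order."""
    if not dates:
        raise ValueError("no dates given")
    if len(dates) == 1:
        return "single"
    if all(consecutive(a, b) for a, b in zip(dates, dates[1:])):
        return "range"
    return "non-consecutive"


# Hypothetical stand-in table; ISO date strings sort chronologically as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (date_observation TEXT)")
conn.executemany(
    "INSERT INTO observations VALUES (?)",
    [("2010-07-06",), ("2010-07-05",), ("2010-07-05",), ("2010-07-04",)],
)
rows = conn.execute(
    "SELECT DISTINCT date_observation FROM observations "
    "ORDER BY date_observation"
).fetchall()
conn.close()

dates = [date.fromisoformat(r[0]) for r in rows]
result = classify(dates)
print(result)
```

`date.fromisoformat` needs Python 3.7 or later; with a driver that returns real date objects, that conversion step would not be needed.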
Here is my version, using the reduce function
from datetime import date, timedelta
from functools import reduce  # reduce is not a builtin in Python 3
def checked(d1, d2):
    """
    We assume the date list is sorted.
    If d2 & d1 are different by 1, everything up to d2 is consecutive, so d2
    can advance to the next reduction.
    If d2 & d1 are not different by 1, returning d1 - 1 for the next reduction
    will guarantee the result produced by reduce() to be something other than
    the last date in the sorted date list.
    Definition 1: 1/1/14, 1/2/14, 1/2/14, 1/3/14 is considered consecutive
    Definition 2: 1/1/14, 1/2/14, 1/2/14, 1/3/14 is considered not consecutive
    """
    #if (d2 - d1).days == 1 or (d2 - d1).days == 0: # for Definition 1
    if (d2 - d1).days == 1: # for Definition 2
        return d2
    else:
        return d1 + timedelta(days=-1)
# datelist = [date(2014, 1, 1), date(2014, 1, 3),
# date(2013, 12, 31), date(2013, 12, 30)]
# datelist = [date(2014, 2, 19), date(2014, 2, 19), date(2014, 2, 20),
# date(2014, 2, 21), date(2014, 2, 22)]
datelist = [date(2014, 2, 19), date(2014, 2, 21),
            date(2014, 2, 22), date(2014, 2, 20)]
datelist.sort()
if datelist[-1] == reduce(checked, datelist):
    print("dates are consecutive")
else:
    print("dates are not consecutive")
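One caveat, as far as I can tell: under Definition 2 the `d1 - 1` sentinel can be undone by a run of three identical dates. For example, 1/1, 1/1, 1/1, 1/2 reduces back to 1/2 and is reported as consecutive. A minimal alternative sketch (not the approach above) compares the day gap between each pair of sorted neighbours against a set of allowed gaps; the name `is_consecutive` and the `allow_duplicates` flag are hypothetical:

```python
from datetime import date


def is_consecutive(datelist, allow_duplicates=False):
    """True if the sorted dates advance one day at a time.

    allow_duplicates=True  -> Definition 1 (repeated dates are tolerated)
    allow_duplicates=False -> Definition 2 (any repeated date is a break)
    """
    ordered = sorted(datelist)
    allowed_gaps = {0, 1} if allow_duplicates else {1}
    return all((b - a).days in allowed_gaps
               for a, b in zip(ordered, ordered[1:]))


triple = [date(2014, 1, 1), date(2014, 1, 1), date(2014, 1, 1), date(2014, 1, 2)]
print(is_consecutive(triple))                         # Definition 2
print(is_consecutive(triple, allow_duplicates=True))  # Definition 1
```

Because every pair is judged on its own, one bad gap cannot be cancelled out by a later pair.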
Another version, using the same logic as my other answer
from datetime import date, timedelta
# Definition 1: 1/1/14, 1/2/14, 1/2/14, 1/3/14 is considered consecutive
# Definition 2: 1/1/14, 1/2/14, 1/2/14, 1/3/14 is considered not consecutive
# datelist = [date(2014, 1, 1), date(2014, 1, 3),
# date(2013, 12, 31), date(2013, 12, 30)]
# datelist = [date(2014, 2, 19), date(2014, 2, 19), date(2014, 2, 20),
# date(2014, 2, 21), date(2014, 2, 22)]
datelist = [date(2014, 2, 19), date(2014, 2, 21),
            date(2014, 2, 22), date(2014, 2, 20)]
datelist.sort()
previousdate = datelist[0]
for i in range(1, len(datelist)):
    #if (datelist[i] - previousdate).days == 1 or (datelist[i] - previousdate).days == 0: # for Definition 1
    if (datelist[i] - previousdate).days == 1: # for Definition 2
        previousdate = datelist[i]
    else:
        previousdate = previousdate + timedelta(days=-1)
if datelist[-1] == previousdate:
    print("dates are consecutive")
else:
    print("dates are not consecutive")
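Short lazy reply: convert them to datetime objects, sort them, then use the pairwise recipe from the itertools documentation page to compare each date with the next one in the list and see whether they form a range; for a single date, take the first date and check whether all the others fall within the same calendar day; if both of those fail, they are different dates.
If you don't select other values, use SELECT DISTINCT date_observation FROM mytable ORDER BY date_observation DESC, and don't convert the dates to strings.
@deathApril: why DESC?
@J.F.Sebastian Um, no reason. I saw '07-06-2010', '06-24-2010' in the question and skipped the rest of the example, I guess.
Thanks Michael. This worked great with my script! I appreciate your answer.
@Michael Dillon: SELECT DISTINCT .. as in @deathApril's comment is better.
If we consider '01-01-2000', '01-01-2000', '01-02-2000' to be non-consecutive, this code will still say the dates are consecutive. How would you modify your code to account for this requirement? I don't know how sets work yet.
@lessthanl0l: set removes duplicates. If you want to treat a list with duplicates as non-consecutive, you can test whether the length of your date list is greater than the length of the date set.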
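The two suggestions from the comments can be combined in one rough sketch: the `pairwise` recipe from the itertools documentation for the neighbour comparison, and the list-versus-set length test for duplicates. The function name `describe` and the decision to treat an all-identical list as a single date (rather than as duplicates) are assumptions, not something the thread settles:

```python
from datetime import datetime, timedelta
from itertools import tee


def pairwise(iterable):
    """s -> (s0, s1), (s1, s2), ... (recipe from the itertools docs)."""
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)


def describe(date_strs, fmt="%m-%d-%Y"):
    days = sorted(datetime.strptime(s, fmt).date() for s in date_strs)
    unique_days = set(days)
    if len(unique_days) == 1:
        return "single"              # every record has the same date
    if len(days) > len(unique_days):
        return "not consecutive"     # duplicates count as a break
    if all(b - a == timedelta(days=1) for a, b in pairwise(days)):
        return "consecutive"
    return "not consecutive"


print(describe(['01-01-2000', '01-01-2000', '01-02-2000']))
print(describe(['07-06-2010', '07-05-2010', '07-04-2010']))
```

On Python 3.10 and later, `itertools.pairwise` is available directly and the recipe function can be dropped.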