Warning: file_get_contents(/data/phpspider/zhask/data//catemap/2/python/279.json): failed to open stream: No such file or directory in /data/phpspider/zhask/libs/function.php on line 167

Warning: Invalid argument supplied for foreach() in /data/phpspider/zhask/libs/tag.function.php on line 1116

Notice: Undefined index: in /data/phpspider/zhask/libs/function.php on line 180

Warning: array_chunk() expects parameter 1 to be array, null given in /data/phpspider/zhask/libs/function.php on line 181
Python列表按日期分组_Python_List_Date_Timestamp_Itertools - Fatal编程技术网

Python列表按日期分组

Python列表按日期分组,python,list,date,timestamp,itertools,Python,List,Date,Timestamp,Itertools,假设我有一个如下所示的列表: [(datetime.datetime(2013, 8, 8, 1, 20, 15), 2060), (datetime.datetime(2013, 8, 9, 1, 6, 14), 2055), (datetime.datetime(2013, 8, 9, 1, 21, 1), 2050), (datetime.datetime(2013, 8, 10, 1, 5, 49), 2050), (datetime.datetime(2013, 8, 10, 1, 1

假设我有一个如下所示的列表:

[(datetime.datetime(2013, 8, 8, 1, 20, 15), 2060), (datetime.datetime(2013, 8, 9, 1, 6, 14), 2055), (datetime.datetime(2013, 8, 9, 1, 21, 1), 2050), (datetime.datetime(2013, 8, 10, 1, 5, 49), 2050), (datetime.datetime(2013, 8, 10, 1, 19, 51), 2050), (datetime.datetime(2013, 8, 11, 2, 4, 53), 2050), (datetime.datetime(2013, 8, 12, 0, 29, 45), 2050), (datetime.datetime(2013, 8, 12, 0, 44, 13), 2050), (datetime.datetime(2013, 8, 13, 0, 34, 13), 2050), (datetime.datetime(2013, 8, 13, 0, 47, 29), 2050), (datetime.datetime(2013, 8, 14, 1, 30, 39), 2050), (datetime.datetime(2013, 8, 14, 1, 33, 51), 2050), (datetime.datetime(2013, 8, 15, 0, 41, 1), 2050), (datetime.datetime(2013, 8, 15, 0, 54, 45), 2050), (datetime.datetime(2013, 8, 16, 0, 29, 57), 1950), (datetime.datetime(2013, 8, 16, 0, 43, 11), 1950), (datetime.datetime(2013, 8, 17, 0, 27, 4), 1950), (datetime.datetime(2013, 8, 17, 0, 42, 30), 1950), (datetime.datetime(2013, 8, 18, 0, 26, 26), 1950), (datetime.datetime(2013, 8, 18, 0, 43, 11), 1950), (datetime.datetime(2013, 8, 19, 0, 41, 49), 1950), (datetime.datetime(2013, 8, 20, 1, 10, 23), 1950), (datetime.datetime(2013, 8, 20, 1, 23, 44), 1950), (datetime.datetime(2013, 8, 21, 0, 47, 25), 1950), (datetime.datetime(2013, 8, 21, 1, 0, 12), 1950), (datetime.datetime(2013, 8, 22, 0, 45, 21), 1950), (datetime.datetime(2013, 8, 22, 1, 4, 33), 1950), (datetime.datetime(2013, 8, 23, 0, 51, 27), 1950), (datetime.datetime(2013, 8, 23, 1, 6, 36), 1950), (datetime.datetime(2013, 8, 24, 0, 41, 3), 1950), (datetime.datetime(2013, 8, 24, 0, 53, 14), 1950), (datetime.datetime(2013, 8, 25, 0, 29, 24), 1950), (datetime.datetime(2013, 8, 25, 0, 42, 40), 1950), (datetime.datetime(2013, 8, 26, 0, 28, 13), 1950), (datetime.datetime(2013, 8, 26, 0, 43, 30), 1950), (datetime.datetime(2013, 8, 27, 0, 30, 1), 1950), (datetime.datetime(2013, 8, 27, 0, 43, 43), 1950), (datetime.datetime(2013, 8, 28, 0, 33, 19), 1950), (datetime.datetime(2013, 8, 28, 0, 49, 11), 1950), (datetime.datetime(2013, 8, 29, 0, 26, 49), 1950), (datetime.datetime(2013, 8, 29, 0, 41, 21), 1950), (datetime.datetime(2013, 8, 30, 0, 26, 13), 1950), (datetime.datetime(2013, 8, 30, 0, 42, 9), 1950), (datetime.datetime(2013, 8, 31, 0, 23, 40), 1950), (datetime.datetime(2013, 8, 31, 0, 39, 49), 1950), (datetime.datetime(2013, 9, 1, 0, 22, 2), 1950), (datetime.datetime(2013, 9, 1, 0, 38, 16), 1950), (datetime.datetime(2013, 9, 2, 0, 21, 2), 1950), (datetime.datetime(2013, 9, 2, 0, 36, 19), 1950), (datetime.datetime(2013, 9, 3, 0, 22, 16), 1950), (datetime.datetime(2013, 9, 3, 0, 39, 2), 1900)]
很明显,您可以看到这是一个元组列表,每个元组中的第一个元素是时间戳。格式良好,由以下人员生成:

datetime.strptime(record[0], timeFormat)
第二个元素是监控值。但是,每天可能有多个记录。例如,datetime.datetime(2013,8,9..)上有两条记录,它们有两个不同的值2055和2050。我想要的实际上是每天的最大值。 所以在这种情况下。2055年将是(2013年8月9日)的唯一记录

我想知道Python中是否有一种简便的方法可以做到这一点。类似于mysql的东西:

select 
    date(timestamp), 
    max(value)
from table 
group by date(timestamp)
mysql语句只是为了展示这个想法,我绝对想要一个python解决方案。

使用:


注意:假设记录已排序。如果没有,则应首先按日期对其进行排序。

您可以使用
集合。defaultdict
(这将适用于
O(N)
时间中已排序和未排序的数据):

正如@hughdbrown所提到的,更好的方法是:

>>> dic = {}
>>> for dt, val in lis:
...     dt = dt.date()
...     dic[dt] = max(dic.get(dt,0), val)
...     
>>> for k, v in dic.iteritems():
...     print k,v
...     
2013-08-20 1950
2013-08-15 2050
2013-08-22 1950
2013-08-09 2055
2013-08-16 1950
2013-08-11 2050
2013-08-18 1950
2013-09-03 1950
2013-09-01 1950
...

为什么不
d={};对于dt,lis中的val:k=dt.date();d[k]=max(d.get(k,0),val)
不需要
defaultdict
。不需要列出值。直接计算
max
>>> from collections import defaultdict
>>> lis = [(datetime.datetime(2013, 8, 8, 1, 20, 15), 2060), (datetime.datetime(2013, 8, 9, 1, 6, 14), 2055), (datetime.datetime(2013, 8, 9, 1, 21, 1), 2050), (datetime.datetime(2013, 8, 10, 1, 5, 49), 2050), (datetime.datetime(2013, 8, 10, 1, 19, 51), 2050), (datetime.datetime(2013, 8, 11, 2, 4, 53), 2050), (datetime.datetime(2013, 8, 12, 0, 29, 45), 2050), (datetime.datetime(2013, 8, 12, 0, 44, 13), 2050), (datetime.datetime(2013, 8, 13, 0, 34, 13), 2050), (datetime.datetime(2013, 8, 13, 0, 47, 29), 2050), (datetime.datetime(2013, 8, 14, 1, 30, 39), 2050), (datetime.datetime(2013, 8, 14, 1, 33, 51), 2050), (datetime.datetime(2013, 8, 15, 0, 41, 1), 2050), (datetime.datetime(2013, 8, 15, 0, 54, 45), 2050), (datetime.datetime(2013, 8, 16, 0, 29, 57), 1950), (datetime.datetime(2013, 8, 16, 0, 43, 11), 1950), (datetime.datetime(2013, 8, 17, 0, 27, 4), 1950), (datetime.datetime(2013, 8, 17, 0, 42, 30), 1950), (datetime.datetime(2013, 8, 18, 0, 26, 26), 1950), (datetime.datetime(2013, 8, 18, 0, 43, 11), 1950), (datetime.datetime(2013, 8, 19, 0, 41, 49), 1950), (datetime.datetime(2013, 8, 20, 1, 10, 23), 1950), (datetime.datetime(2013, 8, 20, 1, 23, 44), 1950), (datetime.datetime(2013, 8, 21, 0, 47, 25), 1950), (datetime.datetime(2013, 8, 21, 1, 0, 12), 1950), (datetime.datetime(2013, 8, 22, 0, 45, 21), 1950), (datetime.datetime(2013, 8, 22, 1, 4, 33), 1950), (datetime.datetime(2013, 8, 23, 0, 51, 27), 1950), (datetime.datetime(2013, 8, 23, 1, 6, 36), 1950), (datetime.datetime(2013, 8, 24, 0, 41, 3), 1950), (datetime.datetime(2013, 8, 24, 0, 53, 14), 1950), (datetime.datetime(2013, 8, 25, 0, 29, 24), 1950), (datetime.datetime(2013, 8, 25, 0, 42, 40), 1950), (datetime.datetime(2013, 8, 26, 0, 28, 13), 1950), (datetime.datetime(2013, 8, 26, 0, 43, 30), 1950), (datetime.datetime(2013, 8, 27, 0, 30, 1), 1950), (datetime.datetime(2013, 8, 27, 0, 43, 43), 1950), (datetime.datetime(2013, 8, 28, 0, 33, 19), 1950), (datetime.datetime(2013, 8, 28, 0, 49, 11), 1950), (datetime.datetime(2013, 8, 29, 0, 26, 49), 1950), (datetime.datetime(2013, 8, 29, 0, 41, 21), 1950), (datetime.datetime(2013, 8, 30, 0, 26, 13), 1950), (datetime.datetime(2013, 8, 30, 0, 42, 9), 1950), (datetime.datetime(2013, 8, 31, 0, 23, 40), 1950), (datetime.datetime(2013, 8, 31, 0, 39, 49), 1950), (datetime.datetime(2013, 9, 1, 0, 22, 2), 1950), (datetime.datetime(2013, 9, 1, 0, 38, 16), 1950), (datetime.datetime(2013, 9, 2, 0, 21, 2), 1950), (datetime.datetime(2013, 9, 2, 0, 36, 19), 1950), (datetime.datetime(2013, 9, 3, 0, 22, 16), 1950), (datetime.datetime(2013, 9, 3, 0, 39, 2), 1900)]
>>> dic = defaultdict(list)
for dt, val in lis:
    dic[dt.date()].append(val)
...     
>>> for k, v in dic.iteritems():
    print k, max(v)
...     
2013-08-20 1950
2013-08-15 2050
2013-08-22 1950
2013-08-09 2055
2013-08-16 1950
2013-08-11 2050
2013-08-18 1950
2013-09-03 1950
2013-09-01 1950
...
>>> dic = {}
>>> for dt, val in lis:
...     dt = dt.date()
...     dic[dt] = max(dic.get(dt,0), val)
...     
>>> for k, v in dic.iteritems():
...     print k,v
...     
2013-08-20 1950
2013-08-15 2050
2013-08-22 1950
2013-08-09 2055
2013-08-16 1950
2013-08-11 2050
2013-08-18 1950
2013-09-03 1950
2013-09-01 1950
...