Creating a dictionary of dicts from a single list (Python 3)

python python-3.x dictionary

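Python 3.6.5/3.7.1 on Linux

I'm struggling to create a dictionary whose values are themselves dictionaries.

I want to build a dictionary from a list of date and time data (ultimately to create a chart with bokeh).

This has surely been asked before, but I can't find a set of search terms that returns results that clear things up for me.

Note that I'm basically a hobby coder and I don't easily think algorithmically the way a real programmer does.

The data is in a list (up to 3200 items): each item records one event that occurred on a given date within a one-hour clock period.

So ['03/01/19 09:00', '03/01/19 09:00', '03/01/19 09:00'] means 3 events occurred between 0900 and 1000 on 3 January 2019.

Only clock periods with events are recorded, so if there were no events there is no timestamp.

NB the date format is dd/mm/yy.

Sample data: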

dtl = [
    '06/01/19 12:00', '06/01/19 12:00', '06/01/19 11:00', '05/01/19 21:00',
    '05/01/19 17:00', '05/01/19 17:00', '05/01/19 14:00', '03/01/19 21:00',
    '03/01/19 17:00', '03/01/19 12:00', '03/01/19 12:00', '03/01/19 12:00',
    '03/01/19 12:00', '03/01/19 12:00', '03/01/19 11:00', '03/01/19 10:00',
    '03/01/19 10:00', '03/01/19 09:00','03/01/19 09:00','03/01/19 09:00',
]

The desired dictionary looks like this:

dtd = {
    '03/01/19': {
         '00': 0, '01': 0, '02': 0, '03': 0, '04': 0, '05': 0,
         '06': 0, '07': 0, '08': 0, '09': 3, '10': 2, '11': 1,
         '12': 5, '13': 0, '14': 0, '15': 0, '16': 0, '17': 1,
         '18': 0, '19': 0, '20': 0, '21': 1, '22': 0, '23': 0,
     },
     '04/01/19': {
         '00': 0, ... '23': 0
     },
     '05/01/19': {
         '00': 0, ... 
     } ... etc
}

Obviously I can at least initialize the dictionary with the keys like this:

{i.split()[0]:{} for i in dtl}

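But I can't work out how to update the sub-dictionaries with the counts, so I can't find a way from the original list to the desired dictionary. I'm going round in circles.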

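Once you have split the data by date into a dictionary, you can do this very efficiently by combining a defaultdict with a Counter. So start by splitting by date: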

from collections import Counter, defaultdict

dtd = defaultdict(list)
for date, time in (item.split() for item in dtl):
    dtd[date].append(time[:2])

Now you can easily count the existing items and use the counts to initialize a defaultdict that will return zero for missing times:

for key in dtd:
    dtd[key] = defaultdict(int, Counter(dtd[key]))

The result is:

defaultdict(list, {
    '03/01/19': defaultdict(int, {
        '09': 3,
        '10': 2,
        '11': 1,
        '12': 5,
        '17': 1,
        '21': 1
    }),
    '05/01/19': defaultdict(int, {'14': 1, '17': 2, '21': 1}),
    '06/01/19': defaultdict(int, {'11': 1, '12': 2})
})

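Because the objects here are defaultdicts, you can query dates and times that are not in the original dataset. You can avoid that by converting the result to a regular dict that contains only the keys you want once you are done: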

hours = ['%02d' % h for h in range(24)]
dtd = {date: {h: d[h] for h in hours} for date, d in dtd.items()}
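Putting the steps above together, here is a minimal end-to-end sketch. The function name count_by_date_and_hour and the constant HOURS are made up for illustration and are not part of the answer; the sketch assumes every item has the form 'dd/mm/yy HH:MM'.

```python
from collections import Counter, defaultdict

# Zero-padded hour keys '00' .. '23'
HOURS = ['%02d' % h for h in range(24)]

def count_by_date_and_hour(items):
    """Return {date: {hour: count}} with all 24 hours present for each date."""
    # Step 1: group the hour part of every timestamp by its date
    by_date = defaultdict(list)
    for date, time in (item.split() for item in items):
        by_date[date].append(time[:2])

    # Step 2: count the hours per date and emit a plain nested dict
    result = {}
    for date, hours_seen in by_date.items():
        counts = Counter(hours_seen)  # a Counter yields 0 for missing keys
        result[date] = {h: counts[h] for h in HOURS}
    return result

sample = ['03/01/19 09:00', '03/01/19 09:00', '03/01/19 10:00', '05/01/19 17:00']
dtd = count_by_date_and_hour(sample)
print(dtd['03/01/19']['09'])  # 2
print(dtd['05/01/19']['00'])  # 0
```

Because the result is built from plain dicts, looking up an unknown date raises KeyError instead of silently creating an entry.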

I'd suggest using collections.defaultdict, since some of the counts can be 0.

Here's one option:

from collections import defaultdict

dtl = ['06/01/19 12:00', '06/01/19 12:00', '06/01/19 11:00', 
       '05/01/19 21:00', '05/01/19 17:00', '05/01/19 17:00', 
       '05/01/19 14:00', '03/01/19 21:00', '03/01/19 17:00',
       '03/01/19 12:00', '03/01/19 12:00', '03/01/19 12:00', 
       '03/01/19 12:00', '03/01/19 12:00', '03/01/19 11:00', 
       '03/01/19 10:00', '03/01/19 10:00', '03/01/19 09:00',
       '03/01/19 09:00','03/01/19 09:00',]

# Nested defaultdict
result = defaultdict(lambda: defaultdict(int))

for date_time in dtl:
    date, time = date_time.split()
    result[date][time.split(':')[0]] += 1

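Output (using pprint), with every hour from '00' to '23' filled in for each date; the loop above on its own stores only the hours that actually occur: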

{'06/01/19': {'00': 0, '01': 0, '02': 0, '03': 0, '04': 0, '05': 0, '06': 0, '07': 0, '08': 0, '09': 0, '10': 0, '11': 1, '12': 2, '13': 0, '14': 0, '15': 0, '16': 0, '17': 0, '18': 0, '19': 0, '20': 0, '21': 0, '22': 0, '23': 0}, '05/01/19': {'00': 0, '01': 0, '02': 0, '03': 0, '04': 0, '05': 0, '06': 0, '07': 0, '08': 0, '09': 0, '10': 0, '11': 0, '12': 0, '13': 0, '14': 1, '15': 0, '16': 0, '17': 2, '18': 0, '19': 0, '20': 0, '21': 1, '22': 0, '23': 0}, '03/01/19': {'00': 0, '01': 0, '02': 0, '03': 0, '04': 0, '05': 0, '06': 0, '07': 0, '08': 0, '09': 3, '10': 2, '11': 1, '12': 5, '13': 0, '14': 0, '15': 0, '16': 0, '17': 1, '18': 0, '19': 0, '20': 0, '21': 1, '22': 0, '23': 0}}
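The nested defaultdict built by the loop holds only the hours that actually occurred, so a fill-in step is needed to get all 24 hours per date. One way to do that, sketched here on the assumption that a plain nested dict is wanted at the end (the names HOURS and filled are illustrative and do not come from the answer):

```python
from collections import defaultdict

dtl = ['06/01/19 12:00', '06/01/19 12:00', '06/01/19 11:00', '03/01/19 09:00']

# Same counting loop as in the answer
result = defaultdict(lambda: defaultdict(int))
for date_time in dtl:
    date, time = date_time.split()
    result[date][time.split(':')[0]] += 1

HOURS = ['%02d' % h for h in range(24)]

# .get() reads without inserting, so result itself is left untouched
filled = {date: {h: counts.get(h, 0) for h in HOURS}
          for date, counts in result.items()}

print(filled['06/01/19']['12'])  # 2
print(filled['03/01/19']['23'])  # 0
```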

A quick and dirty way would be:

#!/usr/bin/env python3

def convert(dt):
    ret = {}
    for elem in dt:
        d, t = elem.split()
        t = t.split(":")[0]

        # first time we see this date: start every hour at zero
        if d not in ret:
            ret[d] = {'%02d' % h: 0 for h in range(24)}

        # count this event, including the first one for a new date
        # (hours outside '00'..'23' are ignored)
        if t in ret[d]:
            ret[d][t] += 1
    return ret

dtl = ['06/01/19 12:00', '06/01/19 12:00', '06/01/19 11:00', '05/01/19 21:00', '05/01/19 17:00', '05/01/19 17:00', '05/01/19 14:00', '03/01/19 21:00', '03/01/19 17:00','03/01/19 12:00', '03/01/19 12:00', '03/01/19 12:00', '03/01/19 12:00', '03/01/19 12:00', '03/01/19 11:00', '03/01/19 10:00', '03/01/19 10:00', '03/01/19 09:00','03/01/19 09:00','03/01/19 09:00']

print(convert(dtl))
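The desired dictionary in the question also contains '04/01/19', a date with no events at all, whereas the approaches above only create entries for dates that appear in the list. A possible way to add the missing days, sketched with the standard datetime module: fill_missing_dates and DATE_FMT are hypothetical names, and the sketch assumes the keys are dd/mm/yy strings and that every day between the earliest and latest date should be included.

```python
from datetime import datetime, timedelta

DATE_FMT = '%d/%m/%y'  # dd/mm/yy, as stated in the question

def fill_missing_dates(counts):
    """Return a copy of counts with an all-zero entry for every absent
    date between the earliest and the latest date present."""
    if not counts:
        return {}
    # Parse the keys so they sort chronologically, not as strings
    days = sorted(datetime.strptime(d, DATE_FMT) for d in counts)
    zero_day = {'%02d' % h: 0 for h in range(24)}

    filled = {}
    day = days[0]
    while day <= days[-1]:
        key = day.strftime(DATE_FMT)
        # dict(...) copies, so no two dates share the same inner dict
        filled[key] = dict(counts.get(key, zero_day))
        day += timedelta(days=1)
    return filled

counts = {
    '03/01/19': {'%02d' % h: 0 for h in range(24)},
    '05/01/19': {'%02d' % h: 0 for h in range(24)},
}
counts['03/01/19']['09'] = 3

full = fill_missing_dates(counts)
print(list(full))  # ['03/01/19', '04/01/19', '05/01/19']
```

Parsing the dates matters here: sorted as plain strings, dd/mm/yy keys would be ordered by day of month first, which breaks across month boundaries.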

Just to confirm, all the times are on the hour? You won't see anything like 09:30?

@hqkhan correct

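ret[d]={}. Did you forget to insert on the first occurrence and add 1? Similarly, in the if part you should start from 1, not 0.

Yes, you're right, and the formatting was off too.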

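Both this answer and @Mad Physicist's answer are to the point and very helpful, but this one has the edge for me because its last part fills in the "empty" values.

I'm a little confused by the sentence "use them to initialize a defaultdict that will return zero for missing times:", since the example output doesn't show the "missing hours"; but I may have misunderstood the intent.

@coderedded It has to do with how defaultdict works. If the key you look up isn't actually in the dict, it initializes the value for you. For example, take test = {1: 20, 2: 25}. Assuming test is a defaultdict and you ask for test[15], then even though 15 doesn't exist in test you get the default value back instead of a KeyError. You can get the same behaviour with .get(key, default_value), but defaultdict keeps things tidy.

@hqkhan. The advantage of defaultdict over get or setdefault is that defaultdict only calls the constructor you passed in when the key is missing. With the dict methods, the constructor has to be called every time before the method is called, which usually causes harmless but unnecessary overhead.

@Mad Physicist Nice. Good to know.

@coderedded. A defaultdict only shows the keys you have set, but it lets you query any key: dtd['blah'] will return [], while dtd['03/01/19']['blah'] will return 0. Since it makes sense that you wouldn't want that for arbitrary dates and times, I've added a snippet to my answer that quickly converts the result to a normal nested dictionary.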
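To make the points raised in the comments concrete, here is a small sketch of the behaviour being discussed. The default factory int is an assumption (the comment does not say which factory test uses), and make_list is a hypothetical helper that only records how often it is called.

```python
from collections import defaultdict

# Missing keys: defaultdict returns a default and stores it
test = defaultdict(int, {1: 20, 2: 25})
print(test[15])    # 0, no KeyError
print(15 in test)  # True: the lookup inserted the key

# dict.get returns a default too, but does not store it
plain = {1: 20, 2: 25}
print(plain.get(15, 0))  # 0
print(15 in plain)       # False

# Count how often the "constructor" runs
calls = []
def make_list():
    calls.append(1)
    return []

lazy = defaultdict(make_list, {'a': [1]})
lazy['a']  # key exists, so the factory is not called

eager = {'a': [1]}
eager.setdefault('a', make_list())  # argument is built before the call

print(len(calls))  # 1: only the setdefault line ran make_list
```

The side effect shown on the second print is the reason for converting to a plain dict at the end: merely reading a missing key from a defaultdict adds that key.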