Python: how do I count the sizes of the lists in a dict?


python list python-2.7 dictionary


If I have a dict of lists, such as:

{
    'id1': ['a', 'b', 'c'],
    'id2': ['a', 'b'],
    # etc.
}

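I want to count the list sizes, i.e. the number of IDs whose list has more than 0, more than 1, more than 2, ... items.

Is there a simpler way than a nested loop like this: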

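# hypothetical wrapper name; the original snippet was an excerpt from a function body
def countBySize(userIdDict):
    dictOfOutputs = {}
    for x in range(1, 11):
        # count the IDs whose list has more than x items
        count = 0
        for agentId in userIdDict:
            if len(userIdDict[agentId]) > x:
                count += 1
        dictOfOutputs[x] = count
    return dictOfOutputs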

I would use a collections.Counter to collect the lengths, then accumulate the totals:

from collections import Counter

lengths = Counter(len(v) for v in userIdDict.values())
total = 0
accumulated = {}
for length in range(max(lengths), -1, -1):
    count = lengths.get(length, 0)
    total += count
    accumulated[length] = total

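So it collects a count for each length, then builds a dictionary of cumulative counts: accumulated[k] is the number of lists with at least k items. This is an O(N) algorithm: you loop over all values once, then add a few smaller, non-nested loops (for max() and the accumulation loop).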
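As a rough sketch of how the cumulative counts map back to the thresholds in the question, the snippet below wraps the Counter approach in a function. The function name `count_at_least`, the empty-input guard, and the sample data are assumptions added for illustration; they are not part of the original answer. Since `accumulated[k]` is the number of lists with at least `k` items, the number of lists with more than `x` items is `accumulated.get(x + 1, 0)`.

```python
from collections import Counter


def count_at_least(user_id_dict):
    """Map each length k to the number of lists with at least k items."""
    lengths = Counter(len(v) for v in user_id_dict.values())
    if not lengths:
        return {}  # assumption: max() would raise ValueError on an empty dict
    total = 0
    accumulated = {}
    # walk from the longest length down to 0, keeping a running total
    for length in range(max(lengths), -1, -1):
        total += lengths[length]  # a Counter returns 0 for missing keys
        accumulated[length] = total
    return accumulated


# sample data taken from the question
sample = {'id1': ['a', 'b', 'c'], 'id2': ['a', 'b']}
accumulated = count_at_least(sample)

# "more than x items" is the same as "at least x + 1 items"
more_than = {x: accumulated.get(x + 1, 0) for x in range(4)}
print(accumulated)
print(more_than)
```

For the sample data, `more_than[0]` and `more_than[1]` are both 2 (both lists have at least two items), `more_than[2]` is 1, and `more_than[3]` is 0.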

Yes, there is a better way.

First, index the IDs by the length of their data:

from collections import defaultdict

my_dict = {
    'id1': ['a', 'b', 'c'],
    'id2': ['a', 'b'],
}

# map each list length to the IDs whose list has that length
ids_by_data_len = defaultdict(list)

for id_, data in my_dict.items():
    ids_by_data_len[len(data)].append(id_)

Now, build your dict:

output_dict = {}
accumulator = 0
# note: the end of a range is non-inclusive!
for data_len in reversed(range(1, max(ids_by_data_len.keys()) + 1)):
    accumulator += len(ids_by_data_len.get(data_len, []))
    # IDs with at least data_len items have more than data_len - 1 items
    output_dict[data_len-1] = accumulator

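This needs a single O(n) pass over the data instead of one full pass over the dict per threshold (O(n²) in the worst case, when the number of thresholds grows with n), so it is also much faster on large datasets.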
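Putting the two steps together, a minimal self-contained sketch might look like the following. The function names `count_longer_than` and `count_longer_than_nested`, the empty-input guard, and the extra `'id3'` entry are assumptions for illustration only. The nested version is included only as a reference to check that both approaches agree.

```python
from collections import defaultdict


def count_longer_than(my_dict):
    """Map each x to the number of IDs whose list has more than x items."""
    # step 1: index the IDs by the length of their data
    ids_by_data_len = defaultdict(list)
    for key, data in my_dict.items():
        ids_by_data_len[len(data)].append(key)

    # step 2: walk the lengths from longest to shortest, accumulating counts
    output_dict = {}
    accumulator = 0
    longest = max(ids_by_data_len) if ids_by_data_len else 0
    for data_len in range(longest, 0, -1):
        accumulator += len(ids_by_data_len.get(data_len, []))
        output_dict[data_len - 1] = accumulator
    return output_dict


def count_longer_than_nested(my_dict, limit):
    """Nested-loop reference: one full pass over the dict per threshold."""
    return {x: sum(1 for data in my_dict.values() if len(data) > x)
            for x in range(limit)}


# sample data from the question, plus a hypothetical empty list
sample = {'id1': ['a', 'b', 'c'], 'id2': ['a', 'b'], 'id3': []}
result = count_longer_than(sample)
print(result)
```

Using `.get(data_len, [])` rather than indexing the defaultdict avoids creating empty entries for lengths that never occur.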
