Python: group/count a list of dictionaries based on a value


I have a list of tokens that looks like:

[{
    Value: "Blah",
    StartOffset: 0,
    EndOffset: 4
}, ... ]
What I want to do is count how many times each value appears in the token list.

In VB.Net, I would do something like:

Tokens = Tokens.
GroupBy(Function(x) x.Value).
Select(Function(g) New With {
           .Value = g.Key,
           .Count = g.Count})

What is the equivalent in Python?

IIUC, you can use collections.Counter:

>>> from collections import Counter
>>> tokens = [{"Value": "Blah", "SO": 0}, {"Value": "zoom", "SO": 5}, {"Value": "Blah", "SO": 2}, {"Value": "Blah", "SO": 3}]
>>> Counter(tok['Value'] for tok in tokens)
Counter({'Blah': 3, 'zoom': 1})
That works if you only need the counts. If you want to group the tokens by value, you can use itertools.groupby with something like the following:

>>> from itertools import groupby
>>> def keyfn(x):
        return x['Value']
... 
>>> [(k, list(g)) for k,g in groupby(sorted(tokens, key=keyfn), keyfn)]
[('Blah', [{'SO': 0, 'Value': 'Blah'}, {'SO': 2, 'Value': 'Blah'}, {'SO': 3, 'Value': 'Blah'}]), ('zoom', [{'SO': 5, 'Value': 'zoom'}])]
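This is slightly tricky, though, because groupby only groups consecutive items with the same key, so the list has to be sorted by that key first.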
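If the sort step feels awkward, one alternative is to build the groups in a single pass with collections.defaultdict. This is only a minimal sketch: the helper name group_by_value and its key parameter are made up for illustration, and the sample tokens simply mirror the ones used above.

```python
from collections import defaultdict

# Sample data mirroring the tokens used in the answer above.
tokens = [
    {"Value": "Blah", "SO": 0},
    {"Value": "zoom", "SO": 5},
    {"Value": "Blah", "SO": 2},
    {"Value": "Blah", "SO": 3},
]


def group_by_value(items, key="Value"):
    """Group dictionaries by items[key] in one pass, with no sorting needed.

    Hypothetical helper for illustration; not part of any library.
    """
    groups = defaultdict(list)
    for item in items:
        groups[item[key]].append(item)
    return dict(groups)


groups = group_by_value(tokens)

# The counts fall out of the groups directly.
counts = {value: len(members) for value, members in groups.items()}
print(counts)
```

Unlike the groupby version, this keeps the tokens inside each group in their original order and does not require the key values to be sortable.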

import collections

# example token list
tokens = [{'Value':'Blah', 'Start':0}, {'Value':'BlahBlah'}]

count=collections.Counter([d['Value'] for d in tokens])
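print(count)
This prints a Counter that maps each value to the number of times it occurs (here, each value occurs once).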


Suppose this is your Python list containing the dictionaries:

my_list = [{'Value': 'Blah',
            'StartOffset': 0,
            'EndOffset': 4},
           {'Value': 'oqwij',
            'StartOffset': 13,
            'EndOffset': 98},
           {'Value': 'Blah',
            'StartOffset': 6,
            'EndOffset': 18}]
As a one-liner:

len([i for i in my_list if i['Value'] == 'Blah']) # returns 2
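The one-liner counts a single value at a time. To get something closer to the shape of the VB.Net query in the question (one record per distinct value, with its count), a rough sketch could look like this. The function name count_values and the "Count" key are assumptions chosen to mirror the VB.Net snippet, not anything built in.

```python
from collections import Counter

my_list = [
    {"Value": "Blah", "StartOffset": 0, "EndOffset": 4},
    {"Value": "oqwij", "StartOffset": 13, "EndOffset": 98},
    {"Value": "Blah", "StartOffset": 6, "EndOffset": 18},
]


def count_values(items, key="Value"):
    """Return one {"Value": ..., "Count": ...} record per distinct value.

    Hypothetical helper; records are ordered from most to least frequent.
    """
    counts = Counter(item[key] for item in items)
    return [{"Value": value, "Count": n} for value, n in counts.most_common()]


result = count_values(my_list)
print(result)
```

Counter does the counting in one pass over the list, so this avoids rescanning the list once per value.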

Thanks, this is exactly what I was looking for. I'm still getting my head around doing things the Pythonic way.