Python collections.Counter: most_common including equal counts


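In collections.Counter, the method most_common(n) returns only the n most frequent items in a list. That is exactly what I need, but I also need to include the items whose counts are equal.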

from collections import Counter
test = Counter(["A","A","A","B","B","C","C","D","D","E","F","G","H"])
-->Counter({'A': 3, 'C': 2, 'B': 2, 'D': 2, 'E': 1, 'G': 1, 'F': 1, 'H': 1})
test.most_common(2)
-->[('A', 3), ('C', 2)]

I need:

[('A',3),('B',2),('C',2),('D',2)]

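because in this case they all have the same count as the item at n=2. My real data is DNA code and can be quite large, so I need this to be reasonably efficient.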

You can do it like this:

from collections import Counter
from itertools import takewhile

def get_items_upto_count(dct, n):
  data = dct.most_common()  # all items, sorted by count in descending order
  val = data[n-1][1]  # count of the n-th most common item (index n-1)
  # Collect all leading items whose count is greater than or equal to `val`.
  return list(takewhile(lambda x: x[1] >= val, data))

test = Counter(["A","A","A","B","B","C","C","D","D","E","F","G","H"])

print(get_items_upto_count(test, 2))
# On Python 3.7+: [('A', 3), ('B', 2), ('C', 2), ('D', 2)]
# (tied items may come out in a different order on older versions)
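
One thing to watch with this approach: data[n-1] raises IndexError when n is larger than the number of distinct items, and a non-positive n indexes from the wrong end. Below is a minimal sketch of a guarded variant. The name most_common_with_ties and the choice to return everything when n is too large are my own assumptions, not part of the answer above.

```python
from collections import Counter
from itertools import takewhile


def most_common_with_ties(counter, n):
    """Return the n most common items plus any items tied with the n-th one.

    Hypothetical helper: the edge-case behaviour below is an assumption.
    """
    if n <= 0:
        return []
    ranked = counter.most_common()  # full list, sorted by count descending
    if n >= len(ranked):
        return ranked  # fewer than n distinct items, so there is nothing to cut
    cutoff = ranked[n - 1][1]  # count of the n-th most common item
    return list(takewhile(lambda item: item[1] >= cutoff, ranked))


test = Counter(["A", "A", "A", "B", "B", "C", "C", "D", "D", "E", "F", "G", "H"])
print(sorted(most_common_with_ties(test, 2)))
# [('A', 3), ('B', 2), ('C', 2), ('D', 2)]
```

The result is sorted before printing only so that the output does not depend on how ties are ordered.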

For smaller collections, just write a simple generator expression:

>>> test = Counter(["A","A","A","B","B","C","C","D","D","E","F","G","H"])
>>> g=(e for e in test.most_common() if e[1]>=2)
>>> list(g)
[('A', 3), ('D', 2), ('C', 2), ('B', 2)]

For larger collections, filter lazily instead (on Python 3, the built-in filter already returns an iterator):
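
The code for this variant did not survive in the thread, so here is a rough sketch of what it could look like on Python 3, assuming a fixed threshold of 2 as in the generator example. Note that most_common() still builds the full sorted list first, so the laziness only applies to the filtering step.

```python
from collections import Counter

test = Counter(["A", "A", "A", "B", "B", "C", "C", "D", "D", "E", "F", "G", "H"])

# On Python 3, filter() returns an iterator; items are tested only as they are consumed.
at_least_two = filter(lambda item: item[1] >= 2, test.most_common())

result = list(at_least_two)
print(sorted(result))
# [('A', 3), ('B', 2), ('C', 2), ('D', 2)]
```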

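Or, since the result of most_common() is already ordered, just use a for loop inside a generator and break as soon as the desired condition fails: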

def fc(d, f):
    # most_common() is sorted by count, descending, so stop at the first failure.
    for t in d.most_common():
        if not f(t[1]):
            break
        yield t

>>> list(fc(test, lambda e: e>=2)) 
[('A', 3), ('B', 2), ('C', 2), ('D', 2)]
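
Since the question mentions large data: all of the approaches above call most_common() with no argument, which sorts every item. A possible alternative, sketched below, is to ask most_common(n) only for the cutoff count and then make one pass over the counter. In CPython, most_common(n) with an explicit n is implemented with heapq.nlargest, so it avoids the full sort. The function name top_n_with_ties is hypothetical, and I have not benchmarked this against the other answers.

```python
from collections import Counter


def top_n_with_ties(counter, n):
    """Return items whose count is at least that of the n-th most common item."""
    if n <= 0 or not counter:
        return []
    # Only the n largest counts are needed to find the cutoff.
    cutoff = counter.most_common(n)[-1][1]
    # Single pass over all items, keeping those at or above the cutoff.
    kept = [(key, count) for key, count in counter.items() if count >= cutoff]
    # Sort only the (usually small) kept list, highest count first.
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept


test = Counter(["A", "A", "A", "B", "B", "C", "C", "D", "D", "E", "F", "G", "H"])
print(sorted(top_n_with_ties(test, 2)))
# [('A', 3), ('B', 2), ('C', 2), ('D', 2)]
```

If n exceeds the number of distinct items, the cutoff becomes the smallest count and everything is returned.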

You have a typo, please correct it. It should be get_item_upto_count(test,2)
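Why not just take a slice of the first n elements, plus the following elements whose value equals that of the last element in the slice?
@padraiccnningham A slice of what?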