
Python: get the most frequent item in a list


Given a list of tuples, I want to get the tuple that occurs most often, but if there are "joint winners" it should pick one of them at random.

tups = [ (1,2), (3,4), (5,6), (1,2), (3,4) ]

For the list above, it should return either (1,2) or (3,4), chosen at random.

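You can first sort the list, so that equal tuples end up next to each other. After that, a linear scan over the sorted list finds the most frequent tuple. Total time is O(n log n).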
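The sort-then-scan idea can be sketched roughly as below. This is only one way to write it: the function names `tied_winners_sorted` and `most_frequent_sorted` are made up for illustration, and `itertools.groupby` stands in for the hand-written linear scan.

```python
import random
from itertools import groupby

def tied_winners_sorted(items):
    """Return every item that shares the highest count, in sorted order."""
    # Sorting puts equal items next to each other: O(n log n).
    # groupby then walks the sorted list once and yields one run per item.
    runs = [(key, sum(1 for _ in group)) for key, group in groupby(sorted(items))]
    best = max(count for _, count in runs)  # raises ValueError on empty input
    return [key for key, count in runs if count == best]

def most_frequent_sorted(items):
    """Pick one of the tied winners at random."""
    return random.choice(tied_winners_sorted(items))

tups = [(1, 2), (3, 4), (5, 6), (1, 2), (3, 4)]
print(most_frequent_sorted(tups))  # either (1, 2) or (3, 4)
```

Note that this variant needs the items to be orderable, which a hash-based count does not.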


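Use collections.Counter:

>>> import collections
>>> collections.Counter([ (1,2), (3,4), (5,6), (1,2), (3,4) ]).most_common()[0]
((1, 2), 2)

This is O(n log(n)), because most_common() called without an argument sorts all the entries.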

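You can first use Counter to find the highest repetition count. Then collect the tuples that have that count, and finally shuffle them and take the first value.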

from collections import Counter
import random

tups = [ (1,2), (3,4), (5,6), (1,2), (3,4) ]
lst = Counter(tups).most_common()
highest_count = max([i[1] for i in lst])
values = [i[0] for i in lst if i[1] == highest_count]
random.shuffle(values)
print(values[0])

This one should do the job in O(n) time:

>>> from random import shuffle
>>> from collections import Counter
>>>
>>> tups = [(1,2), (3,4), (5,6), (1,2), (3,4)]
>>> c = Counter(tups)                            # count frequencies
>>> m = max(v for _, v in c.items())             # get max frequency
>>> r = [k for k, v in c.items() if v == m]      # all items with highest frequency
>>> shuffle(r)                                   # if you really need random - shuffle
>>> print(r[0])
(3, 4)
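For reuse, the same O(n) approach could be wrapped in a small function. The sketch below is an illustration, not part of the answer above: the name `most_frequent` and the optional `rng` parameter are assumptions, and `random.choice` replaces the shuffle-and-take-first step.

```python
import random
from collections import Counter

def most_frequent(items, rng=random):
    """Return one most-frequent item, chosen uniformly at random among ties."""
    counts = Counter(items)             # one pass over the input
    top = max(counts.values())          # raises ValueError on empty input
    winners = [item for item, n in counts.items() if n == top]
    return rng.choice(winners)          # no need to shuffle the whole list

tups = [(1, 2), (3, 4), (5, 6), (1, 2), (3, 4)]
print(most_frequent(tups))  # either (1, 2) or (3, 4)
```

Passing a seeded `random.Random` instance as `rng` makes the pick reproducible, which can help in tests.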

Use collections.Counter, then pick at random among the most common:

import collections
import random

lis = [ (1,2), (3,4), (5,6), (1,2), (3,4) ]  # Test data
cmn = collections.Counter(lis).most_common()  # (item, count) pairs, most frequent first
most = [e for e in cmn if (e[1] == cmn[0][1])]  # List of those most common
print(random.choice(most)[0])  # Print one of the most common at random

Here is another example without imports:

listAlphaLtrs = ['b','a','a','b','a','c','a','a','b','c','c','b','a','a','a']
# Count each letter; list.count() rescans the list, so this is O(n^2) overall
dictFoundLtrs = {i: listAlphaLtrs.count(i) for i in listAlphaLtrs}
maxcnt = 0
theltr = None
for ltr, foundcnt in dictFoundLtrs.items():
    if foundcnt > maxcnt:  # strict '>' keeps the first letter seen on a tie
        maxcnt = foundcnt
        theltr = ltr
print('most: ' + str(theltr))
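An import-free count can also be done in a single pass with a plain dictionary, which avoids calling `list.count()` for every element. The sketch below is a hypothetical variant (the name `tied_winners` is made up); it returns all tied winners rather than choosing one, since a random pick would need the `random` module.

```python
def tied_winners(items):
    """Return (winners, count): every item that shares the highest count."""
    counts = {}
    for item in items:                        # single pass, O(n) on average
        counts[item] = counts.get(item, 0) + 1
    top = max(counts.values(), default=0)     # 0 when the input is empty
    winners = [item for item, n in counts.items() if n == top]
    return winners, top

letters = ['b','a','a','b','a','c','a','a','b','c','c','b','a','a','a']
print(tied_winners(letters))  # (['a'], 8)
```

Winners come back in first-seen order, because dictionaries preserve insertion order in Python 3.7 and later.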


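Actually, if you need the most frequent item(s), you can do it in O(n). You just count the frequencies, get the maximum, and then get all items with the highest frequency.

At least 4 people didn't know that :)

It's not possible to do this in O(n); calculating all the frequencies alone requires some kind of hash table.

AFAIK Counter is implemented as a dictionary, and setting an element of a dictionary is O(1) on average. In your answer, the most_common() method will sort all the elements, so it will be the heaviest part: O(n log(n)).

@Juddling actually, you don't need most_common(). See my answer.

@RomanPekar I see. Why should I avoid using most_common()?

If I'm not mistaken, most_common() will sort the whole list, and you only need the maximum frequency.

This is not O(n); consider the case where all elements are unique.

Setting an element in a dictionary is O(1) on average.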