Is there a more Pythonic way to write the following max function?

MapReduce, anyone?

You can compute the maximum like this:

def greatest(values):
    value_generator = (v for k,v in values)
    max_value = max(value_generator)
    return (k for k,v in values if v == max_value)

sample_data = ( ('id1', 3), ('id2', 5), ('id3', 5) )
items = list(greatest(sample_data))  # Should produce ['id2', 'id3']

from operator import itemgetter

def greatest(values):
    m = max(values, key=itemgetter(1))[1]
    return [k for k,v in values if v == m]
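The built-in max function has an optional key parameter that customizes how the data is ordered. Below, it orders the data by the second item of each tuple and returns the maximum:

max_value = max(sample_data, key=lambda x: x[1])[1]

As noted in the comments, you can also use itemgetter as the key for max(). So your code becomes (using itemgetter and returning a list directly):

import operator

def greatest(values):
    max_value = max(values, key=operator.itemgetter(1))[1]
    return [k for k, v in values if v == max_value]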
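
As a small sketch of why the trailing [1] and the second pass are both needed: max() with a key returns the whole winning tuple, and on ties it returns only the first one it encounters. The names top_pair and tied_ids are made up for this illustration.

```python
from operator import itemgetter

sample_data = (('id1', 3), ('id2', 5), ('id3', 5))

# max() with a key returns the entire item, not the key's value,
# and only the first maximal item when several are tied.
top_pair = max(sample_data, key=itemgetter(1))
max_value = top_pair[1]

# A second pass is still needed to collect every id that shares the maximum.
tied_ids = [k for k, v in sample_data if v == max_value]

print(top_pair)  # ('id2', 5)
print(tied_ids)  # ['id2', 'id3']
```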

Try this:

>>> sample_data = ('id1',3),('id2',5),('id3',5)
>>> def greatest(values):
...   m = max(values,key=lambda n: n[1])[1]
...   return [k for k,v in values if v==m]
...
>>> greatest(sample_data)
['id2', 'id3']
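
The lambda key in the session above can be swapped for operator.itemgetter(1). The following is only a sketch to check that the two agree on the sample data; the key argument on greatest and the names by_lambda and by_itemgetter are hypothetical additions for the demo.

```python
from operator import itemgetter

sample_data = (('id1', 3), ('id2', 5), ('id3', 5))

by_lambda = lambda pair: pair[1]
by_itemgetter = itemgetter(1)

# Both key functions extract the second element of each tuple.
assert [by_lambda(p) for p in sample_data] == [by_itemgetter(p) for p in sample_data]

def greatest(values, key=by_itemgetter):
    """Return the ids whose value equals the maximum (key argument is hypothetical)."""
    m = key(max(values, key=key))
    return [pair[0] for pair in values if key(pair) == m]

print(greatest(sample_data, key=by_lambda))      # ['id2', 'id3']
print(greatest(sample_data, key=by_itemgetter))  # ['id2', 'id3']
```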
Then use it like this:

>>> sample_data = (('id1', 3), ('id2', 5), ('id3', 5))
>>> greatest(sample_data)
['id2', 'id3']

If you want to use map:

>>> sample_data = (('id1', 3), ('id2', 5), ('id3', 5))
>>> max_value = max(sample_data, key=lambda x: x[1])[1]
>>> list(map(lambda x: x[0], filter(lambda x: x[1] == max_value, sample_data)))
['id2', 'id3']

In fact, according to my tests, your original version (greatest_orig below) is faster, if only slightly:

>>> import random
>>> from operator import itemgetter
>>> def greatest_orig(values):
...     value_generator = (v for k, v in values)
...     max_value = max(value_generator)
...     return (k for k, v in values if v == max_value)
...
>>> def greatest_max_key(values):
...     max_value = max(values, key=itemgetter(1))[1]
...     return (k for k, v in values if v == max_value)
...
>>> sample_data = tuple(('id' + str(i), random.randrange(0, 1000)) for i in range(10000))
>>> list(greatest_orig(sample_data)) == list(greatest_max_key(sample_data))
True
>>> %timeit list(greatest_orig(sample_data))
1000 loops, best of 3: 1.67 ms per loop
>>> %timeit list(greatest_max_key(sample_data))
1000 loops, best of 3: 1.74 ms per loop

Of course, if you don't like assigning the generator to a name, you can pass it directly to max, which is more readable than max(values, key=itemgetter(1))[1], IMHO:

>>> def greatest_max_iter(values):
...     max_value = max((v for k, v in values))
...     return (k for k, v in values if v == max_value)
...
>>> list(greatest_orig(sample_data)) == list(greatest_max_iter(sample_data))
True
>>> %timeit list(greatest_max_iter(sample_data))
1000 loops, best of 3: 1.67 ms per loop

Python allows you to omit the outer parentheses when doing so:

>>> def greatest_max_iter(values):
...     max_value = max(v for k, v in values)
...     return (k for k, v in values if v == max_value)
...

But for reasons I don't understand, doing so is slightly slower.
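
For anyone without IPython's %timeit, a rough way to repeat this kind of comparison is the standard-library timeit module. This is only a sketch: the fixed seed, the repetition count of 20, and the timings dictionary are assumptions made for the illustration, and the numbers will vary by machine and Python version.

```python
import random
import timeit
from operator import itemgetter

def greatest_orig(values):
    value_generator = (v for k, v in values)
    max_value = max(value_generator)
    return (k for k, v in values if v == max_value)

def greatest_max_key(values):
    max_value = max(values, key=itemgetter(1))[1]
    return (k for k, v in values if v == max_value)

def greatest_max_iter(values):
    max_value = max(v for k, v in values)
    return (k for k, v in values if v == max_value)

random.seed(0)  # assumption: fixed seed so repeated runs use the same data
sample_data = tuple(('id' + str(i), random.randrange(0, 1000)) for i in range(10000))

runs = 20  # assumption: small repetition count to keep the sketch quick
expected = list(greatest_orig(sample_data))
timings = {}
for func in (greatest_orig, greatest_max_key, greatest_max_iter):
    # Check that every variant returns the same ids before timing it.
    assert list(func(sample_data)) == expected
    timings[func.__name__] = timeit.timeit(lambda: list(func(sample_data)), number=runs)

for name, seconds in timings.items():
    print('%s: %.3f ms per call' % (name, seconds / runs * 1000))
```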

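These are all true micro-optimizations and don't matter much. But I think readability favors max(v for k, v in values) or max((v for k, v in values)) over max(values, key=itemgetter(1))[1].

operator.itemgetter(1) is an equivalent, faster spelling of lambda n: n[1].

This only returns one value.

@Johnsyweb, right, this was only meant to replace the OP's max_value calculation. Edited to prevent misunderstanding.

+1 for a nice functional exercise. Not sure it's more Pythonic, though.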