Is there a more Pythonic way to write the following maximum-value function:
def greatest(values):
    value_generator = (v for k,v in values)
    max_value = max(value_generator)
    return (k for k,v in values if v == max_value)

sample_data = ( ('id1', 3), ('id2', 5), ('id3', 5) )
items = list( greatest(sample_data) ) # Should produce ['id2', 'id3']
You can compute the maximum value like this:
max_value = max(sample_data, key=lambda x: x[1])[1]
As noted in the comments, you can also use itemgetter as the key function for max(). So your code becomes (using itemgetter and returning the list directly):
from operator import itemgetter

def greatest(values):
    m = max(values, key=itemgetter(1))[1]
    return [k for k, v in values if v == m]
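One caveat, offered as a side note rather than as part of the answer: both the question's version and the itemgetter version walk over values twice (once for max, once for the filter), so they expect a re-iterable sequence such as a tuple or list. A minimal sketch, assuming a hypothetical helper named greatest_any_iterable, that copies the input first so a one-shot generator also works:

```python
from operator import itemgetter


def greatest_any_iterable(values):
    """Return the keys whose value equals the maximum.

    Hypothetical variant: copies `values` into a list first, because the
    function needs two passes (one for max, one for the filter).
    """
    pairs = list(values)
    max_value = max(pairs, key=itemgetter(1))[1]
    return [k for k, v in pairs if v == max_value]


# A generator can only be consumed once, so it makes a useful test input.
one_shot = ((k, v) for k, v in (('id1', 3), ('id2', 5), ('id3', 5)))
result = greatest_any_iterable(one_shot)
print(result)
```

The copy costs extra memory, so for inputs that are already tuples or lists the original two-pass functions are fine as they are.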
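The built-in max function has an optional key parameter that customizes how the data is ordered. Below, it orders the data by the second item of each tuple and returns the maximum: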
import operator

def greatest(values):
    max_value = max(values, key=operator.itemgetter(1))[1]
    return [k for k,v in values if v == max_value]
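To make the role of key concrete, here is a small sketch (the variable names are illustrative only). max ranks items by key(item) but returns the original item, which is why the function above indexes the result with [1]. When several items tie for the maximum, max returns the first one it encounters.

```python
import operator

sample_data = (('id1', 3), ('id2', 5), ('id3', 5))

# max() ranks each tuple by its second field but returns the whole tuple.
top_pair = max(sample_data, key=operator.itemgetter(1))

# 'id2' and 'id3' both have value 5; max() keeps the first one it meets,
# so a second pass is still needed to collect every tied key.
top_key, top_value = top_pair
print(top_pair)
```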
Try this:
>>> sample_data = ('id1',3),('id2',5),('id3',5)
>>> def greatest(values):
...     m = max(values,key=lambda n: n[1])[1]
...     return [k for k,v in values if v==m]
...
>>> greatest(sample_data)
['id2', 'id3']
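The versions shown so far raise ValueError on an empty input, because max() has nothing to return. A hedged sketch of one way to handle that, assuming Python 3.4 or later (where max accepts a default argument) and a hypothetical function name greatest_or_empty:

```python
def greatest_or_empty(values):
    """Like greatest(), but returns [] for an empty sequence (hypothetical variant)."""
    # default=None is returned instead of raising ValueError when values is empty.
    top = max(values, key=lambda n: n[1], default=None)
    if top is None:
        return []
    m = top[1]
    return [k for k, v in values if v == m]


empty_result = greatest_or_empty(())
normal_result = greatest_or_empty((('id1', 3), ('id2', 5), ('id3', 5)))
print(empty_result, normal_result)
```

Whether an empty list or an exception is the better behavior depends on the caller; this is only one option.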
Then use it like this:
>>> sample_data = (('id1', 3), ('id2', 5), ('id3', 5))
>>> greatest(sample_data)
['id2', 'id3']
MapReduce, anyone? If you want to use map:
>>> sample_data = ( ('id1', 3), ('id2', 5), ('id3', 5) )
>>> max_value = max(sample_data, key=lambda x: x[1])[1]
>>> list(map(lambda x: x[0], filter(lambda x: x[1] == max_value, sample_data)))
['id2', 'id3']
In fact, by my tests, your version is faster, if only by a little:
>>> import random
>>> from operator import itemgetter
>>> def greatest_orig(values):
...     value_generator = (v for k,v in values)
...     max_value = max(value_generator)
...     return (k for k,v in values if v == max_value)
...
>>> def greatest_max_key(values):
...     max_value = max(values, key=itemgetter(1))[1]
...     return (k for k,v in values if v == max_value)
...
>>> sample_data = tuple(('id' + str(i), random.randrange(0, 1000)) for i in range(10000))
>>> list(greatest_orig(sample_data)) == list(greatest_max_key(sample_data))
True
>>> %timeit list(greatest_orig(sample_data))
1000 loops, best of 3: 1.67 ms per loop
>>> %timeit list(greatest_max_key(sample_data))
1000 loops, best of 3: 1.74 ms per loop
Of course, if you don't like assigning the generator to a name, you can pass the generator directly to max, which is more readable than max(values, key=itemgetter(1))[1], IMHO:
>>> def greatest_max_iter(values):
...     max_value = max((v for k, v in values))
...     return (k for k, v in values if v == max_value)
...
>>> list(greatest_orig(sample_data)) == list(greatest_max_iter(sample_data))
True
>>> %timeit list(greatest_max_iter(sample_data))
1000 loops, best of 3: 1.67 ms per loop
Python allows you to omit the outer parentheses when doing this:
>>> def greatest_max_iter(values):
...     max_value = max(v for k, v in values)
...     return (k for k, v in values if v == max_value)
...
For reasons I don't understand, though, this is slightly slower.
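The %timeit lines above are IPython magics. As a rough sketch of how the same comparison could be run as a plain script with the standard timeit module (the loop count of 100 and the fixed seed are arbitrary choices for illustration, and timings will differ between machines and Python versions):

```python
import random
import timeit
from operator import itemgetter


def greatest_orig(values):
    value_generator = (v for k, v in values)
    max_value = max(value_generator)
    return (k for k, v in values if v == max_value)


def greatest_max_key(values):
    max_value = max(values, key=itemgetter(1))[1]
    return (k for k, v in values if v == max_value)


random.seed(0)  # fixed seed so repeated runs use the same data
sample_data = tuple(('id' + str(i), random.randrange(0, 1000)) for i in range(10000))

# Both variants must agree before their timings are worth comparing.
same = list(greatest_orig(sample_data)) == list(greatest_max_key(sample_data))

for func in (greatest_orig, greatest_max_key):
    # Total seconds for 100 calls; list() forces the returned generator to run.
    seconds = timeit.timeit(lambda: list(func(sample_data)), number=100)
    print(func.__name__, seconds / 100)
```

Differences of a few percent, like the ones reported above, are easily within run-to-run noise, so it is worth repeating the measurement before drawing conclusions.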
These are all really micro-optimizations that don't matter much. But I think readability favors max(v for k, v in values) or max((v for k, v in values)) over max(values, key=itemgetter(1))[1].
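The two spellings preferred here are the same generator expression; the extra parentheses can be dropped only when the generator expression is the sole argument of the call. A small sketch of the rule (the default=0 argument assumes Python 3.4 or later):

```python
values = (('id1', 3), ('id2', 5), ('id3', 5))

# Sole argument: the call's own parentheses double as the generator's.
bare = max(v for k, v in values)

# Explicit parentheses mean exactly the same thing.
wrapped = max((v for k, v in values))

# With a second argument the generator expression must be parenthesized;
# writing max(v for k, v in values, default=0) is a SyntaxError.
with_default = max((v for k, v in values), default=0)

print(bare, wrapped, with_default)
```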
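operator.itemgetter(1) is an equivalent, faster way to spell lambda n: n[1].
This only returns one value.
@Johnsyweb, right, this is only meant to replace the OP's max_value computation. Edited to prevent misunderstanding.
+1 for a nice functional exercise. Not sure it's more Pythonic, though.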
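On the itemgetter remark above, a quick sketch showing that the two key functions select the same field. The speed claim is the commenter's; this snippet only checks equivalence and does not measure anything.

```python
from operator import itemgetter

sample_data = (('id1', 3), ('id2', 5), ('id3', 5))

by_itemgetter = max(sample_data, key=itemgetter(1))
by_lambda = max(sample_data, key=lambda n: n[1])

# itemgetter(1)(x) evaluates to x[1], so both calls pick the same tuple.
print(by_itemgetter == by_lambda)
```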