Python: compute the mean of all tuple values that share the same first number

Consider the list of tuples

[(7751, 0.9407466053962708), (6631, 0.03942129), (7751, 0.1235432)]

How can I compute, in a Pythonic way, the mean of the values of all tuples whose first number is the same? For example, the answer should be

[(7751, 0.532144902698135), (6631, 0.03942129)]

One way is to use collections.defaultdict:

from collections import defaultdict
lst = [(7751, 0.9407466053962708), (6631, 0.03942129), (7751, 0.1235432)]
d_dict = defaultdict(list)
for k, v in lst:
    d_dict[k].append(v)

[(k, sum(v) / len(v)) for k, v in d_dict.items()]
# [(7751, 0.5321449026981354), (6631, 0.03942129)]
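
The same grouping idea can be wrapped in a small reusable function. The following is only a sketch: the name `mean_by_key` is hypothetical and not part of the answer. It keeps a running sum and count per key instead of storing lists, and it assumes Python 3.7+ so that the plain dict returns keys in first-seen order, which matches the order asked for in the question.

```python
def mean_by_key(pairs):
    """Average the second item of each pair, grouped by the first item."""
    totals = {}  # key -> [running sum, count]
    for key, value in pairs:
        acc = totals.setdefault(key, [0.0, 0])
        acc[0] += value
        acc[1] += 1
    # dicts keep insertion order (Python 3.7+), so keys come out first-seen
    return [(key, total / count) for key, (total, count) in totals.items()]


lst = [(7751, 0.9407466053962708), (6631, 0.03942129), (7751, 0.1235432)]
averages = mean_by_key(lst)
print(averages)
```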

You can use groupby:

from itertools import groupby
result = []
# groupby only merges adjacent items, so the list is sorted first
for key, g in groupby(sorted(lst), key=lambda x: x[0]):
    grp = list(g)
    result.append((key, sum(x[1] for x in grp) / len(grp)))

Or, using a list comprehension:

def get_avg(g):
    grp = list(g)  # materialize the group so it can be summed and counted
    return sum(x[1] for x in grp) / len(grp)

result = [(key, get_avg(g)) for key, g in groupby(sorted(lst), key=lambda x: x[0])]

Result:

[(6631, 0.03942129), (7751, 0.5321449026981354)]
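
The sort before grouping matters because itertools.groupby only merges adjacent items with equal keys. This is also why the result above is ordered by key rather than by first appearance. A short sketch of the difference, using operator.itemgetter as the key function (a choice made here for illustration, not taken from the answer):

```python
from itertools import groupby
from operator import itemgetter

lst = [(7751, 0.9407466053962708), (6631, 0.03942129), (7751, 0.1235432)]
first = itemgetter(0)  # key function: the first element of each tuple

# Without sorting, the two 7751 entries are not adjacent and form two groups.
unsorted_keys = [key for key, _ in groupby(lst, key=first)]

# After sorting by the same key, equal keys are adjacent and merge.
sorted_keys = [key for key, _ in groupby(sorted(lst, key=first), key=first)]

print(unsorted_keys)  # [7751, 6631, 7751]
print(sorted_keys)    # [6631, 7751]
```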

groupby from itertools is your friend:

>>> l = [(7751, 0.9407466053962708), (6631, 0.03942129), (7751, 0.1235432)]

>>> # importing libs:
>>> from itertools import groupby
>>> from statistics import mean              # (only python >= 3.4)
>>> # mean = lambda l: sum(l) / float(len(l))  # (for python < 3.4) (*1)

>>> # key function used for both sorting and grouping
>>> first = lambda x: x[0]
>>> data = sorted(l, key=first)

>>> # here it is, the pythonic way:
>>> [(k, mean([m[1] for m in g])) for k, g in groupby(data, first)]

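[(6631, 0.03942129), (7751, 0.5321449026981354)]

Edit (*1): thanks for pointing me to the reference.

Also, where does mean come from?
@danihp updated the answer. Thanks. A list comprehension would be prettier.
@Elmex80s g is a generator, so it is hard to get its length. So that is nearly impossible.
@Elmex80s updated with a list comprehension.
numpy has a mean. You could do float(np.mean(zip(*g)[1]))
@Elmex80s, yes, you are right! Do you know whether importing the whole of numpy is worth it just to get a mean?
There is also the lightweight statistics module, which has a mean; it appears to be Python 3 only.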
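
The remark in the comments that g is hard to measure can be seen directly: each group yielded by groupby is a one-shot iterator, so it has no length and is empty once it has been read. The sketch below is illustrative only (the variable names are not from the answers) and shows why the answers copy the values into a list before averaging.

```python
from itertools import groupby
from operator import itemgetter

lst = [(7751, 0.9407466053962708), (6631, 0.03942129), (7751, 0.1235432)]
first = itemgetter(0)
data = sorted(lst, key=first)

seen = []
for key, group in groupby(data, key=first):
    values = [value for _, value in group]  # reading the group consumes it
    leftover = list(group)                  # already exhausted, so empty
    seen.append((key, len(values), leftover))

print(seen)  # [(6631, 1, []), (7751, 2, [])]
```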