Python: take the average of values in a list if duplicates are found

python arrays list for-loop

I have two lists that are related to each other. For example, here 'John' is associated with 1, 'Bob' with 4, and so on:

l1 = ['John', 'Bob', 'Stew', 'John']
l2 = [1, 4, 7, 3]

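My problem is the duplicate John. Instead of adding the duplicate John, I want to take the mean of the values associated with the Johns, i.e. 1 and 3, which is (3 + 1) / 2 = 2. So I would like the lists to actually be: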

l1 = ['John', 'Bob', 'Stew']
l2 = [2, 4, 7]

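I have tried a few solutions involving for-loops and the "contains" function, but I can't seem to piece them together. I'm not very experienced with Python, but linked lists sound like they could be used for this.

Thank you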

The following might give you an idea. It uses an OrderedDict, assuming you want the items in the order in which they appear in the original list:

from collections import OrderedDict

d = OrderedDict()
for x, y in zip(l1, l2):
    d.setdefault(x, []).append(y)
# OrderedDict([('John', [1, 3]), ('Bob', [4]), ('Stew', [7])])


names, values = zip(*((k, sum(v)/len(v)) for k, v in d.items()))
# ('John', 'Bob', 'Stew')
# (2.0, 4.0, 7.0)
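
On Python 3.7 and later a plain dict also preserves insertion order, so the same grouping can be sketched without OrderedDict. The helper name `average_duplicates` is hypothetical (it is not part of the answer above), and `statistics.mean` is swapped in for `sum(v)/len(v)`; both lists are assumed to have the same length.

```python
from statistics import mean


def average_duplicates(names, values):
    """Group values by name in first-seen order and average each group.

    Hypothetical helper for illustration; assumes len(names) == len(values).
    """
    grouped = {}
    for name, value in zip(names, values):
        grouped.setdefault(name, []).append(value)
    unique_names = list(grouped)
    averages = [mean(group) for group in grouped.values()]
    return unique_names, averages


l1 = ['John', 'Bob', 'Stew', 'John']
l2 = [1, 4, 7, 3]
names, averages = average_duplicates(l1, l2)
print(names)
print(averages)
```

Returning two lists rather than tuples keeps the result in the same shape as the `l1` and `l2` the question asks for.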

I think you should use a dict :)

Here is a short version using a dict:

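l1 = ['John', 'Bob', 'Stew', 'John']
l2 = [1, 4, 7, 3]

# Keep a running [sum, count] per name so that names appearing three or
# more times still get a true mean rather than an average of averages.
totals = {}
for name, value in zip(l1, l2):
    entry = totals.setdefault(name, [0, 0])
    entry[0] += value
    entry[1] += 1

final_dict = {name: total / count for name, (total, count) in totals.items()}

print(final_dict)
# {'John': 2.0, 'Bob': 4.0, 'Stew': 7.0}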
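
A caveat worth checking for any dict version that folds each new value into the stored one with `(old + new) / 2`: that only equals the mean when a name appears at most twice. The sketch below uses made-up values and hypothetical helper names to show the difference.

```python
def pairwise_fold(values):
    """Fold values with (old + new) / 2, treating the result as a running 'average'."""
    result = values[0]
    for value in values[1:]:
        result = (result + value) / 2
    return result


def true_mean(values):
    """Arithmetic mean: total divided by count."""
    return sum(values) / len(values)


sample = [1, 3, 8]  # hypothetical values for a name seen three times
print(pairwise_fold(sample))  # 5.0
print(true_mean(sample))      # 4.0
```

Later values are weighted more heavily by the pairwise fold, which is why keeping a sum and a count (or the full list of values) per name is the safer choice.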

Something like this:

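#!/usr/bin/env python3
l1 = ['John', 'Bob', 'Stew', 'John']
l2 = [1, 4, 7, 3]
d = {}
for i in range(len(l1)):
    key = l1[i]
    if key in d:  # dict.has_key() was removed in Python 3
        d[key].append(l2[i])
    else:
        d[key] = [l2[i]]
r = []
# Iterate over items() so each average is paired with its own name
for key, values in d.items():
    r.append((key, sum(values) / len(values)))
print(r)
# [('John', 2.0), ('Bob', 4.0), ('Stew', 7.0)]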
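
The check-then-append pattern above is what `collections.defaultdict` is designed to replace. A minimal Python 3 sketch of the same idea, offered as an alternative rather than as the answerer's code:

```python
from collections import defaultdict

l1 = ['John', 'Bob', 'Stew', 'John']
l2 = [1, 4, 7, 3]

# defaultdict(list) creates an empty list the first time a key is accessed,
# so no explicit "is this key already present?" branch is needed.
groups = defaultdict(list)
for name, value in zip(l1, l2):
    groups[name].append(value)

averages = [(name, sum(values) / len(values)) for name, values in groups.items()]
print(averages)
# [('John', 2.0), ('Bob', 4.0), ('Stew', 7.0)]
```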

Hope the code below helps:

l1 = ['John', 'Bob', 'Stew', 'John']
l2 = [1, 4, 7, 3]

def remove_repeating_names(names_list, numbers_list):
    new_names_list = []
    new_numbers_list = []
    for first_index, first_name in enumerate(names_list):
        amount_of_occurencies = 1
        number = numbers_list[first_index]
        for second_index, second_name in enumerate(names_list):
            # Check if names match and
            # if this name wasn't read in earlier cycles or is not same element.
            if (second_name == first_name):
                if (first_index < second_index):
                    number += numbers_list[second_index]
                    amount_of_occurencies += 1
                # Stop scanning if this name was already handled earlier.
                elif (first_index > second_index):
                    amount_of_occurencies = -1
                    break
        # Compare by value; "is not -1" tests object identity, not equality.
        if amount_of_occurencies != -1:
            new_names_list.append(first_name)
            new_numbers_list.append(number/amount_of_occurencies)
    return [new_names_list, new_numbers_list]

# Unmodified arrays
print(l1)
print(l2)

l1, l2 = remove_repeating_names(l1, l2)

# If you want numbers list to be integer, not float, uncomment following line:
# l2 = [int(number) for number in l2]

# Modified arrays
print(l1)
print(l2)

Isn't Stew associated with 7?

Maybe what you need is a dict. Have you tried that?

@schwobasegl Sorry, fixed :)

@bla Yes, I tried a dict, but the problem is that because keys must be unique, I never get the chance to average the associated values in l2: the duplicate is automatically rejected.

You could try building a list of the values associated with 'John' and then take the average as needed. Have a look at this answer on adding multiple values to the same key in a dict.

AttributeError: 'dict' object has no attribute 'iteritems'?

@MythicCocoa You are probably running Python 3.x. The Python 3 equivalent is .items().

Good to know :). Elegant solution, works perfectly, thanks!

Hi, I'm curious: would taking the median instead of the mean make the code more complicated?

Why would it? It would be a little bit more complicated, though.

What have you tried?
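
On the median question: as long as the values are first collected into a list per name, only the aggregating function needs to change. A sketch assuming Python 3 and the standard-library `statistics.median`:

```python
from statistics import median

l1 = ['John', 'Bob', 'Stew', 'John']
l2 = [1, 4, 7, 3]

# Collect every value per name first; a median needs the whole group,
# so a running sum and count would not be enough here.
grouped = {}
for name, value in zip(l1, l2):
    grouped.setdefault(name, []).append(value)

medians = {name: median(values) for name, values in grouped.items()}
print(medians)
```

With only two values for 'John', the median coincides with the mean; the two differ once a name has three or more values.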