Python: handling duplicates in a list of dictionaries
I loaded a CSV file with DictReader, so I basically have a list of dictionaries. For example, I have one called reader with the following contents:
[{'name': 'Jack', 'hits': 7, 'misses': 12, 'year': 10},
 {'name': 'Lisa', 'hits': 5, 'misses': 3, 'year': 8},
 {'name': 'Jack', 'hits': 5, 'misses': 7, 'year': 9}]
I use a loop to build the lists like this:
name = []
hits = []
for row in reader:
    name.append(row["name"])
    hits.append(row["hits"])
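However, I don't want duplicates in my lists: if a name appears more than once, I am only interested in the entry with the highest year. So basically I want to end up with the following:
name = ['Jack', 'Lisa']
hits = [7, 5]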
What is the best way to do this?

Try:
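# Sort newest first so the first row seen for each name has the highest year.
# int() matters because DictReader yields strings, and '9' > '10' as text.
reader = sorted(reader, key=lambda i: int(i['year']), reverse=True)
name = []
hits = []
for row in reader:
    if row['name'] in name:
        continue  # a newer row for this name was already added
    name.append(row["name"])
    hits.append(row["hits"])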
The idea is to sort the list of dicts by year in descending order, then iterate over it and keep only the first row seen for each name.
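A minimal sketch of the sort-then-skip idea described above, under assumptions that are not in the original answer: the rows come from csv.DictReader (so every value is a string and year is converted with int() before sorting), the inline CSV_TEXT stands in for the real file, and a set named seen (a hypothetical helper) tracks names so the membership test does not rescan the list.

```python
import csv
import io

# Inline sample standing in for the real CSV file (assumed column names).
CSV_TEXT = """name,hits,misses,year
Jack,7,12,10
Lisa,5,3,8
Jack,5,7,9
"""


def latest_hits(rows):
    """Return (names, hits), keeping the row with the highest year per name."""
    # DictReader yields strings, so compare years numerically.
    ordered = sorted(rows, key=lambda r: int(r["year"]), reverse=True)
    names, hits, seen = [], [], set()
    for row in ordered:
        if row["name"] in seen:
            continue  # a newer row for this name was already kept
        seen.add(row["name"])
        names.append(row["name"])
        hits.append(int(row["hits"]))
    return names, hits


name, hits = latest_hits(csv.DictReader(io.StringIO(CSV_TEXT)))
print(name, hits)  # ['Jack', 'Lisa'] [7, 5]
```

Because the output follows the sorted order, names come out newest year first rather than in file order; that happens to match the desired result for this sample.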
Try, with pandas:
import pandas as pd
data = [{'name': 'Jack', 'hits': 7, 'misses': 12, 'year': 10},
        {'name': 'Lisa', 'hits': 5, 'misses': 3, 'year': 8},
        {'name': 'Jack', 'hits': 5, 'misses': 7, 'year': 9}]
# Sort so each name's highest year comes first, then keep the first row per name.
df = pd.DataFrame(data).sort_values(by=['name', 'year'], ascending=False).groupby('name').first()
dict(zip(df.index, df['hits']))
In pure Python, with no libraries:
people = {}  # maps name -> most recent record for that name
for record in csv_reader:
    # do we have someone with that name already?
    old_record = people.get(record['name'], {})
    # what's their year (defaulting to -1); int() because DictReader yields strings
    old_year = int(old_record.get('year', -1))
    # if this record is more up to date
    if int(record['year']) > old_year:
        # replace the old record
        people[record['name']] = record
# -- then, you can pull out your name and year lists
name = list(people.keys())
year = [r['year'] for r in people.values()]
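The snippet above ends with name and year lists, while the question asks for name and hits. A minimal sketch of the same single-pass dictionary idea that produces the hits list follows; the function name latest_by_name and the inline records list are illustrative assumptions, and int() is applied to year so the comparison also holds for string values read by DictReader.

```python
def latest_by_name(records):
    """Map each name to its record with the highest year, in one pass."""
    people = {}
    for record in records:
        current = people.get(record["name"])
        # Keep the record if the name is new or this year is strictly higher.
        if current is None or int(record["year"]) > int(current["year"]):
            people[record["name"]] = record
    return people


# Sample rows from the question (assumed here to be already parsed to ints).
records = [
    {"name": "Jack", "hits": 7, "misses": 12, "year": 10},
    {"name": "Lisa", "hits": 5, "misses": 3, "year": 8},
    {"name": "Jack", "hits": 5, "misses": 7, "year": 9},
]

people = latest_by_name(records)
name = list(people)                          # dicts keep insertion order
hits = [r["hits"] for r in people.values()]
print(name, hits)  # ['Jack', 'Lisa'] [7, 5]
```

No sorting is needed with this approach, and names keep the order in which they first appear in the input.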
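If you want to learn pandas (note that max() is computed per column within each group, so the values in one result row may come from different source rows):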
import pandas as pd
df = pd.read_csv('yourdata.csv')
df.groupby(['name']).max()
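A solution without pandas:
lst = [
    {'name': 'Jack', 'hits': 7, 'misses': 12, 'year': 10},
    {'name': 'Lisa', 'hits': 5, 'misses': 3, 'year': 8},
    {'name': 'Jack', 'hits': 5, 'misses': 7, 'year': 9},
]
out = {}
for d in lst:
    # group the rows by name
    out.setdefault(d['name'], []).append(d)
name = [*out]
hits = [max(i['hits'] for i in v) for v in out.values()]
print(name)
print(hits)

Prints:
['Jack', 'Lisa']
[7, 5]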
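The grouping answer above takes the largest hits value per name, which matches the question's sample only because Jack's best score also falls in his latest year. A hedged sketch of the variant that selects the row with the highest year and then reads its hits is shown below; the sample data is deliberately altered (an assumption, not the original data) so that the two rules give different results.

```python
from collections import defaultdict

# Altered sample: Jack's highest hits (7) is NOT in his latest year (10).
lst = [
    {"name": "Jack", "hits": 5, "misses": 12, "year": 10},
    {"name": "Lisa", "hits": 5, "misses": 3, "year": 8},
    {"name": "Jack", "hits": 7, "misses": 7, "year": 9},
]

# Group the rows by name, preserving first-seen order of the names.
groups = defaultdict(list)
for d in lst:
    groups[d["name"]].append(d)

name = list(groups)
# Largest hits per name, regardless of year.
hits_max = [max(r["hits"] for r in rows) for rows in groups.values()]
# Hits taken from the row with the highest year.
hits_latest = [max(rows, key=lambda r: r["year"])["hits"] for rows in groups.values()]

print(name)         # ['Jack', 'Lisa']
print(hits_max)     # [7, 5]
print(hits_latest)  # [5, 5]
```

Which rule is right depends on the requirement; the question asks for the entry with the highest year, which corresponds to hits_latest here.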
This is close, but the hits list contains all the values, so the values for duplicate names are not removed. Sorry, this does work.