Merging duplicate dictionaries in a Python list of dictionaries to remove duplicates
I want to merge the duplicate dictionaries in my list so that each name appears only once.
Dict:
name             responseTime        dateCreated
Sample Endpoint  0.551956, 0.535189  11/06/19 13:44, 11/06/19 08:04
Stack Overflow   0.849753, 0.928371  11/06/19 13:44, 11/06/19 08:04
healthcheck      0.600845, 0.369526  11/06/19 13:44, 11/06/19 08:04
Expected dict:
name             responseTime        dateCreated
Sample Endpoint  0.551956, 0.535189  11/06/19 13:44, 11/06/19 08:04
Stack Overflow   0.849753, 0.928371  11/06/19 13:44, 11/06/19 08:04
healthcheck      0.600845, 0.369526  11/06/19 13:44, 11/06/19 08:04
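Assuming your data is a list of Python dictionaries (it is not clear to me what it actually is), here is a snippet that builds the required dictionaries: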
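# Sample input, reconstructed from the output shown below.
l = [
    {"name": "healthcheck", "responseTime": 0.600845, "dateCreated": "11/06/19 13:44"},
    {"name": "Stack Overflow", "responseTime": 0.849753, "dateCreated": "11/06/19 13:44"},
    {"name": "Sample Endpoint", "responseTime": 0.559156, "dateCreated": "11/06/19 13:44"},
    {"name": "healthcheck", "responseTime": 0.369526, "dateCreated": "11/06/19 08:04"},
    {"name": "Stack Overflow", "responseTime": 0.928371, "dateCreated": "11/06/19 08:04"},
    {"name": "Sample Endpoint", "responseTime": 0.535189, "dateCreated": "11/06/19 08:04"},
]
result = {}
for row in l:
    if row["name"] in result:
        # Name already seen: append to the lists created earlier.
        result[row["name"]]["dateCreated"].append(row["dateCreated"])
        result[row["name"]]["responseTime"].append(row["responseTime"])
    else:
        result[row["name"]] = {
            "name": row["name"],
            "dateCreated": [row["dateCreated"]],
            "responseTime": [row["responseTime"]]
        }
print(list(result.values()))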
Output:
[{'name': 'healthcheck', 'dateCreated': ['11/06/19 13:44', '11/06/19 08:04'], 'responseTime': [0.600845, 0.369526]}, {'name': 'Stack Overflow', 'dateCreated': ['11/06/19 13:44', '11/06/19 08:04'], 'responseTime': [0.849753, 0.928371]}, {'name': 'Sample Endpoint', 'dateCreated': ['11/06/19 13:44', '11/06/19 08:04'], 'responseTime': [0.559156, 0.535189]}]
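For comparison, a more general variant of the loop above could avoid hard-coding the two value fields. The following is only a sketch under some assumptions: the helper name `merge_by_key` and the variable `rows` are hypothetical (they do not appear in the question), and the sample data is the same one used in this answer. Every field other than the grouping key is collected into a list.

```python
def merge_by_key(rows, key="name"):
    """Merge dicts that share the same value for `key`.

    Every other field is collected into a list, in input order.
    (Hypothetical helper, not part of the original answer.)
    """
    merged = {}
    for row in rows:
        # Create the group on first sight of this key value, else reuse it.
        group = merged.setdefault(row[key], {key: row[key]})
        for field, value in row.items():
            if field != key:
                group.setdefault(field, []).append(value)
    return list(merged.values())


# Assumed sample input, matching the data used in this answer.
rows = [
    {"name": "healthcheck", "responseTime": 0.600845, "dateCreated": "11/06/19 13:44"},
    {"name": "Stack Overflow", "responseTime": 0.849753, "dateCreated": "11/06/19 13:44"},
    {"name": "Sample Endpoint", "responseTime": 0.559156, "dateCreated": "11/06/19 13:44"},
    {"name": "healthcheck", "responseTime": 0.369526, "dateCreated": "11/06/19 08:04"},
    {"name": "Stack Overflow", "responseTime": 0.928371, "dateCreated": "11/06/19 08:04"},
    {"name": "Sample Endpoint", "responseTime": 0.535189, "dateCreated": "11/06/19 08:04"},
]

merged_rows = merge_by_key(rows)
for item in merged_rows:
    print(item)
```

Because plain dicts preserve insertion order in Python 3.7+, the merged entries come out in the order in which each name first appears. A row that lacks one of the fields simply contributes nothing to that field's list, which may or may not be what you want.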
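Of course, this could be done in a more general way, but I tried to keep the solution simple.
You can use a pandas DataFrame instead of a dictionary to represent the data. I edited your dictionary into the correct format, since it looks like you have a list. I also converted your responseTime values from numbers to strings so that they can be joined with commas by the join method. I used the groupby() method to group duplicate keys into a single record, and the agg() method to aggregate/join the values: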
import pandas as pd
myDict = {
    "name": ["healthcheck", "Stack Overflow", "Sample Endpoint", "healthcheck", "Stack Overflow", "Sample Endpoint"],
    "responseTime": ["0.600845", "0.849753", "0.559156", "0.369526", "0.928371", "0.535189"],
    "dateCreated": ["11/06/19 13:44", "11/06/19 13:44", "11/06/19 13:44", "11/06/19 08:04", "11/06/19 08:04", "11/06/19 08:04"],
}
df = pd.DataFrame(myDict)
# Group rows by name and join each group's values with ", ".
print(df.groupby("name").agg({'responseTime': ', '.join, 'dateCreated': ', '.join}))
Output:
name             responseTime        dateCreated
Sample Endpoint  0.559156, 0.535189  11/06/19 13:44, 11/06/19 08:04
Stack Overflow   0.849753, 0.928371  11/06/19 13:44, 11/06/19 08:04
healthcheck      0.600845, 0.369526  11/06/19 13:44, 11/06/19 08:04
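If you would rather keep the response times as numbers instead of converting them to strings, one possible variation is to aggregate each group into a list and then convert the result back into a list of dictionaries. This is a sketch, not part of the original answer: the variable names `rows`, `merged`, and `records` are assumptions, and it relies on pandas being installed.

```python
import pandas as pd

# Assumed input: one dict per measurement, with responseTime kept as a float.
rows = [
    {"name": "healthcheck", "responseTime": 0.600845, "dateCreated": "11/06/19 13:44"},
    {"name": "Stack Overflow", "responseTime": 0.849753, "dateCreated": "11/06/19 13:44"},
    {"name": "Sample Endpoint", "responseTime": 0.559156, "dateCreated": "11/06/19 13:44"},
    {"name": "healthcheck", "responseTime": 0.369526, "dateCreated": "11/06/19 08:04"},
    {"name": "Stack Overflow", "responseTime": 0.928371, "dateCreated": "11/06/19 08:04"},
    {"name": "Sample Endpoint", "responseTime": 0.535189, "dateCreated": "11/06/19 08:04"},
]

df = pd.DataFrame(rows)

# sort=False keeps the groups in order of first appearance;
# aggregating with `list` collects each group's values without joining them.
merged = (
    df.groupby("name", sort=False)
      .agg({"responseTime": list, "dateCreated": list})
      .reset_index()
)

# Convert back to a list of plain dictionaries, one per name.
records = merged.to_dict("records")
print(records)
```

Compared with the `', '.join` version, this keeps the values usable for later calculations (for example, averaging the response times), at the cost of a less compact printed table.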
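Is this JSON that you want to combine?
No, it's a dictionary; key-value pairs can be confusing sometimes.
I don't think recommending pandas to beginners for simple tasks is a good idea, but you could still mention pandas in your answer.
Let me try using pandas to get the data.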