Python: concatenating string values across multiple dicts
Suppose I have a list of dicts (every dict has the same keys), like this:
list_of_dicts = [
{'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
{'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
{'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]
I only need to concatenate the Body, Title and Comments values and return a single dict, like this:
{'Id': 4726, 'Body': 'Hello from John Hello from Mary Hello from Dylan', 'Title': None, 'Comments': 'Dallas. Austin Boston'}
Note that Title is None, so we have to be careful. This is what I have so far, but it goes wrong somewhere and I cannot see where:
keys = set().union(*list_of_dicts)
print(keys)
k_value = list_of_dicts[0]['Id']
d_dict = {k: " ".join(str(dic.get(k, '')) for dic in list_of_dicts) for k in keys if k != 'Id'}
merged_dict = {'Id': k_value}
merged_dict.update(d_dict)
However, the code above returns the following, which is not what I want:
Final Merged Dict: {'Id': 4726, 'Body': 'Hello from John Hello from Mary Hello from Dylan', 'Title': 'None None None', 'Comments': 'Dallas. Austin Boston'}
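First, I would remove Id from keys so that it does not have to be skipped inside the dict comprehension, and use a plain assignment at the end instead of .update().
In the argument to join, filter out the entries where dic[k] is None. If the join result is an empty string (because every value was None), convert it to None in the final result.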
keys = set().union(*list_of_dicts)
keys.remove('Id')
print(keys)
k_value = list_of_dicts[0]['Id']
d_dict = {k: (" ".join(str(dic[k]) for dic in list_of_dicts if k in dic and dic[k] is not None) or None) for k in keys}
d_dict['Id'] = k_value
print(d_dict)
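The same idea can be wrapped in a small helper. This is only a sketch: the function name merge_dicts and its id_key and sep parameters are made up for illustration, and unlike the answer above it also strips the whitespace around each value so that 'Dallas. ' does not produce a double space.

```python
def merge_dicts(list_of_dicts, id_key='Id', sep=' '):
    """Join the values of every non-id key, skipping None values."""
    # Collect keys in first-seen order so the output order is predictable.
    keys = []
    for dic in list_of_dicts:
        for k in dic:
            if k != id_key and k not in keys:
                keys.append(k)

    # Assumption: every dict carries the same id, so the first one is used.
    merged = {id_key: list_of_dicts[0][id_key]}
    for k in keys:
        parts = [str(dic[k]).strip() for dic in list_of_dicts
                 if dic.get(k) is not None]
        # An empty join means every value was None, so report None.
        merged[k] = sep.join(parts) or None
    return merged


list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]
merged = merge_dicts(list_of_dicts)
print(merged)
```

Because None values are filtered out before str() is applied, the literal text 'None' never ends up in the joined string.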
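While parsing the list of dicts, you can store intermediate results in defaultdict objects that hold lists of string values. Once all the dicts have been parsed, you can join the strings together.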
from collections import defaultdict
dd_body = defaultdict(list)
dd_comments = defaultdict(list)
dd_titles = defaultdict(list)
for row in list_of_dicts:
    dd_body[row['Id']].append(row['Body'])
    dd_comments[row['Id']].append(row['Comments'])
    dd_titles[row['Id']].append(row['Title'] or '')  # Effectively removes `None`.
result = []
for id_ in dd_body:  # All three dictionaries have the same keys.
    body = ' '.join(dd_body[id_]).strip()
    comments = ' '.join(dd_comments[id_]).strip()
    titles = ' '.join(dd_titles[id_]).strip() or None
    result.append({'Id': id_, 'Body': body, 'Title': titles, 'Comments': comments})
>>> result
[{'Id': 4726,
'Body': 'Hello from John Hello from Mary Hello from Dylan',
'Title': None,
'Comments': 'Dallas. Austin Boston'}]
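If more fields need merging, a single nested defaultdict can replace the one-dictionary-per-field layout. The following is a sketch under some assumptions: the FIELDS tuple is a hypothetical name for the keys to concatenate, the row with Id 17 is made-up sample data added only to show grouping by Id, and each value is stripped before joining.

```python
from collections import defaultdict

FIELDS = ('Body', 'Title', 'Comments')  # hypothetical: the keys to concatenate

rows = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': 'Austin'},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': 'Boston'},
    # Made-up extra row, only to show that a second Id forms its own group.
    {'Id': 17, 'Body': 'Hi', 'Title': 'Greeting', 'Comments': None},
]

# Maps each Id to {field: [non-None strings in input order]}.
grouped = defaultdict(lambda: defaultdict(list))
for row in rows:
    bucket = grouped[row['Id']]  # touch the Id even if every field is None
    for field in FIELDS:
        value = row.get(field)
        if value is not None:
            bucket[field].append(value.strip())

# An empty list joins to '', which `or None` turns back into None.
result = [
    {'Id': id_, **{f: ' '.join(parts[f]) or None for f in FIELDS}}
    for id_, parts in grouped.items()
]
print(result)
```

Filtering None while collecting keeps the join step free of special cases for any field, not only Title.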
This is less Pythonic than the other answers, but I think it is easy to understand:
body, title, comments = "", "", ""
list_of_dicts=[
{'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
{'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
{'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]
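id_ = list_of_dicts[0]['Id']  # assumes every row has the same Id
for d in list_of_dicts:  # `d` and `id_` avoid shadowing the builtins `dict` and `id`
    # Note: values are appended without a separator; add ' ' if spaces are needed.
    if d['Body'] is not None:
        body = body + d['Body']
    if d['Title'] is not None:
        title = title + d['Title']
    if d['Comments'] is not None:
        comments = comments + d['Comments']
# An empty string means every value for that field was None.
if title == "":
    title = None
if body == "":
    body = None
if comments == "":
    comments = None
record = {'Id': id_, 'Body': body, 'Title': title, 'Comments': comments}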
If Title is the only field that can be None, you can shorten this by dropping the checks on the other fields:
body, title, comments = "", "", ""
list_of_dicts=[
{'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
{'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
{'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"}]
id_ = list_of_dicts[0]['Id']  # assumes every row has the same Id
for d in list_of_dicts:  # `d` and `id_` avoid shadowing the builtins `dict` and `id`
    body = body + d['Body']
    comments = comments + d['Comments']
    if d['Title'] is not None:
        title = title + d['Title']
# An empty string means every Title was None.
if title == "":
    title = None
record = {'Id': id_, 'Body': body, 'Title': title, 'Comments': comments}
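The loops above append each value directly to the running string, so no separator ends up between the parts. One possible way to get spaces is to collect the pieces in lists and join them at the end. A minimal sketch, assuming a single space is the wanted separator and that surrounding whitespace can be stripped (the name parts is illustrative):

```python
list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]

# One list of collected strings per field to merge.
parts = {'Body': [], 'Title': [], 'Comments': []}
for d in list_of_dicts:
    for field, collected in parts.items():
        if d[field] is not None:
            collected.append(d[field].strip())

record = {'Id': list_of_dicts[0]['Id']}
for field, collected in parts.items():
    # A field that never had a value stays None instead of becoming ''.
    record[field] = ' '.join(collected) if collected else None
print(record)
```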
For this kind of data manipulation, pandas is your friend.
import pandas as pd
# Your list of dictionaries.
list_of_dicts = [
{'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
{'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
{'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]
# Can be read into a pandas dataframe
df = pd.DataFrame(list_of_dicts)
# Do a database style groupby() and apply the function that you want to each group
group_transformed_df = df.groupby('Id').agg(lambda x: ' '.join(x)).reset_index() # I do reset_index to get a normal DataFrame back.
# DataFrame() -> dict
output_dict = group_transformed_df.to_dict('records')
You can get several kinds of dict out of a DataFrame; here you want the 'records' option.
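The Title column holds None, and ' '.join cannot join None values, so depending on the pandas version the aggregation above may drop that column or raise an error. One way around this is to iterate over the groups and filter non-string entries by hand. A hedged sketch, not the answer's own method; the helper name join_non_null is made up for illustration:

```python
import pandas as pd

list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]
df = pd.DataFrame(list_of_dicts)


def join_non_null(values):
    """Join the string entries of one column, ignoring None/NaN."""
    parts = [v.strip() for v in values if isinstance(v, str)]
    return ' '.join(parts) or None


records = []
for id_, group in df.groupby('Id'):
    record = {'Id': int(id_)}  # convert the NumPy integer to a plain int
    for col in ('Body', 'Title', 'Comments'):
        record[col] = join_non_null(group[col])
    records.append(record)
print(records)
```

Building the records in plain Python keeps None as None in the output, rather than relying on how pandas represents missing values after aggregation.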
Does Title always have the value None, or do you need to be able to ignore this value in any element?
Title is sometimes None and sometimes a legitimate string. I should not ignore Title's value. I tried filter(None, ...).
I had fixed it in my demo but not in the answer: .remove() modifies the set in place, it does not return the set.
I just noticed the OP's None-typed column, which pandas.groupby ignores. The solution above does not handle that.