Python: concatenating string values from multiple dicts

Suppose I have a list of dicts (every dict has the same keys), like this:

list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]
I only need to concatenate the Body, Title and Comments values and return a single dict, like this:

{'Id': 4726, 'Body': 'Hello from John Hello from Mary Hello from Dylan', 'Title': None, 'Comments': 'Dallas. Austin Boston'}
Note that Title is None, so we have to be careful. This is what I have so far, but it fails somewhere and I can't see where:

    keys = set().union(*list_of_dicts)
    print(keys)
    k_value = list_of_dicts[0]['Id']
    d_dict = {k: " ".join(str(dic.get(k, '')) for dic in list_of_dicts) for k in keys if k != 'Id'}

    merged_dict = {'Id': k_value}
    merged_dict.update(d_dict)
However, the above returns the following, which is not what I want:

Final Merged Dict: {'Id': 4726, 'Body': 'Hello from John Hello from Mary Hello from Dylan', 'Title': 'None None None', 'Comments': 'Dallas. Austin Boston'}
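
For reference, the 'None None None' in that output comes from str(None), which produces the text 'None' rather than an empty value. A minimal sketch of the effect, using made-up variable names (`titles`, `joined`, `filtered`):

```python
# Hypothetical values standing in for the 'Title' entries of the three dicts.
titles = [None, None, None]

# str() turns each None into the literal text 'None' before joining.
joined = " ".join(str(t) for t in titles)

# Dropping None first leaves nothing to join; `or None` maps '' back to None.
filtered = " ".join(str(t) for t in titles if t is not None) or None

print(repr(joined))
print(repr(filtered))
```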

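First, I would remove Id from keys so that the dict comprehension does not have to skip it, and use a simple assignment at the end instead of .update().

In the argument to join, filter out the entries where dic[k] is None. If join produces an empty string (because all the values were None), convert it to None in the final result.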

keys = set().union(*list_of_dicts)
keys.remove('Id')
print(keys)
k_value = list_of_dicts[0]['Id']
d_dict = {k: (" ".join(str(dic[k]) for dic in list_of_dicts if k in dic and dic[k] is not None) or None) for k in keys}
d_dict['Id'] = k_value

print(d_dict)
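
The same idea can be wrapped in a small helper. This is only a sketch: the name `merge_dicts` and its `id_key` and `sep` parameters are made up for illustration, and it assumes every dict carries the same Id. It walks the keys of the first dict rather than a set, so the result keeps the original key order.

```python
def merge_dicts(list_of_dicts, id_key="Id", sep=" "):
    """Join the non-None values of each key; use None if nothing is left."""
    merged = {id_key: list_of_dicts[0][id_key]}
    for key in list_of_dicts[0]:
        if key == id_key:
            continue
        # Skip missing keys and None values so they never become 'None' text.
        parts = [str(d[key]) for d in list_of_dicts if d.get(key) is not None]
        merged[key] = sep.join(parts) or None
    return merged


list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]

merged = merge_dicts(list_of_dicts)
print(merged)
```

Because 'Dallas. ' carries a trailing space, the joined Comments value has two spaces after 'Dallas.'; strip each value before joining if that matters.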

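While iterating over the list of dicts, you can store the intermediate results in defaultdict objects that hold lists of string values. Once all the dicts have been processed, you can join the strings together.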

from collections import defaultdict

dd_body = defaultdict(list)
dd_comments = defaultdict(list)
dd_titles = defaultdict(list)

for row in list_of_dicts:
    dd_body[row['Id']].append(row['Body'])
    dd_comments[row['Id']].append(row['Comments'])
    dd_titles[row['Id']].append(row['Title'] or '')  # Effectively removes `None`.

result = []
for id_ in dd_body:  # All three dictionaries have the same keys.
    body = ' '.join(dd_body[id_]).strip()
    comments = ' '.join(dd_comments[id_]).strip()
    titles = ' '.join(dd_titles[id_]).strip() or None
    result.append({'Id': id_, 'Body': body, 'Title': titles, 'Comments': comments})
>>> result
[{'Id': 4726,
  'Body': 'Hello from John Hello from Mary Hello from Dylan',
  'Title': None,
  'Comments': 'Dallas.  Austin Boston'}]
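
Grouping by Id pays off when the list holds more than one Id. As a rough sketch, the three defaultdicts can be folded into one nested structure; the second Id (1001), its values, and the `fields` tuple below are invented for illustration.

```python
from collections import defaultdict

rows = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas.'},
    {'Id': 1001, 'Body': 'Hi from Ann', 'Title': 'Greeting', 'Comments': 'Miami'},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': 'Austin'},
]
fields = ('Body', 'Title', 'Comments')

# grouped[id][field] -> list of the non-None values seen for that field.
grouped = defaultdict(lambda: defaultdict(list))
for row in rows:
    group = grouped[row['Id']]  # creates the group even if every field is None
    for field in fields:
        if row[field] is not None:
            group[field].append(row[field])

# An empty list joins to '', which `or None` turns back into None.
result = [
    {'Id': id_, **{f: ' '.join(values[f]) or None for f in fields}}
    for id_, values in grouped.items()
]
print(result)
```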

This is less Pythonic than the other answers, but I think it is easy to understand:

body, title, comments = "", "", ""
list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]

record_id = list_of_dicts[0]['Id']  # avoid shadowing the built-in id()

for d in list_of_dicts:  # `d` rather than `dict`, which is a built-in
    # Put a space between the pieces; strip() drops stray leading/trailing spaces.
    if d['Body'] is not None:
        body = (body + " " + d['Body']).strip()

    if d['Title'] is not None:
        title = (title + " " + d['Title']).strip()

    if d['Comments'] is not None:
        comments = (comments + " " + d['Comments']).strip()

# An empty string means every value for that field was None.
if title == "":
    title = None

if body == "":
    body = None

if comments == "":
    comments = None

record = {'Id': record_id, 'Body': body, 'Title': title, 'Comments': comments}
If only the Title field can be None, you can shorten this by dropping the checks on the other fields:

body, title, comments = "", "", ""
list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]

record_id = list_of_dicts[0]['Id']  # avoid shadowing the built-in id()

for d in list_of_dicts:  # `d` rather than `dict`, which is a built-in
    # Put a space between the pieces; strip() drops stray leading/trailing spaces.
    body = (body + " " + d['Body']).strip()
    comments = (comments + " " + d['Comments']).strip()

    if d['Title'] is not None:
        title = (title + " " + d['Title']).strip()

# An empty string means every Title was None.
if title == "":
    title = None

record = {'Id': record_id, 'Body': body, 'Title': title, 'Comments': comments}

For this kind of data manipulation, pandas is your friend.

import pandas as pd

# Your list of dictionaries.
list_of_dicts = [
{'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
{'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
{'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]

# Can be read into a pandas dataframe
df = pd.DataFrame(list_of_dicts)

# Do a database style groupby() and apply the function that you want to each group
group_transformed_df = df.groupby('Id').agg(lambda x: ' '.join(x)).reset_index() # I do reset_index to get a normal DataFrame back.

# DataFrame() -> dict
output_dict = group_transformed_df.to_dict('records')

You can get several kinds of dict out of a DataFrame. You want the 'records' option.

Does Title always have the value None, or do you need to be able to ignore this value in any element?
Title is sometimes None and sometimes a legitimate string. I shouldn't ignore the value of Title. I tried filter(None…).
I had fixed it in my demo but not in the answer: .remove() modifies the set in place, it does not return the set.
Just saw the OP's None-type column, which pandas.groupby ignores. The solution above does not solve that.