Python: concatenating string values across multiple dicts

Suppose I have a list of dicts (every dict has the same keys), like this:

list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]

I only need to concatenate the Body, Title and Comments values and return a single dict, like this:

{'Id': 4726, 'Body': 'Hello from John Hello from Mary Hello from Dylan', 'Title': None, 'Comments': 'Dallas. Austin Boston'}

Note that Title is None, so we have to be careful. This is what I have so far, but it fails somewhere and I cannot see where:

    keys = set().union(*list_of_dicts)
    print(keys)
    k_value = list_of_dicts[0]['Id']
    d_dict = {k: " ".join(str(dic.get(k, '')) for dic in list_of_dicts) for k in keys if k != 'Id'}

    merged_dict = {'Id': k_value}
    merged_dict.update(d_dict)

However, the code above returns the following, which is not what I want:

Final Merged Dict: {'Id': 4726, 'Body': 'Hello from John Hello from Mary Hello from Dylan', 'Title': 'None None None', 'Comments': 'Dallas. Austin Boston'}
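
The 'None None None' comes from str(): str(None) is the literal text 'None', and dic.get(k, '') only falls back to '' when the key is missing, not when the key is present with a None value. A minimal illustration of the difference (the rows, naive and filtered names are made up for this sketch):

```python
rows = [{'Title': None}, {'Title': None}, {'Title': None}]

# str(None) is the text 'None', so join concatenates it like any other string.
naive = " ".join(str(row.get('Title', '')) for row in rows)

# Skipping None before joining leaves nothing to join; '' or None gives None.
filtered = " ".join(row['Title'] for row in rows if row['Title'] is not None) or None

print(repr(naive))     # 'None None None'
print(repr(filtered))  # None
```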

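First, I would remove Id from keys so that the dict comprehension does not have to skip it, and set it at the end with a plain assignment instead of .update().

In the argument to join, filter out the entries where dic[k] is None. If join returns an empty string (because every value was None), the trailing "or None" converts it to None in the final result.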

keys = set().union(*list_of_dicts)
keys.remove('Id')
print(keys)
k_value = list_of_dicts[0]['Id']
d_dict = {k: (" ".join(str(dic[k]) for dic in list_of_dicts if k in dic and dic[k] is not None) or None) for k in keys}
d_dict['Id'] = k_value

print(d_dict)
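
One side effect of building keys with set() is that the key order of the result is not guaranteed. If that matters, the same idea could be wrapped in a small helper along these lines. This is only a sketch: merge_records and its id_key parameter are hypothetical names, and stripping whitespace from each value is an assumption made so that 'Dallas. ' does not produce a double space.

```python
def merge_records(records, id_key='Id'):
    """Join the non-None values of each key with spaces; id_key is copied as-is."""
    # dict.fromkeys keeps first-seen key order, which a set does not guarantee.
    keys = dict.fromkeys(k for rec in records for k in rec if k != id_key)
    merged = {id_key: records[0][id_key]}
    for k in keys:
        parts = []
        for rec in records:
            value = rec.get(k)
            if value is None:
                continue
            text = str(value).strip()  # assumption: stray whitespace is unwanted
            if text:
                parts.append(text)
        # An empty join result means every value was None (or blank).
        merged[k] = " ".join(parts) or None
    return merged


list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]
merged = merge_records(list_of_dicts)
print(merged)
```

Like the code above, it takes the Id from the first dict and assumes all dicts share it.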

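While iterating over the list of dicts, you can store intermediate results in defaultdict objects that hold lists of string values. Once all the dicts have been processed, join the strings together.

from collections import defaultdict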

dd_body = defaultdict(list)
dd_comments = defaultdict(list)
dd_titles = defaultdict(list)

for row in list_of_dicts:
    dd_body[row['Id']].append(row['Body'])
    dd_comments[row['Id']].append(row['Comments'])
    dd_titles[row['Id']].append(row['Title'] or '')  # Effectively removes `None`.

result = []
for id_ in dd_body:  # All three dictionaries have the same keys.
    body = ' '.join(dd_body[id_]).strip()
    comments = ' '.join(dd_comments[id_]).strip()
    titles = ' '.join(dd_titles[id_]).strip() or None
    result.append({'Id': id_, 'Body': body, 'Title': titles, 'Comments': comments})
>>> result
[{'Id': 4726,
  'Body': 'Hello from John Hello from Mary Hello from Dylan',
  'Title': None,
  'Comments': 'Dallas.  Austin Boston'}]
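
The grouping idea pays off when the list contains more than one Id. A possible generalisation, sketched here with a nested defaultdict so that no field names are hard-coded and None is skipped in every field (merge_by_id and the sample rows are hypothetical, not part of the question):

```python
from collections import defaultdict


def merge_by_id(records, id_key='Id'):
    """Group records by id_key and join each field's non-None values with spaces."""
    grouped = defaultdict(lambda: defaultdict(list))
    for rec in records:
        fields = grouped[rec[id_key]]
        for key, value in rec.items():
            if key == id_key:
                continue
            parts = fields[key]  # creates the field even when the value is None
            if value is not None:
                parts.append(str(value).strip())
    return [
        {id_key: id_, **{k: " ".join(v) or None for k, v in fields.items()}}
        for id_, fields in grouped.items()
    ]


rows = [
    {'Id': 1, 'Body': 'a', 'Title': None},
    {'Id': 2, 'Body': 'x', 'Title': 'T'},
    {'Id': 1, 'Body': 'b', 'Title': None},
]
merged_rows = merge_by_id(rows)
print(merged_rows)
# [{'Id': 1, 'Body': 'a b', 'Title': None}, {'Id': 2, 'Body': 'x', 'Title': 'T'}]
```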

It is less Pythonic than the other answers, but I think it is easy to follow.

body, title, comments = "", "", ""
list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]

id_ = list_of_dicts[0]['Id']  # id_ and row avoid shadowing the built-ins id and dict

for row in list_of_dicts:
    # Prepend a space so consecutive values do not run together.
    if row['Body'] is not None:
        body = body + " " + row['Body']

    if row['Title'] is not None:
        title = title + " " + row['Title']

    if row['Comments'] is not None:
        comments = comments + " " + row['Comments']

# Drop the leading space; an empty string means every value was None.
body = body.strip() or None
title = title.strip() or None
comments = comments.strip() or None

record = {'Id': id_, 'Body': body, 'Title': title, 'Comments': comments}

If only the Title field can be None, you can shorten this by dropping the checks on the other fields.

body, title, comments = "", "", ""
list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]

id_ = list_of_dicts[0]['Id']

for row in list_of_dicts:
    # Prepend a space so consecutive values do not run together.
    body = body + " " + row['Body']
    comments = comments + " " + row['Comments']

    if row['Title'] is not None:
        title = title + " " + row['Title']

body = body.strip()
comments = comments.strip()
title = title.strip() or None  # empty means every Title was None

record = {'Id': id_, 'Body': body, 'Title': title, 'Comments': comments}

For this kind of data manipulation, pandas is your friend.

import pandas as pd

# Your list of dictionaries.
list_of_dicts = [
    {'Id': 4726, 'Body': 'Hello from John', 'Title': None, 'Comments': 'Dallas. '},
    {'Id': 4726, 'Body': 'Hello from Mary', 'Title': None, 'Comments': "Austin"},
    {'Id': 4726, 'Body': 'Hello from Dylan', 'Title': None, 'Comments': "Boston"},
]

# Can be read into a pandas dataframe
df = pd.DataFrame(list_of_dicts)

# Do a database style groupby() and apply the function that you want to each group
group_transformed_df = df.groupby('Id').agg(lambda x: ' '.join(x)).reset_index() # I do reset_index to get a normal DataFrame back.

# DataFrame() -> dict
output_dict = group_transformed_df.to_dict('records')

A DataFrame can be converted to several kinds of dict. Here you want the 'records' option.

Does Title always have the value None, or do you need to be able to ignore this value in any element?

Title is sometimes None and sometimes a legitimate string. I should not ignore the value of Title. I tried filter(None, ...).

I fixed it in my demo but not in the answer: .remove() modifies the set in place; it does not return the set.

I only just noticed the OP's None-typed column, which pandas.groupby ignores. The solution above does not handle that.
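
To illustrate the comment about .remove(): the method mutates the set and returns None, so chaining it onto the expression that builds the set loses the set. A short sketch with made-up variable names, including a non-mutating alternative based on set difference:

```python
keys = {'Id', 'Body', 'Title', 'Comments'}

returned = keys.remove('Id')   # mutates keys in place
print(returned)                # None
print(sorted(keys))            # ['Body', 'Comments', 'Title']

# Set difference builds a new set and leaves the original untouched.
all_keys = {'Id', 'Body', 'Title', 'Comments'}
without_id = all_keys - {'Id'}
print(sorted(without_id))      # ['Body', 'Comments', 'Title']
```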