Python: removing dictionaries from a list based on a condition
Tags: python, python-3.x
I have the following list of dictionaries. I need to remove the dictionaries that have the same received_on and customer_group values, but keep one random item from each group of duplicates.
data = [
    {
        'id': '16e26a4a9f97fa4f',
        'received_on': '2019-11-01 11:05:51',
        'customer_group': 'Life-time Buyer'
    },
    {
        'id': '16db0dd4a42673e2',
        'received_on': '2019-10-09 14:12:29',
        'customer_group': 'Lead'
    },
    {
        'id': '16db0dd4199f5897',
        'received_on': '2019-10-09 14:12:29',
        'customer_group': 'Lead'
    }
]
Expected output:
[
    {
        'id': '16e26a4a9f97fa4f',
        'received_on': '2019-11-01 11:05:51',
        'customer_group': 'Life-time Buyer'
    },
    {
        'id': '16db0dd4199f5897',
        'received_on': '2019-10-09 14:12:29',
        'customer_group': 'Lead'
    }
]
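Here is one way to keep the first dictionary for each unique received_on datetime. If you want a random item instead, you can shuffle the list first. Output: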
[{'id': '16e26a4a9f97fa4f',
'received_on': '2019-11-01 11:05:51',
'customer_group': 'Life-time Buyer'},
{'id': '16db0dd4a42673e2',
'received_on': '2019-10-09 14:12:29',
'customer_group': 'Lead'}]
[{'id': '16e26a4a9f97fa4f', 'received_on': '2019-11-01 11:05:51', 'customer_group': 'Life-time Buyer'}, {'id': '16db0dd4199f5897', 'received_on': '2019-10-09 14:12:29', 'customer_group': 'Lead'}]
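The code for this answer did not survive in the text, so the following is only a minimal sketch of the idea it describes, not the answerer's original code. The function name `keep_first_per_received_on` and its `shuffle` and `rng` parameters are assumptions made for illustration.

```python
import random

data = [
    {'id': '16e26a4a9f97fa4f', 'received_on': '2019-11-01 11:05:51',
     'customer_group': 'Life-time Buyer'},
    {'id': '16db0dd4a42673e2', 'received_on': '2019-10-09 14:12:29',
     'customer_group': 'Lead'},
    {'id': '16db0dd4199f5897', 'received_on': '2019-10-09 14:12:29',
     'customer_group': 'Lead'},
]


def keep_first_per_received_on(records, shuffle=False, rng=None):
    """Hypothetical helper: keep the first record seen for each received_on."""
    items = list(records)  # work on a copy; the input list is not modified
    if shuffle:
        # Shuffling first turns "first seen" into a random pick among duplicates.
        (rng or random).shuffle(items)
    first_seen = {}
    for record in items:
        # setdefault stores the record only if the key is not present yet.
        first_seen.setdefault(record['received_on'], record)
    return list(first_seen.values())


print(keep_first_per_received_on(data))                # first occurrence wins
print(keep_first_per_received_on(data, shuffle=True))  # random among duplicates
```

With `shuffle=True` the order of the result follows the shuffled order rather than the order of the original list.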
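I think it is easier to add the dictionaries whose received_on has not been seen so far than to filter out the ones with a duplicate received_on:
result = []
receivedList = []
for d in data:
    # Keep d only if its received_on value has not been seen yet.
    if d['received_on'] not in receivedList:
        result.append(d)
        receivedList.append(d['received_on'])
print(result)
Output: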
[{'customer_group': 'Life-time Buyer',
'id': '16e26a4a9f97fa4f',
'received_on': '2019-11-01 11:05:51'},
{'customer_group': 'Lead',
'id': '16db0dd4a42673e2',
'received_on': '2019-10-09 14:12:29'}]
A better approach is to append to a new list:
data = [
    {
        'id': '16e26a4a9f97fa4f',
        'received_on': '2019-11-01 11:05:51',
        'customer_group': 'Life-time Buyer'
    },
    {
        'id': '16db0dd4a42673e2',
        'received_on': '2019-10-09 14:12:29',
        'customer_group': 'Lead'
    },
    {
        'id': '16db0dd4199f5897',
        'received_on': '2019-10-09 14:12:29',
        'customer_group': 'Lead'
    }
]
unique_received = []
unique_customer_group = []
unique_data = []
for i in data:
    # Note: an item is kept only if BOTH its customer_group and its
    # received_on are new, which is stricter than matching on the pair.
    if i['customer_group'] not in unique_customer_group:
        if i['received_on'] not in unique_received:
            unique_data.append(i)
            unique_received.append(i['received_on'])
            unique_customer_group.append(i['customer_group'])
print(unique_data)
Output:
[
    {
        'id': '16e26a4a9f97fa4f',
        'received_on': '2019-11-01 11:05:51',
        'customer_group': 'Life-time Buyer'
    },
    {
        'id': '16db0dd4a42673e2',
        'received_on': '2019-10-09 14:12:29',
        'customer_group': 'Lead'
    }
]
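Using some of the ideas above, I also wanted customer_group as a second condition, in addition to received_on. I got the expected result:
conditions, result = [], []
for d in data:
    # Two dictionaries count as duplicates when both values match.
    condition = (d['received_on'], d['customer_group'])
    if condition not in conditions:
        result.append(d)
        conditions.append(condition)
print(len(result))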
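The same loop can track the seen pairs in a set instead of a list, which makes each membership test constant time on average rather than linear in the number of pairs seen. This is a sketch only; the function name `dedupe_by_keys` and its `keys` parameter are hypothetical. Like the loop above, it keeps the first occurrence, not a random one.

```python
data = [
    {'id': '16e26a4a9f97fa4f', 'received_on': '2019-11-01 11:05:51',
     'customer_group': 'Life-time Buyer'},
    {'id': '16db0dd4a42673e2', 'received_on': '2019-10-09 14:12:29',
     'customer_group': 'Lead'},
    {'id': '16db0dd4199f5897', 'received_on': '2019-10-09 14:12:29',
     'customer_group': 'Lead'},
]


def dedupe_by_keys(records, keys=('received_on', 'customer_group')):
    """Hypothetical helper: keep the first record for each combination of keys."""
    seen = set()    # tuples are hashable, so they can be stored in a set
    result = []
    for record in records:
        condition = tuple(record[k] for k in keys)
        if condition not in seen:
            seen.add(condition)
            result.append(record)
    return result


result = dedupe_by_keys(data)
print(len(result))  # 2
```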
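You can use itertools.groupby with a custom key, then use random.choice on each group returned. groupby only groups adjacent items, so sort the list by the same key first:
import itertools
import random
keyfunc = lambda x: (x['received_on'], x['customer_group'])
data.sort(key=keyfunc)
Group:
g = itertools.groupby(data, keyfunc)
Choosing a random element requires converting each group iterator to a sequence:
result = [random.choice(list(group)) for k, group in g]
Normally I would keep the key function separate, especially since it is used twice, and combine only the last two steps:
result = [random.choice(list(group)) for k, group in itertools.groupby(data, keyfunc)]
However, you could use sorted to write one huge, redundant one-liner:
result = [random.choice(list(group)) for k, group in itertools.groupby(sorted(data, key=lambda x: (x['received_on'], x['customer_group'])), key=lambda x: (x['received_on'], x['customer_group']))]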
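Put together, the steps above might look like the following self-contained sketch. The function name `pick_random_per_group` and the `rng` parameter are assumptions added for illustration; it uses `sorted` rather than `data.sort` so the caller's list is left untouched.

```python
import itertools
import random

data = [
    {'id': '16e26a4a9f97fa4f', 'received_on': '2019-11-01 11:05:51',
     'customer_group': 'Life-time Buyer'},
    {'id': '16db0dd4a42673e2', 'received_on': '2019-10-09 14:12:29',
     'customer_group': 'Lead'},
    {'id': '16db0dd4199f5897', 'received_on': '2019-10-09 14:12:29',
     'customer_group': 'Lead'},
]


def pick_random_per_group(records, rng=random):
    """Hypothetical helper: one random record per (received_on, customer_group)."""
    def keyfunc(record):
        return (record['received_on'], record['customer_group'])

    # groupby only merges adjacent items, so equal keys must be made adjacent.
    ordered = sorted(records, key=keyfunc)
    return [rng.choice(list(group))
            for _, group in itertools.groupby(ordered, keyfunc)]


result = pick_random_per_group(data)
print(result)
```

The result comes back in key order (sorted by received_on, then customer_group), not in the order of the input list.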
Here is an idea:
import random
data = [
    {
        'id': '16e26a4a9f97fa4f',
        'received_on': '2019-11-01 11:05:51',
        'customer_group': 'Life-time Buyer'
    },
    {
        'id': '16db0dd4a42673e2',
        'received_on': '2019-10-09 14:12:29',
        'customer_group': 'Lead'
    },
    {
        'id': '16db0dd4199f5897',
        'received_on': '2019-10-09 14:12:29',
        'customer_group': 'Lead'
    }
]
r_data = data.copy()
random.shuffle(r_data)
# Iterate over the shuffled copy so the id kept for each key is random
# (a later duplicate overwrites an earlier one).
unique_data = {(elem['received_on'], elem['customer_group']): elem['id']
               for elem in r_data}
new_data = [{'id': val, 'received_on': key[0], 'customer_group': key[1]}
            for key, val in unique_data.items()]
new_data = sorted(new_data, key=lambda x: data.index(x))  # if you need the original order
print(new_data)
Output (one of the following, depending on the shuffle):
[{'id': '16e26a4a9f97fa4f',
'received_on': '2019-11-01 11:05:51',
'customer_group': 'Life-time Buyer'},
{'id': '16db0dd4a42673e2',
'received_on': '2019-10-09 14:12:29',
'customer_group': 'Lead'}]
[{'id': '16e26a4a9f97fa4f', 'received_on': '2019-11-01 11:05:51', 'customer_group': 'Life-time Buyer'}, {'id': '16db0dd4199f5897', 'received_on': '2019-10-09 14:12:29', 'customer_group': 'Lead'}]
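An alternative sketch that avoids both the shuffle and the final re-sort: collect the duplicates of each pair into a list, then draw one item per list. The function name `pick_random_keep_order` and the `rng` parameter are hypothetical, and the ordering relies on dictionaries preserving insertion order (Python 3.7 and later).

```python
import random
from collections import defaultdict

data = [
    {'id': '16e26a4a9f97fa4f', 'received_on': '2019-11-01 11:05:51',
     'customer_group': 'Life-time Buyer'},
    {'id': '16db0dd4a42673e2', 'received_on': '2019-10-09 14:12:29',
     'customer_group': 'Lead'},
    {'id': '16db0dd4199f5897', 'received_on': '2019-10-09 14:12:29',
     'customer_group': 'Lead'},
]


def pick_random_keep_order(records, rng=random):
    """Hypothetical helper: random pick per pair, groups in first-seen order."""
    groups = defaultdict(list)
    for record in records:
        key = (record['received_on'], record['customer_group'])
        groups[key].append(record)
    # Each group is already a list, so random.choice can draw from it directly.
    return [rng.choice(group) for group in groups.values()]


new_data = pick_random_keep_order(data)
print(new_data)
```

Because the original dictionaries are returned as they are, any extra fields they carry are kept without having to rebuild each item.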
You can use if-else and a for loop to add the unique items rather than removing the duplicates.
What have you tried so far?
@MisterMiyagi I posted an answer a while ago.
Consider using a set for conditions.
This does not satisfy your own criterion of random selection.