Python: merging lists within a list using a conditional clause
I have been trying to merge/parse this list of multiple lists into a single list. The list I want to parse/merge has the following format:
list_one = [ [['id1'],['value']],
[['id1'],['value1'],['value2'],['value3'],['value4'],['value5']],
[['id1'],['value6']],
[['id1'],['value7'],['value8']],
[['id2'],['value']],
[['id2'],['value1'],['value2'],['value3'],['value4'],['value5']],
[['id2'],['value6']],
[['id2'],['value7'],['value8']]
]
After some googling, I found the following code:
import itertools
pre_info = list(set(i[0] for i in itertools.chain.from_iterable(list_one)))
final_info = list(map(lambda x: [x], sorted(pre_info, key=len)))
print final_info
But it only prints the ids.
The desired output is:
final_list = [
[['id'],['value'],['value1'],['value2'],['value3'],['value4'],['value5'],['value6'],['value7'],['value8']],
[['id2'],['value'],['value1'],['value2'],['value3'],['value4'],['value5'],['value6'],['value7'],['value8']]
]
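The condition for each row is obviously the "id", which is always in the first position of each list.
You need to group your values by the unique id; you can't simply flatten the list. You have to either use a dictionary to group the lists by id or, if the lists for each unique id are consecutive, use itertools.groupby().
Using a dictionary: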
by_id = {}
for id, *values in list_one:
    # unwrap values as we add them to the id group
    by_id.setdefault(id[0], []).extend(v[0] for v in values)
# extract all IDs and value lists into a new list
final_list = [[id] + values for id, values in sorted(by_id.items())]
Or, as a Python 2 version:
by_id = {}
for row in list_one:
    # unwrap values as we add them to the id group
    id, values = row[0][0], row[1:]
    by_id.setdefault(id, []).extend(v[0] for v in values)
# extract all IDs and value lists into a new list
final_list = [[id] + values for id, values in sorted(by_id.items())]
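I sorted the output list by id because dictionaries have no inherent order (from Python 3.7 on, a plain dict keeps insertion order, but it is still not sorted by key). Note that I removed the wrapping singleton list objects; they take up memory you don't need to use, and they complicate the problem algorithmically.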
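For reference, here is a minimal sketch of what the flattened result looks like and how the one-element wrappers could be put back if something downstream needs them. The shortened sample data and the `rewrap` helper are assumptions introduced here for illustration; they are not part of the answer above.

```python
list_one = [
    [['id1'], ['value']],
    [['id1'], ['value1'], ['value2']],
    [['id2'], ['value']],
]

# Group the unwrapped values by id, as in the dictionary approach.
by_id = {}
for row in list_one:
    by_id.setdefault(row[0][0], []).extend(v[0] for v in row[1:])
final_list = [[key] + values for key, values in sorted(by_id.items())]


def rewrap(rows):
    # Hypothetical helper: wrap every string in its own one-element list again.
    return [[[item] for item in row] for row in rows]


print(final_list)
print(rewrap(final_list))
```

Keeping the flat form internally and re-wrapping only at the boundary keeps the grouping code simple.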
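If you need those lists in order of first appearance, you could use an order-preserving mapping such as collections.OrderedDict for the by_id dictionary.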
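A minimal sketch of keeping first-appearance order, assuming `collections.OrderedDict` as the mapping (on Python 3.7+ a plain `dict` also preserves insertion order). The sample data is made up so that `id2` appears before `id1`.

```python
from collections import OrderedDict

list_one = [
    [['id2'], ['value']],
    [['id1'], ['value1']],
    [['id2'], ['value6']],
]

by_id = OrderedDict()
for row in list_one:
    by_id.setdefault(row[0][0], []).extend(v[0] for v in row[1:])

# No sorted() here: iteration follows the order in which ids were first seen.
ordered_list = [[key] + values for key, values in by_id.items()]
print(ordered_list)
```

The only change from the sorted variant is the mapping type and dropping the `sorted()` call.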
As stated above, if the lists for each id are already consecutive, you can use itertools.groupby() to do the grouping in a single step:
from itertools import groupby
final_list = [
    [id] + [value[0] for sublist in group for value in sublist[1:]]
    for id, group in groupby(list_one, lambda s: s[0][0])]
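To illustrate why the "consecutive" condition matters, the sketch below runs the same groupby() expression on consecutive and on interleaved rows. The `merge_consecutive` function name and the sample data are assumptions made for this example only.

```python
from itertools import groupby


def merge_consecutive(rows):
    # Same expression as above, wrapped in a (hypothetical) function.
    return [
        [key] + [value[0] for sublist in group for value in sublist[1:]]
        for key, group in groupby(rows, lambda s: s[0][0])
    ]


consecutive = [
    [['id1'], ['value']],
    [['id1'], ['value1'], ['value2']],
    [['id2'], ['value']],
]
interleaved = [
    [['id1'], ['value']],
    [['id2'], ['value']],
    [['id1'], ['value1']],
]

print(merge_consecutive(consecutive))
# groupby() starts a new group whenever the key changes, so id1 appears twice:
print(merge_consecutive(interleaved))
# Sorting by id first (sorted() is stable) makes the rows consecutive again:
print(merge_consecutive(sorted(interleaved, key=lambda s: s[0][0])))
```

If the input order cannot be guaranteed, either sort first as shown or fall back to the dictionary approach.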
If you feel you must have those singleton lists in the output, feel free to add them back.
You can try this:
import collections
list_one = [ [['id1'],['value']],
[['id1'],['value1'],['value2'],['value3'],['value4'],['value5']],
[['id1'],['value6']],
[['id1'],['value7'],['value8']],
[['id2'],['value']],
[['id2'],['value1'],['value2'],['value3'],['value4'],['value5']],
[['id2'],['value6']],
[['id2'],['value7'],['value8']]
]
d = collections.defaultdict(list)
for row in list_one:
    d[row[0][0]].extend(row[1:])
final_output = sorted([[[a]] + b for a, b in d.items()], key=lambda x: int(x[0][0][-1]))
Output:
[[['id1'], ['value'], ['value1'], ['value2'], ['value3'], ['value4'], ['value5'], ['value6'], ['value7'], ['value8']], [['id2'], ['value'], ['value1'], ['value2'], ['value3'], ['value4'], ['value5'], ['value6'], ['value7'], ['value8']]]
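One caveat worth noting: the sort key above looks only at the last character of the id, which is fine for id1 and id2 but would misorder ids with more than one digit. A possible alternative, sketched under the assumption that ids end in a decimal number, is shown below; `id_number` is a hypothetical helper, and placing ids without digits first is an arbitrary choice.

```python
import re


def id_number(entry):
    # entry looks like [['id12'], ['value'], ...]; use the trailing digits of the id.
    match = re.search(r'\d+$', entry[0][0])
    # Assumption: ids without trailing digits sort before all numbered ids.
    return int(match.group()) if match else -1


rows = [[['id10'], ['a']], [['id2'], ['b']], [['id1'], ['c']]]

# Last-character key: 'id10' is treated as 0 and ends up first.
last_char = sorted(rows, key=lambda x: int(x[0][0][-1]))
print([r[0][0] for r in last_char])

# Whole-number key: ids come out in numeric order.
whole_number = sorted(rows, key=id_number)
print([r[0][0] for r in whole_number])
```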
The answers above give good solutions. Here is another approach, though I agree with @Martijn Pieters that his solution is the clearer read.
import itertools
chained = itertools.chain.from_iterable(list_one)
schain = set([tuple(c) for c in chained])
{('id',),
('value',),
('value1',),
('value2',),
('value3',),
('value4',),
('value5',),
('value6',),
('value7',),
('value8',)}
list(sorted([list(v) for v in schain]))
[['id'],
['value'],
['value1'],
['value2'],
['value3'],
['value4'],
['value5'],
['value6'],
['value7'],
['value8']]
Edit, for the case where other values are present:
temp = [list(v) for v in schain]
temp.pop(temp.index(['id']))
temp.sort()
temp.insert(0, ['id'])
[['id'],
['abc'],
['value'],
['value1'],
['value2'],
['value3'],
['value4'],
['value5'],
['value6'],
['value7'],
['value8']]
I have this solution, but it only works when the ID is a string or an int and sits at the start of each list:
l=[ [['id1'],['value']],
[['id1'],['value1'],['value2'],['value3'],['value4'],['value5']],
[['id1'],['value6']],
[['id1'],['value7'],['value8']],
[['id2'],['value']],
[['id2'],['value1'],['value2'],['value3'],['value4'],['value5']],
[['id2'],['value6']],
[['id2'],['value7'],['value8']]
]
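d = {}
# create an empty value list for every id
for ll in l:
    d[ll[0][0]] = []
# append each wrapped value to the list of its id
for i, ll in enumerate(l):
    for lll in ll[1:]:
        d[ll[0][0]].append(lll)
result = []
for key, items in d.iteritems():  # Python 2; use d.items() on Python 3
    result.append([[key]] + items)
print result  # Python 2 print statement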
Result:
[[['id2'], ['value'], ['value1'], ['value2'], ['value3'], ['value4'], ['value5'], ['value6'], ['value7'], ['value8']], [['id1'], ['value'], ['value1'], ['value2'], ['value3'], ['value4'], ['value5'], ['value6'], ['value7'], ['value8']]]
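Why still insist on nested lists where every element holds a single element? Why not ['id','value','value1','value2','value3','value4','value5','value6','value7','value8']? Are id1 and id2 always grouped together (so consecutive lists have the same id value, with no mixing of ids)?
@MartijnPieters it gives me a syntax error on "for id, *values in list_one:". Thank you very much for your help.
@RicardoRibeiro: are you using Python 2? I added a version for older Python releases.
@MartijnPieters, yes, I'm using version 2.0. Great reply, with lots of detail and good information. Thank you very much!
@RicardoRibeiro: glad to have been of help! Note that you can only mark one answer as accepted, not both. Pick the one you feel helped you most (or none at all; the choice is entirely yours).
In my tests it passed. Try it on real data!
Thank you very much!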