Python: from a list of lists, how to find the lists that share n common elements

Tags: python, list, subset

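I have a list of lists:

    foo = [[1, 2, 4, 17], [1, 2, 4, 12], [1, 2, 5, 17]]

I am interested in finding all possible groups of lists that share n elements.
Expected results:
Lists sharing 3 elements

Groups containing [1,2,4]:
[[1,2,4,17], [1,2,4,12]]

Groups containing [1,2,17]:
[[1,2,4,17], [1,2,5,17]]

Lists sharing 2 elements

Groups containing [1,2]:
[[1,2,4,17], [1,2,4,12], [1,2,5,17]]

Groups containing [1,4]:
[[1,2,4,17], [1,2,4,12]]

Groups containing [2,4]:
[[1,2,4,17], [1,2,4,12]]

Groups containing [1,17]:
[[1,2,4,17], [1,2,5,17]]

Groups containing [2,17]:
[[1,2,4,17], [1,2,5,17]]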

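What I have tried so far

Intersection of lists: it does not solve my problem, because I cannot control the number of elements I want the lists to share.

What I would like to try seems very complicated, and there must be a more convenient way:

For each list, generate all combinations of the list minus one element (i.e. its subsets).

Loop over the remaining lists and store the lists that contain the subset in a dictionary.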
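The plan described above (enumerate each list's n-element subsets and collect the lists containing each subset in a dictionary) could be sketched roughly as follows. The function name `group_by_shared_subset` is hypothetical, and the sketch assumes that repeated values inside a list can be ignored and that element order does not matter.

```python
from collections import defaultdict
from itertools import combinations


def group_by_shared_subset(lists, n):
    """Map each n-element combination to the lists that contain it."""
    index = defaultdict(list)
    for lst in lists:
        # sorted(set(...)) gives every combination a canonical form.
        # Assumption: duplicates inside a list are ignored.
        for comb in combinations(sorted(set(lst)), n):
            index[comb].append(lst)
    # Keep only the combinations shared by at least two lists.
    return {comb: group for comb, group in index.items() if len(group) > 1}


foo = [[1, 2, 4, 17], [1, 2, 4, 12], [1, 2, 5, 17]]
for comb, group in group_by_shared_subset(foo, 3).items():
    print(comb, group)
```

Because each list is visited once and every combination is a dictionary key, no list has to be compared against all the others; the cost is dominated by the number of combinations per list.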

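I have posted a minimal example here, but in reality I have a list of hundreds of lists, each containing 6 elements, so I am also worried about combinatorial explosion.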
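As a rough check on the combinatorial-explosion worry, the number of n-element combinations that would have to be examined can be bounded with `math.comb`. The helper `max_combinations` is hypothetical, and 500 lists is an assumed size chosen only for illustration; the 6 elements per list come from the question.

```python
from math import comb


def max_combinations(num_lists, list_length, n):
    """Upper bound on the number of n-element combinations to examine."""
    return num_lists * comb(list_length, n)


# Assumed size: 500 lists of 6 elements each.
for n in range(1, 7):
    print(n, max_combinations(500, 6, n))
```

With 6 elements per list, a single list never yields more than 20 combinations (the case n=3), so the bound grows linearly with the number of lists.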
If anyone could offer some guidance or tips, that would be great.
Many thanks,
Best
This will help you:

    from itertools import combinations

    def isSubset(comb, lst):
        # Checks that the elements of comb appear in lst in the same order
        # (`c in it` consumes the iterator as it searches).
        it = iter(lst)
        return all(c in it for c in comb)

    foo = [[1, 2, 4, 17], [1, 2, 4, 12], [1, 2, 5, 17]]
    n = 3
    print('-' * 100)
    print(f"n = {n}")
    existing = []  # combinations that have already been reported
    for index in range(len(foo)):
        combs = combinations(foo[index], n)
        for comb in combs:
            occurrences = 0
            curr_lst = []
            for lst in foo:
                if isSubset(comb, lst):
                    if comb not in existing:
                        occurrences += 1
                        curr_lst.append(lst)
                        if occurrences >= 2:
                            if occurrences == 2:
                                # Second match: print the header and both lists.
                                print('-' * 100)
                                print(f"Groups containing {comb}")
                                for elem in curr_lst:
                                    print(elem)
                            else:
                                # Third and later matches: print the new list only.
                                print(lst)
            existing.append(comb)
Output:

    ----------------------------------------------------------------------------------------------------
    n = 3
    ----------------------------------------------------------------------------------------------------
    Groups containing (1, 2, 4)
    [1, 2, 4, 17]
    [1, 2, 4, 12]
    ----------------------------------------------------------------------------------------------------
    Groups containing (1, 2, 17)
    [1, 2, 4, 17]
    [1, 2, 5, 17]
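A note on the `isSubset` helper in the code above: because it searches a single shared iterator, it is really an ordered subsequence test, which matches a subset test only when the lists keep a consistent order (as the sample lists do). The small sketch below illustrates the difference; both function names are hypothetical and not part of the answer.

```python
def is_subsequence(comb, lst):
    """Same idea as isSubset above: the elements must appear in lst in order."""
    it = iter(lst)
    return all(c in it for c in comb)


def is_subset(comb, lst):
    """Order-insensitive alternative that ignores repeated values."""
    return set(comb).issubset(lst)


print(is_subsequence((1, 17), [1, 2, 5, 17]))  # True
print(is_subsequence((17, 1), [1, 2, 5, 17]))  # False: 1 comes before 17
print(is_subset((17, 1), [1, 2, 5, 17]))       # True
```

If the inner lists are not guaranteed to be sorted, sorting them first or switching to the set-based check may be safer.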


Output for n=2:

    ----------------------------------------------------------------------------------------------------
    n = 2
    ----------------------------------------------------------------------------------------------------
    Groups containing (1, 2)
    [1, 2, 4, 17]
    [1, 2, 4, 12]
    [1, 2, 5, 17]
    ----------------------------------------------------------------------------------------------------
    Groups containing (1, 4)
    [1, 2, 4, 17]
    [1, 2, 4, 12]
    ----------------------------------------------------------------------------------------------------
    Groups containing (1, 17)
    [1, 2, 4, 17]
    [1, 2, 5, 17]
    ----------------------------------------------------------------------------------------------------
    Groups containing (2, 4)
    [1, 2, 4, 17]
    [1, 2, 4, 12]
    ----------------------------------------------------------------------------------------------------
    Groups containing (2, 17)
    [1, 2, 4, 17]
    [1, 2, 5, 17]

I don't think there is a simple way around generating all combinations of n elements for each inner list. But each combination only needs to be checked once:

    from itertools import combinations
    import random

    n = 3
    foo = [[random.randint(0, 100) for _ in range(6)] for _ in range(500)]
    # foo = [[1, 2, 4, 17], [1, 2, 4, 12], [1, 2, 5, 17]]

    checked = set()  # already checked combinations
    result = []
    for lst in foo:
        cbs = combinations(lst, n)
        for comb in cbs:
            if comb not in checked:
                # Collect every list that contains all elements of comb.
                groups = [l for l in foo if all(i in l for i in comb)]
                if len(groups) > 1:
                    result.append((comb, groups))
                checked.add(comb)
    print(result)

Output (using the commented-out sample foo):

    [((1, 2, 4), [[1, 2, 4, 17], [1, 2, 4, 12]]), ((1, 2, 17), [[1, 2, 4, 17], [1, 2, 5, 17]])]

Performance: for a randomly generated list with 500 sublists, values in the range 0-100 or 0-1000, and n=2 or n=3, the code takes a few seconds to finish.

Comments:

"Intersection of lists: it does not solve my problem, because I cannot control the number of elements I want the lists to share." Actually, if you find the largest common subset of two lists, those two lists share every combination of that subset's elements. For example, if list 1 and list 2 have the intersection [1,2,3,4,5], they share [1,2], [1,3], [1,5], [2,1], [4,5], [1,2,3], [3,4,5], [1,2,3,4], [2,3,4,5], so you can reduce your problem to finding the combinations of the intersection of two lists. What do you think?

Are duplicate values possible, such as [1,1,2,3]?

I think this is better than iterating over all combinations.

@Wups yes, duplicate values are possible.

@Bruck1701 thank you for your answer. If I understand correctly, you suggest computing, for each list, the largest common subset with the remaining lists?

@python_learner: so do you know a better solution? Some problems cannot be solved without the complexity growing exponentially.

I did not downvote, so I am not sure why you are asking me for a better solution.

I did not downvote either, but I agree that if the range of numbers in the inner lists is too large, it becomes a problem for hundreds of lists.

@Sushil so far this is definitely the closest to what I need. I will try it and keep you updated.

If a sublist is [1,1,2,3], a combination can be [1,1]; converted to a set, that is {1}, which, if I understand correctly, gives a wrong result.

Thanks @wups, I think this is the answer that best fits my needs. Performance is also very good (less than a second on my current data).
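The intersection idea and the duplicate-value concern raised in the comments can be combined in a small sketch. It is not taken from either answer: `pairs_sharing_at_least` is a hypothetical name, and the sketch assumes that a repeated value should count as shared only as many times as it occurs in both lists, which is what `Counter & Counter` computes.

```python
from collections import Counter
from itertools import combinations


def pairs_sharing_at_least(lists, n):
    """Yield (list_a, list_b, shared) for pairs with at least n common elements."""
    for a, b in combinations(lists, 2):
        # Counter & Counter keeps the minimum count of each element, so
        # [1, 1, 2, 3] and [1, 2, 5, 6] share a single 1, not two.
        shared = Counter(a) & Counter(b)
        if sum(shared.values()) >= n:
            yield a, b, sorted(shared.elements())


foo = [[1, 2, 4, 17], [1, 2, 4, 12], [1, 2, 5, 17]]
for a, b, shared in pairs_sharing_at_least(foo, 3):
    print(shared, a, b)
```

This only reports pairs of lists. To recover the per-combination groups asked for in the question, the n-element combinations of each `shared` result would still have to be expanded and merged.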