Python: algorithm to find the smallest set of lists that together make up a target list


Let's say I have the following:

{
  'a': [1, 2, 3],
  'b': [1, 5],
  'c': [3, 4, 5],
  'd': [1, 3, 5],
  'e': [4]
}

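The desired result is

['a', 'c']

because I want to find which arrays, when merged together (with duplicates removed), form

[1, 2, 3, 4, 5]

Besides finding arrays that merge into the desired result, I also want to find the fewest arrays that need to be merged to get it (because, for example,

['a', 'd', 'e']

also gives the desired result, but

['a', 'c']

is a better solution).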

Note: the dictionary above is only an example; the real dictionary has many keys, each with hundreds of values.
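To make the goal concrete, the condition a candidate answer has to satisfy can be sketched as a small check. The helper name `covers` is hypothetical and not part of the question; it assumes order does not matter and that the merged result must equal the target exactly.

```python
def covers(data, keys, target):
    """Return True if the lists under `keys`, merged with duplicates removed, equal `target`."""
    merged = set()
    for key in keys:
        merged.update(data[key])
    # Compare as sets: ordering is assumed to be irrelevant.
    return merged == set(target)


data = {
    'a': [1, 2, 3],
    'b': [1, 5],
    'c': [3, 4, 5],
    'd': [1, 3, 5],
    'e': [4],
}
target = [1, 2, 3, 4, 5]

print(covers(data, ['a', 'c'], target))       # True
print(covers(data, ['a', 'd', 'e'], target))  # True
print(covers(data, ['a', 'b'], target))       # False
```

Both ['a', 'c'] and ['a', 'd', 'e'] pass this check; the question is how to find the passing selection with the fewest keys.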

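Something along the following lines should work. You will need to adjust it depending on whether you want the result sorted, and so on. This solution assumes that order does not matter.

It prints the solutions from smallest to largest: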

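import itertools

# Named `data` rather than `input` to avoid shadowing the built-in input().
data = {
    "a": [1, 2, 3],
    "b": [1, 5],
    "c": [3, 4, 5],
    "d": [1, 3, 5],
    "e": [4],
}
solution = {1, 2, 3, 4, 5}

# Try every combination size, from a single key up to all keys.
for i in range(1, len(data) + 1):
    for combination in itertools.combinations(data, i):
        # Merge the chosen lists and drop duplicates.
        pot = set(itertools.chain.from_iterable(data[k] for k in combination))
        # Compare as sets, since order does not matter.
        if pot == solution:
            print("This is a solution:", combination)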
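If only the smallest answers are wanted, the same brute-force idea can stop at the first combination size that yields a match. The following is a minimal sketch under that assumption; the function name `smallest_merges` is hypothetical, and the search is still exponential in the worst case.

```python
import itertools


def smallest_merges(data, target):
    """Return all smallest key combinations whose merged lists equal `target`."""
    wanted = set(target)
    # Sizes are tried in increasing order, so the first hit is a smallest one.
    for size in range(1, len(data) + 1):
        found = [
            combo
            for combo in itertools.combinations(data, size)
            if set().union(*(data[k] for k in combo)) == wanted
        ]
        if found:
            return found
    return []  # no combination merges into the target


data = {
    'a': [1, 2, 3],
    'b': [1, 5],
    'c': [3, 4, 5],
    'd': [1, 3, 5],
    'e': [4],
}

print(smallest_merges(data, [1, 2, 3, 4, 5]))  # [('a', 'c')]
```

Returning as soon as one size succeeds avoids enumerating the larger combinations, which is where most of the work goes when there are many keys.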

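The following may not be very elegant, but it is much more efficient than brute-forcing through all combinations. It should run in quadratic time, whereas trying combinations grows exponentially, depending on the number of entries and how the data is spread.

The function below finds a short solution, most likely (but not necessarily) the shortest one.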

from collections import Counter

def findMerge(data,target):
    # identify candidate items (i.e. subsets of the target list)
    target     = set(target)
    candidates = {c:group for c,group in data.items() if target.issuperset(group)}
    
    # compute the overlap between candidates and check coverage
    counts = Counter( n for group in candidates.values() for n in group )
    if any(t not in counts for t in target): return [] 

    # identify candidates that are mandatory and the base set they form
    # (i.e. candidates that are the only ones with a given value)
    mandatory = { c:group for c,group in candidates.items()
                  if any(counts[n]==1 for n in group) }
    baseSet   = set().union(*mandatory.values())
    remaining = target - baseSet
    if not remaining: return list(mandatory)
   
    # identify potentially redundant candidates for remaining values
    redundant = [ (c,remaining.intersection(group))
                  for c,group in candidates.items() if c not in mandatory ]

    # remove redundant candidates (smallest first)
    # note: using combinations only on redundant keys may be affordable here
    #       and could be used to return all solutions or ensure shortest
    redundant = sorted(redundant,key=lambda cg:len(cg[1]))
    for r,rGroup in redundant:
        if all(counts[n]>1 for n in rGroup):
            counts.subtract(rGroup)
            del candidates[r]
        
    return list(candidates)

Output for the small sample:

data = {
  'a': [1, 2, 3],
  'b': [1, 5],
  'c': [3, 4, 5],
  'd': [1, 3, 5],
  'e': [4]
}

print(findMerge(data,[1,2,3,4,5])) # ['a', 'c']

For larger samples, the time difference compared with combinations will be significant:

data = {
  'a': [1, 2, 3],
  'b': [1, 5, 0],
  'c': [3, 2, 5],
  'd': [1, 3, 5],
  'e': [4],
  'f': [1, 2, 5],
  'g': [1, 5, 8],
  'h': [3, 4, 7],
  'i': [1, 6, 5],
  'j': [4],
  'k': [9],
  'l': [1, 3, 5],
  'm': [4],
  'n': [1, 2, 5],
  'o': [1, 5, 8],
  'p': [3, 4, 7],
  'q': [1, 5, 8],
  'r': [3, 4, 7],
  's': [1, 6, 5],
  't': [4],
  'u': [9],
}

target = [0,1,2,3,4,5,6,7,8,9]
print(findMerge(data,target)) # ['b', 'c', 'q', 'r', 's', 'u']
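The comment inside findMerge suggests that running combinations only over the non-mandatory keys could guarantee a shortest answer. A minimal sketch of that idea follows; the name `shortest_merge` is hypothetical, the filtering and mandatory-key steps mirror findMerge, and the search over the remaining keys is still exponential in the worst case.

```python
from collections import Counter
from itertools import combinations


def shortest_merge(data, target):
    """Return one smallest list of keys whose lists merge into `target`, or []."""
    target = set(target)
    # As in findMerge, only lists that are subsets of the target can take part.
    candidates = {k: set(v) for k, v in data.items() if target.issuperset(v)}
    counts = Counter(n for group in candidates.values() for n in group)
    if any(t not in counts for t in target):
        return []  # some target value cannot be produced at all

    # Keys that are the only source of some value belong to every solution.
    forced = [k for k, group in candidates.items()
              if any(counts[n] == 1 for n in group)]
    remaining = target - set().union(*(candidates[k] for k in forced))
    if not remaining:
        return forced

    # Keep only what each optional key adds towards the uncovered values.
    optional = {k: group & remaining
                for k, group in candidates.items() if k not in forced}
    optional = {k: group for k, group in optional.items() if group}

    # Exhaustive search, smallest size first, over the optional keys only.
    for size in range(1, len(optional) + 1):
        for combo in combinations(optional, size):
            if set().union(*(optional[k] for k in combo)) == remaining:
                return forced + list(combo)
    return []


small = {'a': [1, 2, 3], 'b': [1, 5], 'c': [3, 4, 5], 'd': [1, 3, 5], 'e': [4]}
print(shortest_merge(small, [1, 2, 3, 4, 5]))  # ['a', 'c']
```

For the larger sample above, this search also needs six keys, so the heuristic's answer there is already a shortest one, although the exhaustive version may pick different keys of the same count.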

Thanks, I think itertools is the tool I need.