Algorithm: intersecting sets to produce a set of sets with collectively unique elements
Suppose I have the following sets:
X -> {1, 2, 3}
Y -> {1, 4, 7}
Z -> {1, 4, 5}
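I'm looking for a combination of intersections that yields a number of sets in which every element is unique across all of the sets. (In practice this is a hash of sets, where each element refers back to the sets it was intersected from):
The solution must satisfy the following conditions:
- For each initial set, every element ends up in the resulting set created by the intersection of the maximum number of initial sets
- That is, every element of an initial set must be in exactly one resulting set
- The sets are effectively infinite, so iterating over all valid elements is not feasible, but set operations are fine
- Any resulting set that contains no elements can be ignored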
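To make the conditions above concrete, here is a small checker run against the grouping they imply for X, Y and Z. It is only a sketch: `check_result` and the dictionary layout (a tuple of source indices mapped to a set of elements) are assumptions made for illustration, and it uses ordinary finite Python sets, so it suits small test cases rather than the effectively infinite sets described above.

```python
def check_result(sources, result):
    """Check the stated conditions on a candidate result.

    `result` maps a tuple of source indices to the elements that
    belong to exactly those sources (hypothetical layout).
    """
    for members, elements in result.items():
        # Empty result sets should simply be left out.
        assert elements, "empty result set"
        for i, src in enumerate(sources):
            if i in members:
                # The part lies inside every source it references...
                assert elements <= src
            else:
                # ...and touches no other source (maximal intersection).
                assert not (elements & src)
    for i, src in enumerate(sources):
        pieces = [e for m, e in result.items() if i in m]
        # Each source is exactly the union of its parts,
        assert set().union(*pieces) == src
        # and the parts do not overlap.
        assert sum(len(p) for p in pieces) == len(src)
    return True


X, Y, Z = {1, 2, 3}, {1, 4, 7}, {1, 4, 5}
expected = {
    (0, 1, 2): {1},
    (1, 2): {4},
    (0,): {2, 3},
    (1,): {7},
    (2,): {5},
}
print(check_result([X, Y, Z], expected))  # prints True
```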
resulting_sets = {}
for sets in powerset(S):
    s = intersection(sets)
    for rs in resulting_sets.keys():
        s -= rs
    if not s.empty():
        resulting_sets[s] = sets # realistically some kind of reference to sets
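For reference, the pseudocode can be turned into runnable Python roughly as follows. This is a sketch under assumptions: `naive_partition` is a hypothetical name, the built-in `set` stands in for the real set type, and subsets are visited from largest to smallest, which the pseudocode leaves implicit but which is needed so that the widest intersection claims each element first.

```python
from itertools import combinations


def naive_partition(sources):
    """Brute force over the power set, widest intersections first."""
    resulting_sets = {}
    n = len(sources)
    for k in range(n, 0, -1):
        for idx in combinations(range(n), k):
            # Intersection of the chosen sources (copies, so inputs stay intact).
            s = set.intersection(*(set(sources[i]) for i in idx))
            # Drop everything already claimed by a wider intersection.
            for rs in resulting_sets:
                s -= rs
            if s:
                # frozenset so the part can serve as a dictionary key.
                resulting_sets[frozenset(s)] = idx
    return resulting_sets


parts = naive_partition([{1, 2, 3}, {1, 4, 7}, {1, 4, 5}])
for elements, members in parts.items():
    print(sorted(elements), members)
```

The outer loops still visit every non-empty subset of the sources, so the number of set operations grows exponentially with the number of sources.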
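Of course, the above is terribly inefficient at O(2^n * 2^(n/2)) set operations (an earlier estimate of O(n^2 log(n)) was wrong), and for my purposes it may already be running n^2 times. Is there a better solution for this type of problem?

Update: without iterating over any set, using only set operations.

This algorithm builds the result sets constructively: each time we see a new source set, we modify the existing unique-element sets and/or add new ones.

The idea is that each new set can be split into two parts, one containing values already seen and one containing new unique values. The first part is further divided into subsets by the current result sets (at most as many as the size of the power set of the source sets seen so far). Each of those result sets in turn splits into two parts, one that intersects the new source set and one that does not. The job is to update the result sets for each of these categories.

In terms of set operations, the complexity should be O(n*2^n). For the solution posted by the OP, I think the complexity should be O(2^(2n)), because len(resulting_sets) has up to 2^n elements in the worst case.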
def solution(sets):
    result_sets = []  # list of (unique element set, membership) tuples
    for sid, s in enumerate(sets):
        s = set(s)  # work on a copy so the caller's sets are not modified
        new_sets = []
        for unique_elements, membership in result_sets:
            # The intersect part has wider membership, while the other part
            # has fewer unique elements (maybe none).
            # The wider membership cannot have been seen before, so add as new.
            intersect = unique_elements & s
            # Special case: if all unique elements exist in s, update
            # in place
            if len(intersect) == len(unique_elements):
                membership.append(sid)
            elif len(intersect) != 0:
                unique_elements -= intersect
                new_sets.append((intersect, membership + [sid]))
            s -= intersect
            if len(s) == 0:
                break
        # Python's for/else: this branch runs only if the loop did not break,
        # i.e. there are remaining elements in s.
        # These are the unseen elements: add them as a new result set
        else:
            new_sets.append((s, [sid]))
        result_sets.extend(new_sets)
    print(result_sets)

sets = [{1, 2, 3}, {1, 4, 7}, {1, 4, 5}]
solution(sets)
# output (Python 2 repr; Python 3 prints sets as {2, 3}):
# [(set([2, 3]), [0]), (set([1]), [0, 1, 2]), (set([7]), [1]), (set([4]), [1, 2]), (set([5]), [2])]
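---------------Original answer below---------------
The idea is to find the "membership" of each unique element, i.e. which sets it belongs to. We then build a dictionary that groups all elements by their membership, which yields the requested sets. The complexity is O(n*len(sets)), or O(n^2) in the worst case.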
def solution(sets):
    union = set().union(*sets)
    memberships = {}
    for e in union:
        # Indices of all source sets that contain e
        membership = tuple(i for i, s in enumerate(sets) if e in s)
        if membership not in memberships:
            memberships[membership] = []
        memberships[membership].append(e)
    print(memberships)

sets = [{1, 2, 3}, {1, 4, 7}, {1, 4, 5}]
solution(sets)
# output (key order may vary):
# {(0, 1, 2): [1], (1, 2): [4], (0,): [2, 3], (1,): [7], (2,): [5]}
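The same grouping can be written with `collections.defaultdict` so that the result is returned rather than printed. This is a sketch; `group_by_membership` is a hypothetical name. Like the original answer, it visits every element, so it only suits sets small enough to enumerate.

```python
from collections import defaultdict


def group_by_membership(sets):
    """Group every element by the tuple of source indices that contain it."""
    groups = defaultdict(set)
    for element in set().union(*sets):
        membership = tuple(i for i, s in enumerate(sets) if element in s)
        groups[membership].add(element)
    return dict(groups)


groups = group_by_membership([{1, 2, 3}, {1, 4, 7}, {1, 4, 5}])
print(groups[(1, 2)])  # prints {4}
```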
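Thanks, but unfortunately I can't realistically loop over each element. The sets themselves contain up to 2^64 elements and are tracked by excluded sets and ranges of elements rather than by each individual element. One of the bullet points states that constraint.

@Lindenk Are you counting the complexity in terms of set operations? Note that set intersection is O(min(M,N)), while a membership test is O(1), in terms of the number of elements.

Ouch, you're right. For some reason I assumed a power set was n^2, when it's actually 2^n. Wow, that's even worse.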
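To illustrate that the constructive splitting needs only set operations, the sketch below runs it on sets stored as integer ranges. Everything here is an assumption made for illustration: `RangeSet` is a hypothetical stand-in for the asker's range-tracked sets (not their actual data structure), and `split_by_membership` is a non-mutating restatement of the update's idea that relies only on `&`, `-`, `==` and truthiness.

```python
class RangeSet:
    """Hypothetical integer set stored as sorted, disjoint half-open ranges [lo, hi)."""

    def __init__(self, ranges=()):
        merged = []
        for lo, hi in sorted(r for r in ranges if r[0] < r[1]):
            if merged and lo <= merged[-1][1]:
                # Overlapping or adjacent: extend the previous range.
                merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
            else:
                merged.append((lo, hi))
        self.ranges = tuple(merged)

    def __bool__(self):
        return bool(self.ranges)

    def __eq__(self, other):
        return self.ranges == other.ranges

    def __and__(self, other):
        out = []
        for a_lo, a_hi in self.ranges:
            for b_lo, b_hi in other.ranges:
                lo, hi = max(a_lo, b_lo), min(a_hi, b_hi)
                if lo < hi:
                    out.append((lo, hi))
        return RangeSet(out)

    def __sub__(self, other):
        pieces = list(self.ranges)
        for b_lo, b_hi in other.ranges:
            remaining = []
            for lo, hi in pieces:
                if lo < min(hi, b_lo):   # piece left of the removed range
                    remaining.append((lo, min(hi, b_lo)))
                if max(lo, b_hi) < hi:   # piece right of the removed range
                    remaining.append((max(lo, b_hi), hi))
            pieces = remaining
        return RangeSet(pieces)


def split_by_membership(sources):
    """Split sources into disjoint parts labelled by the sources containing them."""
    result = []  # [unique_elements, membership] pairs
    for sid, rest in enumerate(sources):
        new_parts = []
        for part in result:
            if not rest:
                break
            common = part[0] & rest
            if not common:
                continue
            if common == part[0]:
                part[1].append(sid)         # whole part also lies in this source
            else:
                part[0] = part[0] - common  # shrink the old part
                new_parts.append([common, part[1] + [sid]])
            rest = rest - common
        if rest:
            new_parts.append([rest, [sid]])  # elements not seen before
        result.extend(new_parts)
    return [(elements, tuple(members)) for elements, members in result]


X = RangeSet([(0, 2**63)])
Y = RangeSet([(2**62, 2**64)])
Z = RangeSet([(0, 10)])
for elements, members in split_by_membership([X, Y, Z]):
    print(members, elements.ranges)
```

For these three ranges the loop yields four parts without ever enumerating an element. A production range set would likely need a faster intersection than the pairwise scan used here.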