Python: merge tuples if they have one common element


Consider the following list:

tuple_list = [('c', 'e'), ('c', 'd'), ('a', 'b'), ('d', 'e')]

How can I get this result?

new_tuple_list = [('c', 'e', 'd'), ('a', 'b')]

I tried:

for tuple in tuple_list:
    for tup in tuple_list:
        if tuple[0] == tup[0]:
            new_tup = (tuple[0],tuple[1],tup[1])
            new_tuple_list.append(new_tup)

But it only works when the elements of the tuples appear in a particular order, which means it produces this instead:

new_tuple_list = [('c', 'e', 'd'), ('a', 'b'), ('d', 'e')]

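You can treat the tuples as edges of a graph and your goal as finding the connected components of that graph. Then you can simply loop over the vertices (the items in the tuples) and, for each vertex that has not been visited yet, run a depth-first search (DFS) to generate its component: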

from collections import defaultdict

def dfs(adj_list, visited, vertex, result, key):
    visited.add(vertex)
    result[key].append(vertex)
    for neighbor in adj_list[vertex]:
        if neighbor not in visited:
            dfs(adj_list, visited, neighbor, result, key)

edges = [('c', 'e'), ('c', 'd'), ('a', 'b'), ('d', 'e')]

adj_list = defaultdict(list)
for x, y in edges:
    adj_list[x].append(y)
    adj_list[y].append(x)

result = defaultdict(list)
visited = set()
for vertex in adj_list:
    if vertex not in visited:
        dfs(adj_list, visited, vertex, result, vertex)

print(list(result.values()))  # list() so Python 3 prints a plain list

Output:

[['a', 'b'], ['c', 'e', 'd']]

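Note that in the example above the order of the components, and of the elements within each component, depends on dictionary ordering and should not be relied on. On Python 3.7+, where dicts preserve insertion order, the groups come out as [['c', 'e', 'd'], ['a', 'b']].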
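For long chains of tuples, the recursive dfs above could hit Python's recursion limit. Below is a minimal sketch of an iterative variant that uses an explicit stack. The function name connected_components is an illustrative choice, not part of the original answer, and the sketch assumes every tuple has exactly two items, as in the question.

```python
from collections import defaultdict


def connected_components(edges):
    """Group the vertices of an undirected graph into components (iterative DFS)."""
    # Build an adjacency list; each 2-tuple is treated as an undirected edge.
    adj_list = defaultdict(list)
    for x, y in edges:
        adj_list[x].append(y)
        adj_list[y].append(x)

    visited = set()
    components = []
    for start in adj_list:
        if start in visited:
            continue
        visited.add(start)
        component = []
        stack = [start]
        while stack:
            vertex = stack.pop()
            component.append(vertex)
            for neighbor in adj_list[vertex]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    stack.append(neighbor)
        components.append(tuple(component))
    return components


edges = [('c', 'e'), ('c', 'd'), ('a', 'b'), ('d', 'e')]
print(connected_components(edges))
```

The order of items inside each tuple follows the traversal, so it may differ from the recursive version.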

This performs rather poorly, because the list-contains checks are O(n), but it is quite short:

result = []

for tup in tuple_list:
    for idx, already in enumerate(result):
        # check if any items are equal
        if any(item in already for item in tup):
            # tuples are immutable so we need to set the result item directly
            result[idx] = already + tuple(item for item in tup if item not in already)
            break
    else:
        # else in for-loops are executed only if the loop wasn't terminated by break
        result.append(tup)

This has the nice side effect of preserving the order:

>>> result
[('c', 'e', 'd'), ('a', 'b')]

Use sets. You are checking (initially small) sets for overlap and accumulating them, and Python has a data type for exactly that:

#!python3

#tuple_list = [('c', 'e'), ('c', 'd'), ('a', 'b'), ('d', 'e')]
tuple_list = [(1,2), (3,4), (5,), (1,3,5), (3,'a'),
        (9,8), (7,6), (5,4), (9,'b'), (9,7,4),
        ('c', 'e'), ('e', 'f'), ('d', 'e'), ('d', 'f'),
        ('a', 'b'),
        ]
set_list = []

print("Tuple list:", tuple_list)
for t in tuple_list:
    #print("Set list:", set_list)
    tset = set(t)
    matched = []
    for s in set_list:
        if tset & s:
            s |= tset
            matched.append(s)

    if not matched:
        #print("No matches. New set: ", tset)
        set_list.append(tset)

    elif len(matched) > 1:
        #print("Multiple Matches: ", matched)
        for i,iset in enumerate(matched):
            if not iset:
                continue
            for jset in matched[i+1:]:
                if iset & jset:
                    iset |= jset
                    jset.clear()

set_list = [s for s in set_list if s]
print('\n'.join([str(s) for s in set_list]))

If you do not need duplicate values (for example, being able to preserve ['a', 'a', 'b']), this is a simple and fast way to do what you want with sets:

iset = set([frozenset(s) for s in tuple_list])  # Convert to a set of sets
result = []
while(iset):                  # While there are sets left to process:
    nset = set(iset.pop())      # Pop a new set
    check = len(iset)           # Does iset contain more sets
    while check:                # Until no more sets to check:
        check = False
        for s in iset.copy():       # For each other set:
            if nset.intersection(s):  # if they intersect:
                check = True            # Must recheck previous sets
                iset.remove(s)          # Remove it from remaining sets
                nset.update(s)          # Add it to the current set
    result.append(tuple(nset))  # Convert back to a list of tuples

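This gives a list of tuples, one per merged group. Because sets are unordered, the order of the tuples and of the items within them can vary.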

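I ran into this problem with sets, so I am contributing my solution for that case. It keeps combining sets that share one or more common elements for as long as possible.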

My example data:

data = [['A','B','C'],['B','C','D'],['D'],['X'],['X','Y'],['Y','Z'],['M','N','O'],['M','N','O'],['O','A']]
data = list(map(set,data))

My code to solve the problem:

oldlen = len(data)+1
while len(data)<oldlen:
    oldlen = len(data)
    for i in range(len(data)):
        for j in range(i+1,len(data)):
            if len(data[i]&data[j]):
                data[i] = data[i]|data[j]
                data[j] = set()
    data = [data[i] for i in range(len(data)) if data[i]!= set()]

I ran into this problem while resolving coreferences: I needed to merge the sets in a list of sets that have common elements:

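import copy

def merge(list_of_sets):
    # init states
    list_of_sets = copy.deepcopy(list_of_sets)
    result = []
    indices = find_overlapping_sets(list_of_sets)
    while indices:
        # keep the other sets
        result = [
            s
            for idx, s in enumerate(list_of_sets)
            if idx not in indices
        ]
        # append the merged set
        result.append(
            list_of_sets[indices[0]].union(list_of_sets[indices[1]])
        )
        # update states
        list_of_sets = result
        indices = find_overlapping_sets(list_of_sets)
    return list_of_sets

def find_overlapping_sets(list_of_sets):
    # return the indices of the first overlapping pair, or None if there is none
    for i, i_set in enumerate(list_of_sets):
        for j, j_set in enumerate(list_of_sets[i+1:]):
            if i_set.intersection(j_set):
                return i, i+j+1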

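With a graph-processing library such as networkx, the task becomes trivial. As in the graph-based answer above, you need to find the connected components of the graph formed by the tuples.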

To get the result as a list of tuples:

# G is the networkx graph built from the tuples, e.g. G = nx.Graph(tuple_list)
result = list(map(tuple, nx.connected_components(G)))
print(result)
# [('d', 'e', 'c'), ('a', 'b')]

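Your merge strategy is unclear.

I want to merge every tuple that has a common element: ('c', 'e') and ('c', 'd'), because 'c' is common, which gives ('c', 'e', 'd'); then merge that with ('d', 'e'), because 'd' and 'e' are common, which results in ('c', 'e', 'd').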
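The merge step described in this comment can be sketched as follows. The helper name merge_pair is hypothetical. It keeps items in order of first appearance and drops repeats, which is one reading of the comment rather than a confirmed requirement.

```python
def merge_pair(a, b):
    """Append to `a` the items of `b` that `a` does not already contain."""
    return a + tuple(item for item in b if item not in a)


step1 = merge_pair(('c', 'e'), ('c', 'd'))  # the two tuples share 'c'
step2 = merge_pair(step1, ('d', 'e'))       # 'd' and 'e' are already present
print(step1, step2)
```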

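Can you build on this example, which basically answers a very similar question?

@MSeifert's code is more elegant, but for list_of_sets = [{1, 2}, {2, 3}, {4}, {5, 6}, {4, 5}, {3, 7}] its output is [{1, 2, 3, 7}, {4, 5}, {5, 6}], which is a bit confusing, whereas this approach gives [{4, 5, 6}, {1, 2, 3, 7}]. They may suit different use cases.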
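The difference described in this comment can be reproduced with a short sketch. first_match_merge is a set-based adaptation of the short single-pass loop. merge_connected uses a union-find (disjoint-set) structure, which is a swapped-in technique rather than code from any of the answers. Both function names are hypothetical.

```python
def first_match_merge(groups):
    """Merge each group into the first existing group it overlaps (single pass)."""
    result = []
    for group in groups:
        for idx, already in enumerate(result):
            if already & group:
                result[idx] = already | group
                break
        else:
            result.append(set(group))
    return result


def merge_connected(groups):
    """Merge groups transitively, using union-find over the individual items."""
    parent = {}

    def find(item):
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]  # path halving
            item = parent[item]
        return item

    for group in groups:
        items = list(group)
        for other in items:
            # Link every item of the group to the root of the group's first item.
            parent[find(other)] = find(items[0])

    merged = {}
    for item in parent:
        merged.setdefault(find(item), set()).add(item)
    return list(merged.values())


list_of_sets = [{1, 2}, {2, 3}, {4}, {5, 6}, {4, 5}, {3, 7}]
print(first_match_merge(list_of_sets))  # three groups: {5, 6} never joins {4, 5}
print(merge_connected(list_of_sets))    # two groups
```

The single-pass loop stops at the first overlapping group, so a later set that bridges two existing groups only extends one of them. The union-find version merges transitively, at the cost of not preserving the input order.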

For the example data in the set-based answer above, the result is:

[{'A', 'B', 'C', 'D', 'M', 'N', 'O'}, {'X', 'Y', 'Z'}]

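The networkx snippet for the graph-library answer above, which builds a graph from the tuples and extracts its connected components:

import networkx as nx

tuple_list = [('c', 'e'), ('c', 'd'), ('a', 'b'), ('d', 'e')]
G = nx.Graph(tuple_list)  # each tuple becomes an undirected edge
result = list(nx.connected_components(G))
print(result)
# [{'e', 'c', 'd'}, {'b', 'a'}]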
