Non-wasteful way to store subgraphs in Python

python algorithm data-structures

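Suppose I have 6 elements: A, B, C, D, E, and F. A is connected to B (or B to A; there is no notion of direction), B is connected to C, D is connected to E, and F is not connected to anything. What I really want, though, is not the direct connections but to know which elements have a path connecting them.

One way I know to encode this in Python is a 6x6 adjacency matrix. But since the matrix is quite sparse, that wastes memory.

Another way I know is to use a dictionary. Here is what it looks like in Python: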

# Each element maps to every other element it can reach by some path.
graph = {
    'A': ['B', 'C'],
    'B': ['A', 'C'],
    'C': ['A', 'B'],
    'D': ['E'],
    'E': ['D'],
    'F': []
}

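However, this structure seems better suited to tracking direct connections than connected subgraphs. In particular, a lot of memory is wasted: for example, A: [B, C] and B: [A, C] encode exactly the same thing.

Can anyone recommend a data structure better suited to storing this information, and/or an algorithm better suited to building it than an adjacency matrix or the dictionary above?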

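There are Python libraries for this, such as networkx or igraph.

Networkx

I have used networkx for a long time, and it works well. Here I will show the difference between directed and undirected graphs: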

Directed:

import matplotlib.pyplot as plt
import networkx as nx
A, B, C, D, E, F = 'A', 'B', 'C', 'D', 'E', 'F'
dictionary = {A: [B, C], B: [A, C], C: [A, B], D: [E], E: [D], F: []}
G = nx.DiGraph(dictionary)
nx.draw(G, with_labels=True)
plt.show()

The connected components can be accessed like this:

>>> print([list(n) for n in nx.strongly_connected_components(G)])
[['B', 'A', 'C'], ['D', 'E'], ['F']]

Undirected:

import matplotlib.pyplot as plt
import networkx as nx
A, B, C, D, E, F = 'A', 'B', 'C', 'D', 'E', 'F'
dictionary = {A: [B], B: [C], C: [A], D: [E], F: []}
G = nx.Graph(dictionary)
nx.draw(G, with_labels=True)
plt.show()

The connected components can be accessed like this:

>>> print([list(n) for n in nx.connected_components(G)])
[['B', 'A', 'C'], ['E', 'D'], ['F']]
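For readers who would rather not depend on a library, the undirected case can be approximated with a plain breadth-first search. This is only a minimal sketch: the function name `connected_components` and the variable `simplified` are illustrative choices here, not part of networkx, and it assumes every edge is listed at least once in the dictionary (in either direction).

```python
from collections import deque

def connected_components(adjacency):
    """Return the connected components of an undirected graph as a list of sets."""
    # Build a symmetric neighbor map so each edge only has to be listed once.
    neighbors = {}
    for node, linked in adjacency.items():
        neighbors.setdefault(node, set())
        for other in linked:
            neighbors[node].add(other)
            neighbors.setdefault(other, set()).add(node)

    seen = set()
    components = []
    for start in neighbors:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        component = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nxt in neighbors[current]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        components.append(component)
    return components

# The simplified dictionary from the undirected example: 'E' appears only as a neighbor.
simplified = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["E"], "F": []}
components = connected_components(simplified)
print(components)
```

The result is a list of three sets; the order of elements inside each printed set may vary between runs.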

You can use sets for this:

Set1: {A, B, C},
Set2: {D, E},
Set3: {F}

If you need to get quickly from an element to its set, use a dictionary.
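A small sketch of that idea, assuming the groups are already known: each element is stored once in its set, and a dictionary maps every element to the shared set object. The names `groups`, `group_of` and `connected` are made up for this illustration.

```python
# One set per connected group; each element is stored exactly once.
groups = [{"A", "B", "C"}, {"D", "E"}, {"F"}]

# Map every element to the set object it belongs to (the sets are shared, not copied).
group_of = {element: group for group in groups for element in group}

def connected(x, y):
    """Two elements have a path between them if they point at the same set object."""
    return group_of[x] is group_of[y]

members_of_b = sorted(group_of["B"])
print(members_of_b)         # ['A', 'B', 'C']
print(connected("A", "C"))  # True
print(connected("A", "D"))  # False
```

Because the dictionary values are references to the same three sets, the extra cost over the bare sets is one dictionary entry per element.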

You can use a union-find (disjoint-set) structure. Here is a possible implementation:

# Implementation of Union-Find (Disjoint Set)
class Node:
    def __init__(self, value):
        self.value = value
        self.values = [value]
        self.parent = self
        self.rank = 0

    def find(self):
        if self.parent.parent != self.parent:
            self.parent = self.parent.find()
        return self.parent

    def union(self, other):
        node = self.find()
        other = other.find()
        if node == other:
            return True # was already in same set
        if node.rank > other.rank:
            node, other = other, node
        node.parent = other
        other.rank = max(other.rank, node.rank + 1)
        other.values.extend(node.values)
        node.values = None # Discard
        return False # was not in same set, but now is

nodes = "ABCDEF"
edges = ["AB", "AC", "DE"]

# create Node instances
nodes = {node: Node(node) for node in nodes}
# process the edges
for a, b in edges:
    nodes[a].union(nodes[b])

# now do a query
print("group with 'B' {}".format(nodes["B"].find().values))

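You can represent the relations as sets of characters and compute the connected sets of characters with the Python solution to the Rosetta Code task "Set consolidation".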
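The general idea of set consolidation can be sketched as follows. This is not the Rosetta Code solution itself, just a minimal illustration written for this thread; the function name `consolidate` and the `relations` list are assumptions, and an isolated element such as F is represented as a one-character string.

```python
def consolidate(sets):
    """Merge sets that share at least one element until all remaining sets are disjoint."""
    result = []
    for current in map(set, sets):
        # Absorb every already-collected set that overlaps `current`;
        # keep the ones that do not overlap unchanged.
        disjoint = []
        for existing in result:
            if existing & current:
                current |= existing
            else:
                disjoint.append(existing)
        disjoint.append(current)
        result = disjoint
    return result

# Each relation is a string of connected characters; "F" stands alone.
relations = ["AB", "BC", "DE", "F"]
merged = consolidate(relations)
print(merged)
```

Since the sets kept in `result` are always pairwise disjoint, a set that did not overlap `current` before an absorption cannot overlap it afterwards, so a single pass per input set is enough.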

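If you are strictly looking for efficient storage of an undirected graph, you can use a string with two conventions. First, pairs of node names (edges) are always written in descending alphabetical (or alphanumeric) order. Second, you do not repeat the left side of a pair; you only list the related nodes after the larger node they correspond to.

This produces a string like this: "B,A|C,A,B|E,D|F"

You can switch back and forth between this string encoding and a fully expanded dictionary representation in memory (which is necessary for any meaningful operation):

From string to dictionary: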

sGraph = "B,A|C,A,B|E,D|F"
dGraph = { v:[] for v in sGraph.replace("|",",").split(",") }
dGraph.update({ v[0]:v[1:] for vG in sGraph.split("|") for v in[vG.split(",")]})
for v1,v2 in [(v1,v2) for v1,vN in dGraph.items() for v2 in vN]:dGraph[v2].append(v1)

Output:

print(dGraph) 
# {'B': ['A', 'C'], 'A': ['B', 'C'], 'C': ['A', 'B'], 'E': ['D'], 'D': ['E'], 'F': []}

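Note: depending on your processing needs, the last for loop (above) can be omitted. That gives you a complete representation of the graph with no redundant edges (but makes it harder to work with).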

From dictionary to string (based on the fully expanded representation):
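One way this direction could look, as a sketch rather than the answer's original code: the helper name `graph_to_string` is hypothetical, and it assumes node names are comparable strings that contain neither "," nor "|", and that the dictionary lists every edge in both directions.

```python
def graph_to_string(graph):
    """Encode a fully expanded adjacency dictionary using the two conventions above."""
    groups = []    # "larger,smaller1,smaller2" groups, one per node that has smaller neighbors
    isolated = []  # nodes with no neighbors at all
    for vertex in sorted(graph):
        # Keep only the neighbors that sort before this vertex, so each edge is written once.
        smaller = sorted(n for n in graph[vertex] if n < vertex)
        if smaller:
            groups.append(",".join([vertex] + smaller))
        elif not graph[vertex]:
            isolated.append(vertex)
    return "|".join(groups + isolated)

expanded = {
    "B": ["A", "C"], "A": ["B", "C"], "C": ["A", "B"],
    "E": ["D"], "D": ["E"], "F": [],
}
encoded = graph_to_string(expanded)
print(encoded)  # B,A|C,A,B|E,D|F
```

Nodes that only have larger neighbors (A and D here) never start a group of their own; they appear only in the groups of those larger neighbors.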

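The length of the string will always be less than 2*(E+V), where E is the number of edges and V is the number of vertices (assuming each vertex name is a single letter/character).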

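How about a ...?

I'm looking into it now. Thanks for the suggestion!

When you say there is no notion of direction, I assume you are dealing with an undirected graph. An undirected graph can be represented with an adjacency list, as you have done above. The matrix representation can be sparse, which is inefficient for sparse graphs. If you want better memory efficiency, a disjoint set may be a better choice, but you may lose edge information when you convert to a disjoint set.

This shows how to compute them, but the question is how to store them.

I use a graph dictionary that is at least twice as compact: dictionary = {A: [B], B: [C], C: [A], D: [E], F: []}.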