Python: generating an edge list from a DataFrame [python, pandas, graph-theory]

Suppose I have a pandas DataFrame like this:

    Fruit_1   Fruit_2  Fruit_3 
0   Apple     Orange   Peach 
1   Apple     Lemon    Lime
2   Starfruit Apple    Orange

In reproducible form:

import pandas as pd

df = pd.DataFrame([['Apple', 'Orange', 'Peach'],
                   ['Apple', 'Lemon', 'Lime'],
                   ['Starfruit', 'Apple', 'Orange']],
                  columns=['Fruit_1', 'Fruit_2', 'Fruit_3'])

I want to generate an edge list containing every pair of values within each row:

Apple, Orange
Apple, Peach
Orange, Peach
Apple, Lemon
Apple, Lime
Lemon, Lime
Starfruit, Apple
Starfruit, Orange
Apple, Orange

How can I do this in Python?

I'm not sure whether there is a pandas-specific way, but you can apply itertools.combinations to each row:

itertools.combinations(row, 2)

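This creates an iterator, which you can simply convert to a list of pairs.

Once you have collected these per-row lists into one list, a flattening list comprehension joins them: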

[pair for row in collected_rows for pair in row]
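Putting those steps together, a minimal sketch might look like the following. The name edges is chosen here for illustration, and iterating with df.itertuples(index=False, name=None) is just one assumed way of getting at the rows; any iterable of rows would work.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame([['Apple', 'Orange', 'Peach'],
                   ['Apple', 'Lemon', 'Lime'],
                   ['Starfruit', 'Apple', 'Orange']],
                  columns=['Fruit_1', 'Fruit_2', 'Fruit_3'])

# One list of pairs per row; combinations() keeps the left-to-right column order.
collected_rows = [list(combinations(row, 2))
                  for row in df.itertuples(index=False, name=None)]

# Flatten the per-row lists into a single edge list.
edges = [pair for row in collected_rows for pair in row]

for a, b in edges:
    print(f"{a}, {b}")
```

Repeated edges such as Apple, Orange are kept, as in the list requested in the question.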

Or use the numpy approach, which is usually faster:

data[:, np.c_[np.tril_indices(data.shape[1], -1)]]

If you want a flat list:

data[:, np.c_[np.triu_indices(data.shape[1], 1)]].reshape(-1,2)

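Note that triu_indices lists the vertices of each pair in order, while tril_indices lists them in reverse order. These functions are normally used to get the indices of the upper or lower triangle of a matrix.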
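To make the difference concrete, here is a small sketch of both variants. It assumes that data in the expressions above is the DataFrame's values as a 2-D array (obtained here with df.to_numpy()); the names upper, lower, edges and reversed_edges are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([['Apple', 'Orange', 'Peach'],
                   ['Apple', 'Lemon', 'Lime'],
                   ['Starfruit', 'Apple', 'Orange']],
                  columns=['Fruit_1', 'Fruit_2', 'Fruit_3'])

# Assumption: `data` is the DataFrame's contents as a 2-D array.
data = df.to_numpy()
n_cols = data.shape[1]

# Pairs of column indices above the diagonal: (0, 1), (0, 2), (1, 2).
upper = np.c_[np.triu_indices(n_cols, 1)]
# Pairs of column indices below the diagonal: (1, 0), (2, 0), (2, 1).
lower = np.c_[np.tril_indices(n_cols, -1)]

# Indexing gives shape (rows, pairs, 2); reshape flattens it to (rows * pairs, 2).
edges = data[:, upper].reshape(-1, 2)
reversed_edges = data[:, lower].reshape(-1, 2)

print(edges[0])           # first pair, in column order
print(reversed_edges[0])  # same two vertices, reversed
```

For an undirected graph the two orderings describe the same edges, so the choice only matters if the direction of each pair is meaningful.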

Here is a solution:

In [118]: from itertools import combinations

In [119]: df.apply(lambda x: list(combinations(x, 2)), 1).stack().reset_index(level=[0,1], drop=True).apply(', '.join)
Out[119]:
0        Apple, Orange
1         Apple, Peach
2        Orange, Peach
3         Apple, Lemon
4          Apple, Lime
5          Lemon, Lime
6     Starfruit, Apple
7    Starfruit, Orange
8        Apple, Orange
dtype: object
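The one-liner above relies on apply expanding each returned list into columns so that .stack() can be called. Depending on the pandas version, apply with axis=1 may instead return a Series of lists, in which case that chain would need adjusting. As a hedged alternative, the sketch below builds the same Series without apply; the name pairs is illustrative.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame([['Apple', 'Orange', 'Peach'],
                   ['Apple', 'Lemon', 'Lime'],
                   ['Starfruit', 'Apple', 'Orange']],
                  columns=['Fruit_1', 'Fruit_2', 'Fruit_3'])

# Walk the rows of the underlying array, form every 2-combination per row,
# and join each pair into a "A, B" string.
pairs = pd.Series([', '.join(pair)
                   for row in df.to_numpy()
                   for pair in combinations(row, 2)])

print(pairs)
```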

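Great solution!

@Katietrung, glad I could help. Please consider accepting the most helpful answer; that also indicates that your question has been answered.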