Python 3.x: Accessing elements in a sparse array

Tags: python-3.x, scipy, scikit-learn, sparse-matrix, minimum-spanning-tree

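I am trying to write a function for a Euclidean minimum spanning tree. The part I am stuck on is finding the k nearest neighbors. As you can see, I call a function that returns a sparse array containing the indices of, and distances to, the nearest neighbors, but I cannot access its elements the way I assumed I could:

for p1, p2, w in A:
    do things

This raises an error, because A only returns 1 item at a time (not 3). Is there a way to access each element of this data set so that I can form edges weighted by distance? I am very new to Python and still trying to learn the details of the language.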

from sklearn.neighbors import kneighbors_graph
from kruskalsalgorithm import *
import networkx as nx


def EMST(inlist):

    graph = nx.Graph()

    for a,b in inlist:
        graph.add_node((a,b))

    print("nodes = ", graph.nodes())

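    # kneighbors_graph returns a SciPy sparse matrix of distances to the nearest neighbor
    A = kneighbors_graph(list(graph.nodes()), 1, mode='distance', metric='euclidean', include_self=False, n_jobs=-1)
    A.toarray()  # builds a dense copy, but the result is discarded here
    print(A)     # prints the stored (row, col) distance entries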

This is how I test the function:

mylist = [[2,3],[4,2],[9,4],[3,1]]
EMST(mylist)

My output is:

nodes = [(2, 3), (4, 2), (9, 4), (3, 1)]
(0, 1)    2.2360679775
(1, 3)    1.41421356237
(2, 1)    5.38516480713
(3, 1)    1.41421356237

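You didn't really explain what exactly you want to do. There are many things one could imagine.

But in general, you should follow the scipy.sparse documentation. In your case, sklearn's function guarantees a SciPy sparse matrix as its return value (in CSR format, according to the scikit-learn documentation for kneighbors_graph).

One possible usage is: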

from scipy import sparse as sp
import numpy as np
np.random.seed(1)

mat = sp.random(4,4, density=0.4)
print(mat)

I, J, V = sp.find(mat)
print(I)
print(J)
print(V)

Output:

(3, 0)        0.846310916686
(1, 3)        0.313273516932
(3, 1)        0.524548159573
(2, 0)        0.44345289378
(2, 1)        0.22957721373
(2, 2)        0.534413908947
[2 3 2 3 2 1]
[0 0 1 1 2 3]
[ 0.44345289  0.84631092  0.22957721  0.52454816  0.53441391  0.31327352]

And of course you can then loop over the three arrays together:

for a, b, w in zip(I, J, V):
    print(a, b, w)

which prints:

2 0 0.44345289378
3 0 0.846310916686
2 1 0.22957721373
3 1 0.524548159573
2 2 0.534413908947
1 3 0.313273516932
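
Applied to the question's data, the same idea might look like the sketch below. Iterating directly over a SciPy sparse matrix yields one row at a time, which is likely why the three-way unpacking in the question fails. Here the matrix is rebuilt by hand from the values printed in the question rather than by calling kneighbors_graph, and `weighted_edges` is a hypothetical helper name, not part of SciPy or scikit-learn.

```python
import numpy as np
from scipy import sparse

# Rebuild the 4x4 nearest-neighbor matrix from the question's printed output
# (assumption: same entries that kneighbors_graph produced there).
row = np.array([0, 1, 2, 3])
col = np.array([1, 3, 1, 1])
data = np.sqrt(np.array([5.0, 2.0, 29.0, 2.0]))
A = sparse.csr_matrix((data, (row, col)), shape=(4, 4))


def weighted_edges(mat):
    """Hypothetical helper: list (i, j, weight) for every stored entry."""
    coo = mat.tocoo()  # COO format exposes parallel row/col/data arrays
    return [(int(i), int(j), float(w))
            for i, j, w in zip(coo.row, coo.col, coo.data)]


edges = weighted_edges(A)
for p1, p2, w in edges:
    print(p1, p2, w)
```

Each tuple can then be passed to whatever consumes weighted edges, for example a Kruskal implementation or `graph.add_edge(p1, p2, weight=w)` in networkx. The indices refer to positions in the input point list, not to the coordinate tuples themselves.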

I can recreate your display with:

In [65]: from scipy import sparse
In [66]: import numpy as np
In [72]: row = np.array([0,1,2,3])
In [73]: col = np.array([1,3,1,1])
In [74]: data = np.array([5,2,29,2])**.5
In [75]: M = sparse.csr_matrix((data, (row, col)), shape=(4,4))
In [76]: M
Out[76]: 
<4x4 sparse matrix of type '<class 'numpy.float64'>'
    with 4 stored elements in Compressed Sparse Row format>
In [77]: print(M)
  (0, 1)    2.23606797749979
  (1, 3)    1.4142135623730951
  (2, 1)    5.385164807134504
  (3, 1)    1.4142135623730951
In [78]: M.toarray()   # M.A is an older shorthand for the same call
Out[78]: 
array([[0.        , 2.23606798, 0.        , 0.        ],
       [0.        , 0.        , 0.        , 1.41421356],
       [0.        , 5.38516481, 0.        , 0.        ],
       [0.        , 1.41421356, 0.        , 0.        ]])
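
A CSR matrix like the one recreated above also exposes its storage arrays (`indptr`, `indices`, `data`) directly, so the neighbors of each row can be read without converting to a dense array. The following is a minimal sketch under that assumption; the `neighbors` dictionary is only an illustrative container, not something the original answer uses.

```python
import numpy as np
from scipy import sparse

# Same matrix as in the session above
row = np.array([0, 1, 2, 3])
col = np.array([1, 3, 1, 1])
data = np.array([5, 2, 29, 2]) ** 0.5
M = sparse.csr_matrix((data, (row, col)), shape=(4, 4))

# In CSR, the entries of row i live in the slice indptr[i]:indptr[i + 1]
neighbors = {}
for i in range(M.shape[0]):
    start, end = M.indptr[i], M.indptr[i + 1]
    neighbors[i] = [(int(j), float(d))
                    for j, d in zip(M.indices[start:end], M.data[start:end])]

print(neighbors)
```

This per-row view is convenient when each point's neighbor list is processed separately; for a flat list of edges, `sparse.find` or a COO conversion is usually simpler.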

Check that the points and the matrix match:

In [95]: pts = np.array([(2, 3), (4, 2), (9, 4), (3, 1)])
In [96]: pts
Out[96]: 
array([[2, 3],
       [4, 2],
       [9, 4],
       [3, 1]])
In [97]: for r,c,d in zip(*sparse.find(M)):
    ...:     print(((pts[r]-pts[c])**2).sum()**.5)
    ...:     
2.23606797749979
5.385164807134504
1.4142135623730951
1.4142135623730951

Or get all the nearest distances at once:

In [107]: np.sqrt(((pts[row,:]-pts[col,:])**2).sum(1))
Out[107]: array([2.23606798, 1.41421356, 5.38516481, 1.41421356])
In [110]: np.linalg.norm(pts[row,:]-pts[col,:],axis=1)
Out[110]: array([2.23606798, 1.41421356, 5.38516481, 1.41421356])

"Brute force" minimum distance calculation:

All pairwise distances:

In [112]: dist = np.linalg.norm(pts[None,:,:]-pts[:,None,:],axis=2)
In [113]: dist
Out[113]: 
array([[0.        , 2.23606798, 7.07106781, 2.23606798],
       [2.23606798, 0.        , 5.38516481, 1.41421356],
       [7.07106781, 5.38516481, 0.        , 6.70820393],
       [2.23606798, 1.41421356, 6.70820393, 0.        ]])

(Compare this with the Out[78] output.)

"Blank out" the diagonal:

In [114]: D = dist + np.eye(4)*100

Minimum distance and coordinates (by row):
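
A minimal sketch of how the row-wise minimum could be taken, assuming the same `pts` as above. It masks the diagonal with `np.inf` instead of adding 100, which is a variation on the approach shown and not necessarily what the original answer does.

```python
import numpy as np

pts = np.array([(2, 3), (4, 2), (9, 4), (3, 1)])

# All pairwise Euclidean distances via broadcasting
dist = np.linalg.norm(pts[None, :, :] - pts[:, None, :], axis=2)

# Mask the diagonal so a point is never chosen as its own nearest neighbor
D = dist.copy()
np.fill_diagonal(D, np.inf)

nearest = D.argmin(axis=1)                       # column of each row's minimum
nearest_dist = D[np.arange(len(pts)), nearest]   # the matching distances

print(nearest)
print(nearest_dist)
```

For these four points, the resulting indices agree with the `col` array used to recreate the sparse matrix. This brute-force approach builds the full n-by-n distance matrix, so it is only practical for small point sets.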