Python: creating a sparse matrix from a DataFrame of edges

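Suppose I have a csv file containing data in the following format:

A B
C D
A C
D F
G H
K M
M A

where each row gives an undirected edge between node 1 and node 2. I am currently reading this in as a DataFrame, but would like to convert it into a sparse matrix. Is there a quick and easy way to do this without looping?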

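To construct a scipy sparse matrix directly, you have to map the letters onto unique integer indices, e.g. A==0, B==1, and so on (zero-based, as in the session below).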

In [202]: txt='''A B
     ...: 
     ...: C D
     ...: 
     ...: A C
     ...: 
     ...: D F
     ...: 
     ...: G H
     ...: 
     ...: K M
     ...: 
     ...: M A'''.splitlines()
In [203]: values = 'ABCDEFGHIJKLM'
In [204]: data = [x.split() for x in txt if x]
In [205]: data = [[values.index(x) for x in row] for row in data]
In [206]: data
Out[206]: [[0, 1], [2, 3], [0, 2], [3, 5], [6, 7], [10, 12], [12, 0]]
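The session above hard-codes the alphabet in values. When the labels are not known in advance, the mapping can instead be derived from the data. The following is a minimal sketch, assuming the edges sit in a DataFrame whose two columns are named src and dst (hypothetical names; the question does not give any). Note that it produces compact indices covering only the labels that actually occur, so F maps to 4 here rather than 5, and the resulting matrix would be 9x9 instead of 13x13.

```python
import numpy as np
import pandas as pd

# Hypothetical column names; the question does not name them.
df = pd.DataFrame(
    [["A", "B"], ["C", "D"], ["A", "C"], ["D", "F"],
     ["G", "H"], ["K", "M"], ["M", "A"]],
    columns=["src", "dst"],
)

# np.unique returns the sorted distinct labels and, with return_inverse=True,
# the position of every input entry within that sorted array.
flat = df[["src", "dst"]].to_numpy().ravel()
labels, codes = np.unique(flat, return_inverse=True)

# One [row, col] index pair per edge, in the original edge order.
pairs = codes.reshape(-1, 2)

print(list(labels))
print(pairs.tolist())
```

labels[i] recovers the node name for index i, so results computed on the matrix can be translated back to the original labels. The rest of the answer continues with the data list built from the fixed alphabet.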
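Now we have coordinate pairs. There are several ways to construct a sparse matrix from them. Conceptually, the simplest is probably to iterate, using a lil format matrix (the best format for iterative construction):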

In [207]: from scipy import sparse
In [208]: M = sparse.lil_matrix((len(values),len(values)),dtype=int)
In [209]: for row in data:
     ...:     M[tuple(row)] = 1
     ...:     
In [210]: M
Out[210]: 
<13x13 sparse matrix of type '<class 'numpy.int64'>'
    with 7 stored elements in LInked List format>
In [211]: M.toarray()
Out[211]: 
array([[0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
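The question asks for a way that avoids looping, and it describes the edges as undirected, whereas the matrix above stores each edge in one direction only. One possible non-looping variant, offered as a sketch rather than as part of the original answer, passes the index pairs to sparse.coo_matrix in its (data, (row, col)) form and then mirrors the entries across the diagonal. The values and data names are taken from the session above; n, rows, cols and sym are introduced here for illustration.

```python
import numpy as np
from scipy import sparse

values = 'ABCDEFGHIJKLM'
data = [[0, 1], [2, 3], [0, 2], [3, 5], [6, 7], [10, 12], [12, 0]]

# Split the list of pairs into a row-index array and a column-index array.
rows, cols = np.array(data).T
n = len(values)

# coo_matrix accepts (data, (row, col)), so no Python-level loop is needed.
M = sparse.coo_matrix((np.ones(len(rows), dtype=int), (rows, cols)), shape=(n, n))

# The edges are undirected, so add the transpose to mirror every entry.
# An edge listed in both directions (or a self-loop) would sum to 2,
# so reset all stored values to 1 afterwards.
sym = (M + M.T).tocsr()
sym.data[:] = 1

print(sym.toarray())
```

The coo format suits one-shot construction from coordinate arrays; converting to csr afterwards gives a format better suited to arithmetic and row slicing.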