Python: zeroing out several columns of a csr_matrix

Suppose I have a sparse matrix:

>>> indptr = np.array([0, 2, 3, 6])
>>> indices = np.array([0, 2, 2, 0, 1, 2])
>>> data = np.array([1, 2, 3, 4, 5, 6])
>>> csr_matrix((data, indices, indptr), shape=(3, 3)).toarray()
array([[1, 0, 2],
       [0, 0, 3],
       [4, 5, 6]])
I want to zero out columns 0 and 2. This is the result I am after:

array([[0, 0, 0],
       [0, 0, 0],
       [0, 5, 0]])
Here is what I tried:

sp_mat = csr_matrix((data, indices, indptr), shape=(3, 3))
zero_cols = np.array([0, 2])
sp_mat[:, zero_cols] = 0
However, I get a warning:

SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
Because the sp_mat I am working with is large, converting it to a lil_matrix is very slow.

What is an efficient way to do this?

In [87]: indptr = np.array([0, 2, 3, 6])
    ...: indices = np.array([0, 2, 2, 0, 1, 2])
    ...: data = np.array([1, 2, 3, 4, 5, 6])
    ...: M = sparse.csr_matrix((data, indices, indptr), shape=(3, 3))
In [88]: M
Out[88]: 
<3x3 sparse matrix of type '<class 'numpy.int64'>'
    with 6 stored elements in Compressed Sparse Row format>
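The assignment from the question (M[:, zero_cols] = 0) not only gives the warning, it actually increases the number of stored ("sparse") elements, even though most of them now hold the value 0. Those explicit zeros are removed only when we clean up: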

In [93]: M.eliminate_zeros()
In [94]: M
Out[94]: 
<3x3 sparse matrix of type '<class 'numpy.int64'>'
    with 1 stored elements in Compressed Sparse Row format>
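
To make the difference between stored elements and truly nonzero elements concrete, here is a small sketch that repeats the assignment from the question with the warning silenced and then compares `nnz` (stored values, explicit zeros included) with `count_nonzero()`. The variable names `stored_before`, `nonzero_before` and `stored_after` are only illustrative, and the exact stored count after the assignment may depend on the SciPy version, so it is printed rather than assumed.

```python
import warnings

import numpy as np
from scipy import sparse
from scipy.sparse import SparseEfficiencyWarning

indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
# copy=True so the assignment below cannot touch the input arrays
M = sparse.csr_matrix((data, indices, indptr), shape=(3, 3), copy=True)

with warnings.catch_warnings():
    # Silence the efficiency warning for this demonstration only
    warnings.simplefilter("ignore", SparseEfficiencyWarning)
    M[:, [0, 2]] = 0

stored_before = M.nnz                # stored values, explicit zeros included
nonzero_before = M.count_nonzero()   # values that are actually nonzero

M.eliminate_zeros()                  # drop the explicit zeros from the structure
stored_after = M.nnz

print(stored_before, nonzero_before, stored_after)
```

After `eliminate_zeros()` the two counts agree, which is the cleanup step shown in the session above.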

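If the multiplication is done in place, there is no efficiency warning. It only changes the values of entries that are already stored, so it does not change the sparsity structure of the matrix (at least not until the zeros are eliminated):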

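In [111]: M = sparse.csr_matrix((data, indices, indptr), shape=(3, 3))
In [112]: M[:,[0,2]] *= 0
In [113]: M
Out[113]: 
<3x3 sparse matrix of type '<class 'numpy.int64'>'
    with 6 stored elements in Compressed Sparse Row format>
In [114]: M.eliminate_zeros()
In [115]: M
Out[115]: 
<3x3 sparse matrix of type '<class 'numpy.int64'>'
    with 1 stored elements in Compressed Sparse Row format>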
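
The same idea of touching only values that are already stored can also be sketched by editing the CSR arrays directly. This is not part of the answer above; it is an alternative under the assumption that the matrix is in CSR format, where `M.indices` holds the column index of each entry in `M.data`. The helper name `zero_columns_inplace` is hypothetical.

```python
import numpy as np
from scipy import sparse


def zero_columns_inplace(M, cols):
    """Zero the given columns of a CSR matrix by editing its stored values."""
    # In CSR, M.indices[k] is the column of the stored value M.data[k]
    mask = np.isin(M.indices, cols)
    M.data[mask] = 0
    # Remove the explicit zeros so the structure shrinks as well
    M.eliminate_zeros()
    return M


indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
# copy=True keeps the original `data` array untouched by the in-place edit
M = sparse.csr_matrix((data, indices, indptr), shape=(3, 3), copy=True)

zero_columns_inplace(M, np.array([0, 2]))
print(M.toarray())
```

Because no new entries are inserted, this should not trigger the SparseEfficiencyWarning; the cost is one pass over the stored values plus the cleanup.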

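Another option is to multiply by a diagonal matrix that has zeros at the positions of the columns to clear. Starting again from the original matrix:

In [103]: M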
Out[103]: 
<3x3 sparse matrix of type '<class 'numpy.int64'>'
    with 6 stored elements in Compressed Sparse Row format>
In [104]: D = sparse.diags([0,1,0], dtype=M.dtype)
In [105]: D
Out[105]: 
<3x3 sparse matrix of type '<class 'numpy.int64'>'
    with 3 stored elements (1 diagonals) in DIAgonal format>
In [106]: D.A
Out[106]: 
array([[0, 0, 0],
       [0, 1, 0],
       [0, 0, 0]])
In [107]: M1 = M*D
In [108]: M1
Out[108]: 
<3x3 sparse matrix of type '<class 'numpy.int64'>'
    with 1 stored elements in Compressed Sparse Row format>
In [110]: M1.A
Out[110]: 
array([[0, 0, 0],
       [0, 0, 0],
       [0, 5, 0]], dtype=int64)
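
The diagonal-matrix approach can be generalized to an arbitrary list of columns. The following is a minimal sketch, assuming a 2-D CSR input; the helper name `zero_columns_by_diag` is hypothetical, and `@` is used for the matrix product, which for a csr_matrix is the same operation as the `M*D` in the session above.

```python
import numpy as np
from scipy import sparse


def zero_columns_by_diag(M, cols):
    """Return a new CSR matrix equal to M with the given columns zeroed."""
    # Diagonal of ones, with zeros where a column should be cleared
    keep = np.ones(M.shape[1], dtype=M.dtype)
    keep[cols] = 0
    D = sparse.diags(keep, dtype=M.dtype)
    # Right-multiplying by D scales column j of M by keep[j]
    return (M @ D).tocsr()


indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
M = sparse.csr_matrix((data, indices, indptr), shape=(3, 3))

M1 = zero_columns_by_diag(M, [0, 2])
print(M1.toarray())
```

Unlike the in-place variants, this builds a new matrix and leaves `M` unchanged, which costs extra memory for a large matrix.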