Fast sparse matrix multiplication in Python without allocating a dense array
I have an m x m sparse matrix similarities and a vector with m elements, combined_scales. I want to multiply column i of similarities by combined_scales[i]. Here was my first attempt:
for i in range(m):
    scale = combined_scales[i]
    similarities[:, i] *= scale
This is semantically correct, but it performs poorly, so I tried changing it to:
# sparse.diags creates a diagonal matrix.
# docs: https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.diags.html
similarities *= sparse.diags(combined_scales)
But when I run this line, I immediately get a MemoryError. Oddly, scipy seems to be trying to allocate a dense numpy array here:
Traceback (most recent call last):
File "main.py", line 108, in <module>
loop.run_until_complete(main())
File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\asyncio\base_events.py", line 466, in run_until_complete
return future.result()
File "main.py", line 100, in main
magic.fit(df)
File "C:\cygwin64\home\james\code\py\relativity\ml.py", line 127, in fit
self._scale_similarities(X, net_similarities)
File "C:\cygwin64\home\james\code\py\relativity\ml.py", line 148, in _scale_similarities
similarities *= sparse.diags(combined_scales)
File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\site-packages\scipy\sparse\base.py", line 440, in __mul__
return self._mul_sparse_matrix(other)
File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\site-packages\scipy\sparse\compressed.py", line 503, in _mul_sparse_matrix
data = np.empty(nnz, dtype=upcast(self.dtype, other.dtype))
MemoryError
How can I prevent it from allocating a dense array here? Thanks.

From sparse.compressed:
class _cs_matrix:  # common for csr and csc
    def _mul_sparse_matrix(self, other):
        M, K1 = self.shape
        K2, N = other.shape

        major_axis = self._swap((M, N))[0]
        other = self.__class__(other)  # convert to this format

        idx_dtype = get_index_dtype((self.indptr, self.indices,
                                     other.indptr, other.indices),
                                    maxval=M*N)
        indptr = np.empty(major_axis + 1, dtype=idx_dtype)

        fn = getattr(_sparsetools, self.format + '_matmat_pass1')
        fn(M, N,
           np.asarray(self.indptr, dtype=idx_dtype),
           np.asarray(self.indices, dtype=idx_dtype),
           np.asarray(other.indptr, dtype=idx_dtype),
           np.asarray(other.indices, dtype=idx_dtype),
           indptr)

        nnz = indptr[-1]
        idx_dtype = get_index_dtype((self.indptr, self.indices,
                                     other.indptr, other.indices),
                                    maxval=nnz)
        indptr = np.asarray(indptr, dtype=idx_dtype)
        indices = np.empty(nnz, dtype=idx_dtype)
        data = np.empty(nnz, dtype=upcast(self.dtype, other.dtype))

        fn = getattr(_sparsetools, self.format + '_matmat_pass2')
        fn(M, N, np.asarray(self.indptr, dtype=idx_dtype),
           np.asarray(self.indices, dtype=idx_dtype),
           self.data,
           np.asarray(other.indptr, dtype=idx_dtype),
           np.asarray(other.indices, dtype=idx_dtype),
           other.data,
           indptr, indices, data)

        return self.__class__((data, indices, indptr), shape=(M, N))
similarities is a sparse csr matrix, and other, the diag matrix, has also been converted to csr in

other = self.__class__(other)

csr_matmat_pass1 (compiled code) runs over the indices of self and other, returning nnz, the number of nonzero terms in the output.

It then allocates the indptr, indices and data arrays that will hold the results of csr_matmat_pass2. These are used to create the returned matrix:

self.__class__((data, indices, indptr), shape=(M, N))
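A rough pure-Python sketch may make that first pass concrete (the real csr_matmat_pass1 is compiled C++; this only mirrors its counting logic, on small made-up inputs):

```python
import numpy as np
from scipy import sparse

def matmat_pass1_sketch(A, B):
    # Count the nonzeros in each row of A @ B, accumulating the counts
    # into an indptr array, as the compiled pass-1 routine does.
    M = A.shape[0]
    indptr = np.zeros(M + 1, dtype=np.int64)
    for i in range(M):
        cols = set()
        for jj in range(A.indptr[i], A.indptr[i + 1]):
            j = A.indices[jj]
            # every stored entry in row j of B contributes an output column
            cols.update(B.indices[B.indptr[j]:B.indptr[j + 1]])
        indptr[i + 1] = indptr[i] + len(cols)
    return indptr

A = sparse.random(8, 8, 0.3, format='csr', random_state=0)
B = sparse.random(8, 8, 0.3, format='csr', random_state=1)
print(matmat_pass1_sketch(A, B)[-1], (A @ B).nnz)  # nnz = indptr[-1]
```

The nnz that gets passed to np.empty is exactly this indptr[-1].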
The error occurs while creating the data array:

data = np.empty(nnz, dtype=upcast(self.dtype, other.dtype))

It is not trying to build a dense array; the returned result simply has too many nonzero values for your memory.
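For a rough sense of scale (a sketch with made-up sizes): when the right operand is diagonal, each of its rows has one stored entry, so the product's nnz equals the left matrix's nnz, and the data allocation alone needs nnz * itemsize bytes:

```python
import numpy as np
from scipy import sparse

A = sparse.random(10_000, 10_000, 0.001, format='csr', random_state=0)
D = sparse.diags(np.ones(10_000))

nnz_out = (A @ D).nnz  # equals A.nnz for a diagonal right operand
itemsize = np.dtype(np.result_type(A.dtype, D.dtype)).itemsize
print(nnz_out, "stored entries ->", nnz_out * itemsize, "bytes for data alone")
```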
What is m, and what is similarities.nnz? Do you have enough memory to do similarities.copy()?

When you use similarities *= ..., it first has to perform similarities * other; the result then replaces self. It does not attempt an in-place multiplication.
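That fallback can be checked with a tiny made-up example: with a sparse right operand, Python falls back from __imul__ to A = A * B, so a brand-new product matrix is allocated first.

```python
import numpy as np
from scipy import sparse

A = sparse.csr_matrix(np.array([[1., 0., 2.],
                                [0., 3., 0.]]))
B = sparse.diags([10., 20., 30.]).tocsr()

out_of_place = (A * B).toarray()  # the product *= computes behind the scenes
A_id = id(A)
A *= B                            # rebinds A to a freshly allocated result
print(id(A) != A_id)
print(np.array_equal(A.toarray(), out_of_place))
```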
Iterating by column

There have been quite a few questions about faster iteration by rows (or columns), for example to sort or to find the largest values per row. Working directly with the csr attributes can speed this up considerably; I think the same idea applies here.

For example:
In [275]: A = sparse.random(10,10,.2,'csc').astype(int)
In [276]: A.data[:] = np.arange(1,21)
In [277]: A.A
Out[277]:
array([[ 0, 0, 4, 0, 0, 0, 0, 0, 0, 0],
[ 0, 3, 0, 0, 0, 0, 0, 0, 0, 0],
[ 1, 0, 0, 0, 0, 10, 0, 0, 16, 18],
[ 0, 0, 0, 0, 0, 11, 14, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 8, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 9, 12, 0, 0, 17, 0],
[ 2, 0, 0, 0, 0, 13, 0, 0, 0, 0],
[ 0, 0, 5, 7, 0, 0, 0, 15, 0, 19],
[ 0, 0, 6, 0, 0, 0, 0, 0, 0, 20]])
In [280]: B = sparse.diags(np.arange(1,11),dtype=int)
In [281]: B
Out[281]:
<10x10 sparse matrix of type '<class 'numpy.int64'>'
with 10 stored elements (1 diagonals) in DIAgonal format>
In [282]: (A*B).A
Out[282]:
array([[ 0, 0, 12, 0, 0, 0, 0, 0, 0, 0],
[ 0, 6, 0, 0, 0, 0, 0, 0, 0, 0],
[ 1, 0, 0, 0, 0, 60, 0, 0, 144, 180],
[ 0, 0, 0, 0, 0, 66, 98, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 40, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 45, 72, 0, 0, 153, 0],
[ 2, 0, 0, 0, 0, 78, 0, 0, 0, 0],
[ 0, 0, 15, 28, 0, 0, 0, 120, 0, 190],
[ 0, 0, 18, 0, 0, 0, 0, 0, 0, 200]], dtype=int64)
Time comparison:
In [287]: %%timeit A1=A.copy()
...: A1 *= B
...:
375 µs ± 1.29 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [288]: %%timeit A1 = A.copy()
...: for i,j,v in zip(A1.indptr[:-1],A1.indptr[1:],np.arange(1,11)):
...: A1.data[i:j] *= v
...:
79.9 µs ± 1.47 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
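The slice loop can also be fully vectorized (a sketch: the scale_columns_inplace helper is made up here, and assumes a CSC matrix so that each column's stored values are contiguous in .data):

```python
import numpy as np
from scipy import sparse

def scale_columns_inplace(A, scales):
    # CSC stores column i's values in A.data[A.indptr[i]:A.indptr[i+1]],
    # so np.repeat expands one factor per column into one factor per
    # stored entry, and the whole update is a single in-place multiply.
    counts = np.diff(A.indptr)           # stored entries per column
    A.data *= np.repeat(scales, counts)

A = sparse.random(10, 10, 0.2, format='csc', random_state=0)
expected = (A @ sparse.diags(np.arange(1, 11))).toarray()
scale_columns_inplace(A, np.arange(1, 11))
print(np.allclose(A.toarray(), expected))
```

No intermediate matrix is built, which is what the *= version was allocating.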
Are you sure it's allocating a dense array? Run similarities.count_nonzero() and tell us what it returns. Actually, the last lines of the traceback indicate it is trying to allocate a sparse result; nnz stands for "number of nonzeros". What format is similarities in?