Python SPAMS: wrong output from spams.lassoWeighted?


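Good evening, everyone. I can't make sense of the output of the spams.lassoWeighted function. If you run the example on their page, you get a 64x1 matrix as output that contains only one non-zero element. The same happens in every case: each signal gets exactly one non-zero element. I don't understand why the solution of $\|x - D\alpha\|_2^2 + \lambda \|\operatorname{diag}(w)\alpha\|_1$ would have only one non-zero element.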

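The output matrix alpha must have 10000 columns, because X is 64x10000 and the dictionary D is 64x256 (since Dα approximates X). So alpha should be 256x10000. This is consistent with how the Inria SPAMS documentation describes the output.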

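The parameter lambda1 controls the number of non-zeros, because it multiplies the l1 regularizer: the larger lambda1 is, the sparser the solution. Their implementation also has a parameter L, which is the maximum number of non-zeros in each sparse vector.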
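To see how lambda1 and the weights drive sparsity, here is a minimal sketch in plain Python. It is not the SPAMS solver. It assumes the special case of an identity dictionary (D = I) and a factor of 0.5 on the quadratic term, in which case each coefficient has a closed-form solution by soft-thresholding. The function names `weighted_soft_threshold` and `count_nonzeros` are hypothetical, chosen for illustration only.

```python
import math


def weighted_soft_threshold(x, w, lam):
    """Minimise 0.5*(x_i - a_i)**2 + lam*w_i*|a_i| independently for each i.

    Assumption: identity dictionary, so the coordinates decouple.
    """
    out = []
    for xi, wi in zip(x, w):
        # Shrink the magnitude by lam*w_i and clip at zero.
        magnitude = max(abs(xi) - lam * wi, 0.0)
        out.append(0.0 if magnitude == 0.0 else math.copysign(magnitude, xi))
    return out


def count_nonzeros(a):
    return sum(1 for v in a if v != 0.0)


x = [0.9, -0.5, 0.3, 0.05]   # one toy signal
w = [1.0, 1.0, 2.0, 1.0]     # per-coefficient weights
for lam in (0.1, 0.4, 0.6):
    a = weighted_soft_threshold(x, w, lam)
    print(lam, count_nonzeros(a), a)
```

In this toy case the number of non-zeros drops from 3 to 2 to 1 as lam grows, and the coefficient with the larger weight is the first of the large entries to be zeroed. With a general dictionary there is no such closed form, but the qualitative effect of lambda1 is the same.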

So, if I run the following:

import time

import numpy as np
import spams

np.random.seed(0)
print("test lasso weighted")
# 10000 signals of dimension 64, each column scaled to unit l2 norm
X = np.asfortranarray(np.random.normal(size=(64, 10000)))
X = np.asfortranarray(X / np.tile(np.sqrt((X * X).sum(axis=0)), (X.shape[0], 1)), dtype=float)
# Dictionary with 256 atoms (columns), also scaled to unit l2 norm
D = np.asfortranarray(np.random.normal(size=(64, 256)))
D = np.asfortranarray(D / np.tile(np.sqrt((D * D).sum(axis=0)), (D.shape[0], 1)), dtype=float)
# L caps the number of non-zeros in each column of alpha
param = {'L': 20, 'lambda1': 0.15, 'numThreads': 8, 'mode': spams.PENALTY}
# One weight per coefficient: W has the same shape as alpha (256 x 10000)
W = np.asfortranarray(np.random.random(size=(D.shape[1], X.shape[1])), dtype=float)
tic = time.time()
alpha = spams.lassoWeighted(X, D, W, **param)
tac = time.time()
t = tac - tic
# alpha is a sparse matrix; count the non-zeros in each of its columns
non_zero = []
for col in alpha.T:
    non_zero.append(col.nnz)
print('Shape Output Matrix:', alpha.shape)
print('Min non-zeros of %d columns: %d' % (alpha.shape[1], np.min(non_zero)))
print('Max non-zeros of %d columns: %d' % (alpha.shape[1], np.max(non_zero)))
print("%f signals processed per second\n" % (float(X.shape[1]) / t))
I get:

test lasso weighted
Shape Output Matrix: (256, 10000)
Min non-zeros of 10000 columns: 20
Max non-zeros of 10000 columns: 20
7691.130169 signals processed per second
So there are 10000 sparse approximations, each a 256x1 vector with 20 non-zeros.
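As a quick sanity check on these shapes, the sketch below builds a stand-in for alpha with scipy.sparse instead of calling spams, so it runs without the library installed. The sizes (1000 signals rather than 10000) and the random values are illustrative assumptions. It also shows `getnnz(axis=0)`, which counts the non-zeros of every column in one call and could replace the explicit loop over `alpha.T`.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
n_atoms, n_signals, nnz_per_col = 256, 1000, 20  # illustrative sizes

# Stand-in for alpha: every column gets exactly nnz_per_col non-zeros
# at distinct, randomly chosen rows.
rows = np.concatenate([
    rng.choice(n_atoms, size=nnz_per_col, replace=False)
    for _ in range(n_signals)
])
cols = np.repeat(np.arange(n_signals), nnz_per_col)
vals = rng.normal(size=rows.size)
alpha = sparse.csc_matrix((vals, (rows, cols)), shape=(n_atoms, n_signals))

# A 64 x 256 dictionary times a 256 x n alpha gives a 64 x n reconstruction,
# which is why alpha needs one 256-dimensional column per signal.
D = rng.normal(size=(64, n_atoms))
X_hat = D @ alpha.toarray()

nnz_per_column = alpha.getnnz(axis=0)
print('alpha shape:', alpha.shape)
print('reconstruction shape:', X_hat.shape)
print('min/max non-zeros per column:', nnz_per_column.min(), nnz_per_column.max())
```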

If we change L in param to 5 (at most 5 non-zeros), the output is:

test lasso weighted
Shape Output Matrix: (256, 10000)
Min non-zeros of 10000 columns: 5
Max non-zeros of 10000 columns: 5
26600.540090 signals processed per second
If you want denser sparse approximations (columns of alpha), you can make L larger or remove it altogether:

param = { 'lambda1' : 0.15, 'numThreads' : 8, 'mode' : spams.PENALTY}
Output:

test lasso weighted
Shape Output Matrix: (256, 10000)
Min non-zeros of 10000 columns: 40
Max non-zeros of 10000 columns: 61
1697.975321 signals processed per second