Out of memory in a least-squares problem with Python CVXPY
I have already posted this on the Google Group, but I am hoping for a faster reply here. I have a simple least-squares problem with a lot of observations. The vector has 200,000 observations, but I expected cvxpy to simply square each element and sum them.
import cvxpy as cp
import numpy as np
N = 200000
M = 100
X = np.random.normal(0., 1., size=(N, M))
cov = np.cov(X, rowvar=False)
y = 2 * X[:,0] + 0.5 * X[:,1] + np.random.normal(0., 1., size=N)
w = cp.Variable(M)
w.value = np.ones(M) / M
prob = cp.Problem(cp.Minimize(cp.sum_squares(y - X @ w) / N))
prob.solve()
The code above should not need to form an N x N matrix, but it does (see the end of the stack trace) and runs out of memory.
---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
<ipython-input-1-fb190a76c06f> in <module>()
12 w.value = np.ones(M) / M
13 prob = cp.Problem(cp.Minimize(cp.sum_squares(y - X @ w) / N))
---> 14 prob.solve()
/anaconda3/lib/python3.7/site-packages/cvxpy/problems/problem.py in solve(self, *args, **kwargs)
244 else:
245 solve_func = Problem._solve
--> 246 return solve_func(self, *args, **kwargs)
247
248 @classmethod
/anaconda3/lib/python3.7/site-packages/cvxpy/problems/problem.py in _solve(self, solver, ignore_dcp, warm_start, verbose, parallel, **kwargs)
356 raise e
357
--> 358 data, inverse_data = self._solving_chain.apply(self)
359 solution = self._solving_chain.solve_via_data(self, data, warm_start, verbose,
360 kwargs)
/anaconda3/lib/python3.7/site-packages/cvxpy/reductions/chain.py in apply(self, problem)
56 inverse_data = []
57 for r in self.reductions:
---> 58 problem, inv = r.apply(problem)
59 inverse_data.append(inv)
60 return problem, inverse_data
/anaconda3/lib/python3.7/site-packages/cvxpy/reductions/qp2quad_form/qp2symbolic_qp.py in apply(self, problem)
53 if not self.accepts(problem):
54 raise ValueError("Cannot reduce problem to symbolic QP")
---> 55 return Canonicalization(qp_canon_methods).apply(problem)
/anaconda3/lib/python3.7/site-packages/cvxpy/reductions/canonicalization.py in apply(self, problem)
39
40 canon_objective, canon_constraints = self.canonicalize_tree(
---> 41 problem.objective)
42
43 for constraint in problem.constraints:
/anaconda3/lib/python3.7/site-packages/cvxpy/reductions/canonicalization.py in canonicalize_tree(self, expr)
76 constrs = []
77 for arg in expr.args:
---> 78 canon_arg, c = self.canonicalize_tree(arg)
79 canon_args += [canon_arg]
80 constrs += c
/anaconda3/lib/python3.7/site-packages/cvxpy/reductions/canonicalization.py in canonicalize_tree(self, expr)
76 constrs = []
77 for arg in expr.args:
---> 78 canon_arg, c = self.canonicalize_tree(arg)
79 canon_args += [canon_arg]
80 constrs += c
/anaconda3/lib/python3.7/site-packages/cvxpy/reductions/canonicalization.py in canonicalize_tree(self, expr)
79 canon_args += [canon_arg]
80 constrs += c
---> 81 canon_expr, c = self.canonicalize_expr(expr, canon_args)
82 constrs += c
83 return canon_expr, constrs
/anaconda3/lib/python3.7/site-packages/cvxpy/reductions/canonicalization.py in canonicalize_expr(self, expr, args)
95 return Constant(expr.value), []
96 elif type(expr) in self.canon_methods:
---> 97 return self.canon_methods[type(expr)](expr, args)
98 else:
99 return expr.copy(args), []
/anaconda3/lib/python3.7/site-packages/cvxpy/reductions/qp2quad_form/atom_canonicalizers/quad_over_lin_canon.py in quad_over_lin_canon(expr, args)
27 y = args[1]
28 t = Variable(affine_expr.shape)
---> 29 return SymbolicQuadForm(t, eye(affine_expr.size)/y, expr), [affine_expr == t]
/anaconda3/lib/python3.7/site-packages/numpy/lib/twodim_base.py in eye(N, M, k, dtype, order)
199 if M is None:
200 M = N
--> 201 m = zeros((N, M), dtype=dtype, order=order)
202 if k >= M:
203 return m
MemoryError: Unable to allocate array with shape (200000, 200000) and data type float64
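Any idea what I am doing wrong?

Essentially, you are not doing anything wrong. Could you try the ECOS solver (prob.solve(solver=cp.ECOS))? CVXPY tries to solve this problem as a quadratic program, and it creates a large matrix in the process.

You could also try a commercial optimizer such as Mosek; see its benchmarks. (I work at Mosek.)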
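To put the traceback in numbers, and as one possible workaround that neither answer proposes, here is a minimal NumPy-only sketch. It first computes how much memory a dense 200000 x 200000 float64 identity matrix would need, then shrinks the data with a reduced QR factorization so that the residual has length M instead of N. The helper names dense_eye_bytes and compress_least_squares are hypothetical, and the demo uses a smaller N than the question so it runs quickly.

```python
import numpy as np


def dense_eye_bytes(n, itemsize=8):
    """Bytes needed for a dense n-by-n matrix of float64 (hypothetical helper)."""
    return n * n * itemsize


def compress_least_squares(X, y):
    """Return (R, z) such that ||y - X w||^2 = ||z - R w||^2 + const for all w.

    Uses the reduced QR factorization X = Q R, with Q of shape (N, M) and
    R of shape (M, M). The constant does not depend on w, so both problems
    have the same minimizer. (Hypothetical helper, not part of CVXPY.)
    """
    Q, R = np.linalg.qr(X, mode="reduced")
    return R, Q.T @ y


# Memory the failing np.eye call in the traceback asked for (N = 200000).
eye_bytes = dense_eye_bytes(200000)
print(f"dense identity: {eye_bytes / 2**30:.1f} GiB")

# Same data-generating process as the question, with a smaller N for speed.
rng = np.random.default_rng(0)
N, M = 20000, 100
X = rng.normal(0.0, 1.0, size=(N, M))
y = 2 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 1.0, size=N)

R, z = compress_least_squares(X, y)

# Solve the small M-by-M system and compare with the full least-squares fit.
w_small = np.linalg.solve(R, z)
w_full = np.linalg.lstsq(X, y, rcond=None)[0]
print("max difference:", np.max(np.abs(w_small - w_full)))
```

If CVXPY is still needed, for example because constraints on w will be added later, the compressed pair could presumably be used in place of the original data, as in cp.sum_squares(z - R @ w). Any identity matrix built during canonicalization would then be M x M rather than N x N. This is an assumption about how the reduction behaves and has not been verified against the CVXPY version shown in the traceback.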