How can I speed up vector equations in Matlab?
I am using the following code to perform logistic regression with stochastic gradient descent in Matlab. The training and test samples total about 600K. The code runs for hours. How can I speed it up?
%% load dataset
clc;
clear;
load('covtype.mat');
Data = [X y];
%% Split into testing and training data in a 1:9 split
nRows=size(Data,1);
randRows=randperm(nRows); % generate random ordering of row indices
Test=Data(randRows(1:58101),:); % index using random order
Train=Data(randRows(58102:end),:);
Testx=Test(:,1:54);
Testy=Test(:,55:end);
Trainx=Train(:,1:54);
Trainy=Train(:,55:end);
%% Perform stochastic gradient descent on training data
lambda=0.01; % regularisation constant
alpha=0.01; % step length constant
theta_old = zeros(54,1);
theta_new = theta_old;
z=1;
for count = 1:size(Train,1)
    theta_old = theta_new;
    theta_new = theta_old + (alpha*Trainy(count)* (1.0 ./ (1.0 + exp(Trainy(count)*(Trainx(count,:)*theta_old)))).*Trainx(count,:))' - alpha*lambda*2*theta_old;
    n = norm(theta_new);
    llr = lambda*n*n;
    count_dummy(z)=count; % dummy variable to store iteration number for plotting later
    % calculate log likelihood error for test data with current value of theta_new
    for i = 1:size(Test,1)
        llr = llr - 1.*log(1.0 + exp(-(Testy(i)*(Testx(i,:)*theta_new))));
    end
    llr_dummy(z)=llr; % dummy variable to store llr for plotting later
    z=z+1;
end
thetaopt = theta_new; % this is optimal theta
%% Plot results on testing data
plot(count_dummy, llr_dummy);
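I have to compute the log-likelihood error on the test data at every iteration in order to plot it. How can I speed up the code?

I would start by placing tic/toc around sections of the code and measuring the average time each step takes. Also, it is better to preallocate the count_dummy and llr_dummy arrays rather than letting them grow inside the loop, which causes a lot of memory reallocation.

@RafaelMonteiro, how should I preallocate these two arrays? I am a beginner in Matlab. Also, is there a way to compute the llr variable in one line (it currently needs an inner loop)? Maybe Matlab has some kind of "sum over all rows" function that I don't know about?

Preallocation is easy. Take count_dummy as an example. Each loop iteration adds one value to it, so Matlab has to reallocate the array in memory. Since z=z+1 and the loop runs from 1 to size(Train,1), you only need to put this before the loop: count_dummy=zeros(1,size(Train,1)). That preallocates the full array just once. As for the llr variable, you can try to vectorize the expression by replacing every i with 1:size(Test,1) and removing the for/end loop, then see whether it works.

Thanks! I'll give it a try. I still don't understand how to replace the i's the way you describe, but I'll try anyway.

I suggest you read the following: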
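
As an illustration that was not posted in the thread itself, here is a minimal sketch of how the three suggestions (timing with tic/toc, preallocation, and replacing the inner loop with a vectorized sum) might fit together. It runs on small synthetic data rather than covtype.mat; the sizes nTest, nTrain and nFeat and the random labels are assumptions made only so the script is self-contained. The update rule and the llr formula are copied from the question's code.

```matlab
% Synthetic stand-ins for the question's variables (assumed sizes, not the covtype data).
nTest = 2000; nTrain = 200; nFeat = 54;
Testx  = randn(nTest, nFeat);
Testy  = 2*(rand(nTest, 1) > 0.5) - 1;    % labels in {-1, +1}
Trainx = randn(nTrain, nFeat);
Trainy = 2*(rand(nTrain, 1) > 0.5) - 1;

lambda = 0.01;                            % regularisation constant
alpha  = 0.01;                            % step length constant
theta  = zeros(nFeat, 1);

% Preallocate once instead of growing the arrays inside the loop.
count_dummy = zeros(1, nTrain);
llr_dummy   = zeros(1, nTrain);

tic;                                      % start timing the whole loop
for count = 1:nTrain
    xi = Trainx(count, :);
    yi = Trainy(count);

    % Same stochastic gradient step as in the question's code.
    theta = theta + (alpha*yi * (1.0 ./ (1.0 + exp(yi*(xi*theta)))) .* xi)' ...
            - alpha*lambda*2*theta;

    % Vectorized replacement for the inner loop over test rows:
    % Testx*theta is nTest-by-1, .* multiplies element-wise by the labels,
    % and sum(...) adds the per-row terms.
    llr = lambda*(theta'*theta) - sum(log(1.0 + exp(-(Testy .* (Testx*theta)))));

    count_dummy(count) = count;
    llr_dummy(count)   = llr;
end
elapsed = toc;                            % seconds spent in the loop
fprintf('Loop with vectorized llr: %.3f s\n', elapsed);

% Sanity check: recompute the last llr with the original row-by-row loop.
llr_loop = lambda*norm(theta)^2;
for i = 1:nTest
    llr_loop = llr_loop - log(1.0 + exp(-(Testy(i)*(Testx(i,:)*theta))));
end
fprintf('Absolute difference vs. row-by-row loop: %g\n', abs(llr_loop - llr_dummy(end)));
```

The key change is that Testy .* (Testx*theta) evaluates all test rows in one matrix-vector product, so the "sum over all rows" the asker was looking for is simply sum(...). Wrapping individual sections in their own tic/toc pairs should show which step dominates on the real data.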