Performance: How can I speed up vector equations in Matlab?

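I am using the following code to perform logistic regression with stochastic gradient descent in Matlab. The total number of training + testing samples is about 600K. The code runs for hours. How can I speed it up?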

%% load dataset

clc;
clear;
load('covtype.mat');

Data = [X y];

%% Split into testing and training data in a 1:9 split

nRows=size(Data,1);
randRows=randperm(nRows);  % generate random ordering of row indices
Test=Data(randRows(1:58101),:);  % index using random order
Train=Data(randRows(58102:end),:);

Testx=Test(:,1:54);
Testy=Test(:,55:end);

Trainx=Train(:,1:54);
Trainy=Train(:,55:end);

%% Perform stochastic gradient descent on training data

lambda=0.01; % regularisation constant
alpha=0.01; % step length constant

theta_old = zeros(54,1);
theta_new = theta_old;
z=1;

for count = 1:size(Train,1)
    theta_old = theta_new;
    theta_new = theta_old + (alpha*Trainy(count)* (1.0 ./ (1.0 + exp(Trainy(count)*(Trainx(count,:)*theta_old)))).*Trainx(count,:))' - alpha*lambda*2*theta_old;
    n = norm(theta_new);
    llr = lambda*n*n;
    count_dummy(z)=count; % dummy variable to store iteration number for plotting later
    % calculate log likelihood error for test data with current value of theta_new
    for i = 1:size(Test,1)
        llr = llr - 1.*log(1.0 + exp(-(Testy(i)*(Testx(i,:)*theta_new))));
    end
    llr_dummy(z)=llr; % dummy variable to store llr for plotting later
    z=z+1;

end
thetaopt = theta_new; % this is optimal theta


%% Plot results on testing data

plot(count_dummy, llr_dummy);

I have to compute the log-likelihood error on the test data at every iteration in order to plot it. How can I speed up the code?
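
For reference, the loop in the question appears to implement the update and the plotted quantity written out below. This is a reading of the code, not something the question states: it assumes the labels are encoded as y in {-1, +1}, and the symbols x_t, y_t (the training row visited at step t), x_i, y_i (the test rows) and N_test (the number of test rows) are introduced here only for illustration.

```latex
\begin{align}
\theta_{t+1} &= \theta_t
  + \alpha \, \frac{y_t \, x_t}{1 + \exp\!\left(y_t \, x_t^{\top} \theta_t\right)}
  - 2 \alpha \lambda \, \theta_t \\
\mathrm{llr}(\theta_{t+1}) &= \lambda \, \lVert \theta_{t+1} \rVert^{2}
  - \sum_{i=1}^{N_{\mathrm{test}}}
    \log\!\left(1 + \exp\!\left(-y_i \, x_i^{\top} \theta_{t+1}\right)\right)
\end{align}
```

The sum in the second line runs over all 58101 test rows, each with 54 features, and it is recomputed after every single training step. The update in the first line touches only one row, so the test-set evaluation is most likely where nearly all of the running time goes.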

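I would start by placing tic/toc around sections of the code and measuring the average time each step takes. Also, it is better to preallocate the count_dummy and llr_dummy arrays rather than letting them grow inside the loop, which causes a lot of memory reallocation.

@RafaelMonteiro, how should I preallocate those two arrays? I am a beginner in Matlab. Also, is there a way to compute the llr variable in a single line (it currently needs an inner loop)? Maybe Matlab has some kind of "sum over all rows" function that I don't know about?

Preallocation is easy. Take count_dummy as an example. It gains one value on every pass through the loop, so Matlab has to reallocate the array in memory each time. Since z=z+1 and the loop runs from 1 to size(Train,1), you only need to put this before the loop: count_dummy = zeros(1,size(Train,1)). That allocates the full array just once. As for the llr variable, you can try vectorizing the expression by replacing every i with 1:size(Test,1) and removing the for/end loop, and see whether it works.

Thanks! I'll give it a try. I still don't understand how to replace the i's the way you describe, but I'll try anyway.

I suggest you read the following: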
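
The suggestions in the comments (timing with tic/toc, preallocating the two plotting arrays, and removing the inner loop) could be combined roughly as in the sketch below. It is a minimal illustration rather than a tested replacement for the original script: small synthetic data stands in for covtype.mat, the labels are assumed to be in {-1, +1}, and the names nTrain, nTest, d, margins and elapsed are made up for this example. Note that replacing i with a range also means the product with Testy has to become element-wise (.*), and the per-row terms have to be added up with sum.

```matlab
% Sketch only: synthetic data stands in for covtype.mat (an assumption),
% with labels in {-1,+1} as the update rule in the question implies.
rng(0);
nTrain = 2000;  nTest = 500;  d = 54;
Trainx = randn(nTrain, d);  Trainy = 2*(rand(nTrain, 1) > 0.5) - 1;
Testx  = randn(nTest, d);   Testy  = 2*(rand(nTest, 1) > 0.5) - 1;

lambda = 0.01;  % regularisation constant
alpha  = 0.01;  % step length constant
theta  = zeros(d, 1);

% Preallocate the plotting arrays once instead of growing them in the loop.
count_dummy = zeros(1, nTrain);
llr_dummy   = zeros(1, nTrain);

tic;  % time the whole training loop
for count = 1:nTrain
    x = Trainx(count, :);   % 1-by-d row of the current sample
    y = Trainy(count);

    % Same update as in the question, written for a single sample.
    theta = theta + alpha * y * x' / (1 + exp(y * (x * theta))) ...
                  - 2 * alpha * lambda * theta;

    % Vectorized replacement for the inner loop over the test rows:
    % one matrix-vector product, an element-wise product, then a sum.
    margins = Testy .* (Testx * theta);
    llr = lambda * (theta' * theta) - sum(log(1 + exp(-margins)));

    count_dummy(count) = count;
    llr_dummy(count)   = llr;
end
elapsed = toc;
fprintf('Training loop took %.3f s\n', elapsed);
```

With the arrays filled this way, plot(count_dummy, llr_dummy) works exactly as in the original script. The test-set evaluation is still performed once per training step, as the question requires, but it now goes through a single matrix-vector product instead of an interpreted loop over the test rows.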