Vectorization: friend or foe? bsxfun/arrayfun to avoid loops, repmat, permute, squeeze, etc.


This question is related to this one, and possibly to this one as well.

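Suppose there are two matrices A and B. A is M-by-N and B is N-by-K. I want an M-by-K matrix C such that
C(i,j) = 1 - prod(1 - A(i,:)' .* B(:,j))
. I tried several solutions in Matlab, and I compare their computational performance here.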
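
As a sanity check on the definition, here is a minimal sketch that evaluates the formula on a tiny case that can be verified by hand. The 2-by-2 values are made up for illustration and are not part of the benchmark.

```matlab
% Illustrative values only (assumed for this example, not from the benchmark)
A = [0.5 0.5; 0.2 0.8];   % M-by-N, here 2-by-2
B = [0.5 0.1; 0.5 0.9];   % N-by-K, here 2-by-2

C = zeros(size(A, 1), size(B, 2));
for i = 1:size(A, 1)
    for j = 1:size(B, 2)
        % Elementwise products along the shared dimension N, then 1 - product
        C(i, j) = 1 - prod(1 - A(i, :)' .* B(:, j));
    end
end

% By hand: C(1,1) = 1 - (1 - 0.5*0.5) * (1 - 0.5*0.5) = 0.4375
disp(C(1, 1))
```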

% Size of matrices:
M = 4e3;
N = 5e2;
K = 5e1;

GG = 50;    % GG instances
rntm1 = zeros(GG, 1);    % running time of first algorithm
rntm2 = zeros(GG, 1);    % running time of second algorithm
rntm3 = zeros(GG, 1);    % running time of third algorithm
rntm4 = zeros(GG, 1);    % running time of fourth algorithm
rntm5 = zeros(GG, 1);    % running time of fifth algorithm
for gg = 1:GG

    A = rand(M, N);    % M-by-N matrix of random numbers
    A = A ./ repmat(sum(A, 2), 1, N);    % M-by-N matrix of probabilities (?)
    B = rand(N, K);    % N-by-K matrix of random numbers
    B = B ./ repmat(sum(B), N, 1);    % N-by-K matrix of probabilities (?)

    %% First solution
    % One-liner solution:
    tic
    C = squeeze(1 - prod(1 - repmat(A, [1 1 K]) .* permute(repmat(B, [1 1 M]), [3 1 2]), 2));
    rntm1(gg) = toc;


    %% Second solution
    % Full vectorization, using meshgrid, arrayfun and reshape (from Luis Mendo, second link above)
    tic
    [ii jj] = meshgrid(1:size(A, 1), 1:size(B, 2));
    D = arrayfun(@(n) 1 - prod(1 - A(ii(n), :)' .* B(:, jj(n))), 1:numel(ii));
    D = reshape(D, size(B, 2), size(A, 1)).';
    rntm2(gg) = toc;
    clear ii jj

    %% Third solution
    % Partial vectorization 1
    tic
    E = zeros(M, K);
    for hh = 1:M
      tmp = repmat(A(hh, :)', 1, K);
      E(hh, :) = 1 - prod((1 - tmp .* B), 1);
    end
    rntm3(gg) = toc;
    clear tmp hh

    %% Fourth solution
    % Partial vectorization 2
    tic
    F = zeros(M, K);
    for hh = 1:M
      for ii = 1:K
        F(hh, ii) = 1 - prod(1 - A(hh, :)' .* B(:, ii));
      end
    end
    rntm4(gg) = toc;
    clear hh ii

    %% Fifth solution
    % No vectorization at all
    tic
    G = ones(M, K);
    for hh = 1:M
      for ii = 1:K
        for jj = 1:N
          G(hh, ii) = G(hh, ii) * prod(1 - A(hh, jj) .* B(jj, ii));
        end
        G(hh, ii) = 1 - G(hh, ii);
      end
    end
    rntm5(gg) = toc;
    clear hh ii jj C D E F G

end

prctile([rntm1 rntm2 rntm3 rntm4 rntm5], [2.5 25 50 75 97.5])
%    3.6519    3.5261    0.5912    1.9508    2.7576
%    5.3449    6.8688    1.1973    3.3744    3.9940
%    8.1094    8.7016    1.4116    4.9678    7.0312
%    8.8124   10.5170    1.9874    6.1656    8.8227
%    9.5881   12.0150    2.1529    6.6445    9.5115

mean([rntm1 rntm2 rntm3 rntm4 rntm5])
%    7.2420    8.3068    1.4522    4.5865    6.4423

std([rntm1 rntm2 rntm3 rntm4 rntm5])
%    2.1070    2.5868    0.5261    1.6122    2.4900
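The solutions are equivalent, but the partially vectorized algorithms are more efficient in both memory and execution time. Even the triple loop seems to perform better than arrayfun! Is there any approach better than the third, only partially vectorized, solution?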
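
One plausible reason for the memory gap, offered as a back-of-the-envelope sketch rather than a profiled result: the one-liner materializes M-by-N-by-K temporaries, while the row loop only ever holds an N-by-K temporary. The arithmetic below assumes 8-byte doubles and ignores any other overhead.

```matlab
% Sizes taken from the benchmark above
M = 4e3; N = 5e2; K = 5e1;
bytesPerDouble = 8;   % double precision

% Each M-by-N-by-K temporary in the one-liner (first solution)
fullTemp = M * N * K * bytesPerDouble;   % 8e8 bytes

% The N-by-K temporary used per iteration in the third solution
rowTemp = N * K * bytesPerDouble;        % 2e5 bytes

fprintf('Full M-by-N-by-K temporary: %.1f MB\n', fullTemp / 1e6);
fprintf('Per-row N-by-K temporary:   %.1f MB\n', rowTemp / 1e6);
```

With these sizes, each full temporary is about 800 MB, against roughly 0.2 MB per row in the loop.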

Edit: Dan's solutions are the best so far. Let rntm6, rntm7 and rntm8 be the running times of his first, second and third solutions. Then:

prctile(rntm6, [2.5 25 50 75 97.5])
%    0.6337    0.6377    0.6480    0.7110    1.2932
mean(rntm6)
%    0.7440
std(rntm6)
%    0.1970

prctile(rntm7, [2.5 25 50 75 97.5])
%    0.6898    0.7130    0.9050    1.1505    1.4041
mean(rntm7)
%    0.9313
std(rntm7)
%    0.2276

prctile(rntm8, [2.5 25 50 75 97.5])
%    0.5949    0.6005    0.6036    0.6370    1.3529
mean(rntm8)
%    0.6753
std(rntm8)
%    0.1890

Using bsxfun, you can get a small performance gain:

E = zeros(M, K);
for hh = 1:M
  E(hh, :) = 1 - prod((1 - bsxfun(@times, A(hh,:)', B)), 1);
end
You can squeeze (pun intended) out a little more performance with:

E = squeeze(1 - prod((1-bsxfun(@times, permute(B, [3 1 2]), A)),2));
Or you can try precomputing the transpose for my first suggestion:

E = zeros(M, K);
At = A';
for hh = 1:M
  E(hh, :) = 1 - prod((1 - bsxfun(@times, At(:,hh), B)), 1);
end
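
Before timing these variants, it may be worth confirming that they agree with the plain definition. The following is a minimal sketch on small random matrices; the sizes, the variable names Cref, E1, E2, E3 and the 1e-12 tolerance are choices made for this example. The last variant relies on implicit expansion, which assumes MATLAB R2016b or newer.

```matlab
% Small sizes so the check runs quickly; values are arbitrary
M = 6; N = 5; K = 4;
A = rand(M, N);
B = rand(N, K);

% Reference: double loop straight from the definition
Cref = zeros(M, K);
for i = 1:M
    for j = 1:K
        Cref(i, j) = 1 - prod(1 - A(i, :)' .* B(:, j));
    end
end

% Variant 1: loop over rows, bsxfun expands the column A(hh,:)' across B
E1 = zeros(M, K);
for hh = 1:M
    E1(hh, :) = 1 - prod(1 - bsxfun(@times, A(hh, :)', B), 1);
end

% Variant 2: no loop; B becomes 1-by-N-by-K and is expanded against M-by-N A
E2 = squeeze(1 - prod(1 - bsxfun(@times, permute(B, [3 1 2]), A), 2));

% Variant 3 (assumes R2016b+): same as variant 1 with implicit expansion
E3 = zeros(M, K);
for hh = 1:M
    E3(hh, :) = 1 - prod(1 - A(hh, :)' .* B, 1);
end

tol = 1e-12;
maxErr = max(abs([E1(:) - Cref(:); E2(:) - Cref(:); E3(:) - Cref(:)]));
fprintf('Largest deviation from the reference: %g\n', maxErr);
```

Note that squeeze turns a 1-by-1-by-K result into a K-by-1 column, so variant 2 would need extra care if M were 1.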

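One case where using arrayfun and bsxfun is definitely beneficial is when you have the Parallel Computing Toolbox and a compatible NVIDIA GPU. In that case both functions run very fast, because the function body can be sent to the GPU and executed there. See the example: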

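A note: I would be wary of calling arrayfun a fully vectorized approach; internally it is just a loop too. Loops in Matlab are actually quite efficient these days. Often arrayfun just adds overhead.

Not so small, in fact. See the edit above: the mean is halved and the standard deviation is more than halved!

Slightly slower than the first one, but still faster than the others (based on 50 instances).

Oh, in my tests it was faster. OK, I'll add one more small tweak that you can try.

Note that when the non-singleton dimensions are not reordered, a simple reshape is usually faster than shiftdim or permute. Otherwise, I think this is the right solution.

@randomatlabuser I would guess something like changing permute(B, [3 1 2]) to [x, y] = size(B); reshape(B, [1, x, y]).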
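
The reshape suggestion can be checked with a short sketch. Because permute(B, [3 1 2]) only adds a leading singleton dimension and leaves the order of the elements in memory unchanged, reshape should give an identical array. The test sizes and the names P, R, E_permute and E_reshape are assumptions made for this example, and no timing is claimed here.

```matlab
% Arbitrary small test matrices
M = 6; N = 5; K = 4;
A = rand(M, N);
B = rand(N, K);

[x, y] = size(B);
P = permute(B, [3 1 2]);     % 1-by-N-by-K via permute
R = reshape(B, [1, x, y]);   % same shape via reshape, element order untouched

sameArray = isequal(P, R);   % both hold B(i,j) at position (1,i,j)

% Drop-in use inside the one-liner
E_permute = squeeze(1 - prod(1 - bsxfun(@times, P, A), 2));
E_reshape = squeeze(1 - prod(1 - bsxfun(@times, R, A), 2));
sameResult = isequal(E_permute, E_reshape);

fprintf('Arrays identical: %d, results identical: %d\n', sameArray, sameResult);
```

Whether reshape is measurably faster at the benchmark sizes would have to be timed on the machine in question.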