Efficient histogram lookup in MATLAB

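I have a large 3D matrix (roughly 1000x1000x100) whose values each correspond to a bin in a normalized, high-resolution histogram. There is one histogram for each index along the third matrix dimension (so 100 histograms for the example dimensions).

What is the fastest way to look up the probability of the value at each 2D index (i.e., the value associated with the matching bin in the normalized histogram)?

My current code is very slow: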

probs = zeros(rows, cols, dims);
for k = 1 : dims
    tmp = data(:,:,k);
    [h, centers] = hist(tmp(:), 1000); % Flatten so hist() builds one histogram for the whole slice
    h = h / sum(h); % Normalize the histogram
    for r = 1 : rows
        for c = 1 : cols
            % Identify bin center closest to value
            [~, idx] = min(abs(centers - data(r, c, k)));
            probs(r,c,k) = h(idx);
        end
    end
end
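For loops are usually (though not always) less efficient than vectorized code, and nested for loops are usually even worse. How can I do this with fewer loops without running out of memory? I tried vectorizing the whole thing with some repmat calls, but the resulting 1000x1000x1000x100 matrix crashed my MATLAB session.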
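
For scale, here is a back-of-the-envelope estimate of why the fully expanded array cannot fit in memory. It assumes double precision (8 bytes per element) and the sizes quoted above; the variable names are only illustrative.

```matlab
% Sizes quoted in the question
rows = 1000; cols = 1000; dims = 100; nBins = 1000;
bytesPerDouble = 8;  % assumption: data stored as double

% Fully expanded 4-D array of differences (rows x cols x nBins x dims)
fullBytes = rows * cols * nBins * dims * bytesPerDouble;

% Expanding only one 2-D slice at a time (rows x cols x nBins)
sliceBytes = rows * cols * nBins * bytesPerDouble;

fprintf('Full expansion: %.0f GB, one slice: %.0f GB\n', ...
        fullBytes / 1e9, sliceBytes / 1e9);
```

Even a single expanded slice is large, which is why an approach that never materializes the differences is attractive.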

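Note: I only have MATLAB R2014a, so while solutions that use newer functions are welcome, I am limited to what that release provides.

Here is a small-scale demonstration that should run reproducibly: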

rng(2); % Seed the RNG for repeatability
rows = 3;
cols = 3;
dims = 2;
data = repmat(1:3,3,1,2);
probs = zeros(rows, cols, dims);
for k = 1 : dims
    tmp = normrnd(0,1,1000,1);
    [h, centers] = hist(tmp);
    h = h / sum(h); % Normalize the histogram
    for r = 1 : rows
        for c = 1 : cols
            % Identify bin center closest to value
            [~, idx] = min(abs(centers - data(r, c, k)));
            probs(r,c,k) = h(idx);
        end
    end
end
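When I run the code above, I get the following output (which makes sense, since the histogram is of a standard Gaussian distribution):


Note: I found an efficient solution, which I have posted as an answer below.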

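I assume you have a matrix centersAll (with dims rows) containing the histogram centers for each third-dimension index, and a similar matrix hAll (with dims rows) containing the histogram values.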

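Reshape centersAll so that it extends along the third and fourth dimensions, compute the differences with bsxfun, take the minimum along the fourth dimension, and use the resulting indices to index into hAll: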

[~, idx] = min(abs(bsxfun(@minus, data, reshape(centersAll,1,1,dims,[]))), [], 4);
hAllt = hAll.'; %'
probs2 = hAllt(bsxfun(@plus, idx, reshape(0:dims-1, 1,1,[])*size(hAll,2)));
Check:

%// Data
clear all
rng(2); % Seed the RNG for repeatability
rows = 3;
cols = 3;
dims = 2;
data = repmat(1:3,3,1,2);
for k = 1 : dims
    tmp = normrnd(0,1,1000,1);
    [h, centers] = hist(tmp);
    h = h / sum(h); % Normalize the histogram                   
    centersAll(k,:) = centers;
    hAll(k,:) = h;
end

%// With loops
probs = zeros(rows, cols, dims);
for k = 1 : dims
    for r = 1 : rows
        for c = 1 : cols
            % Identify bin center closest to value
            centers = centersAll(k,:);
            h = hAll(k,:);
            [~, idx] = min(abs(centers - data(r, c, k)));
            probs(r,c,k) = h(idx);
        end
    end
end

%// Without loops
[~, idx] = min(abs(bsxfun(@minus, data, reshape(centersAll,1,1,dims,[]))), [], 4);
hAllt = hAll.'; %'
probs2 = hAllt(bsxfun(@plus, idx, reshape(0:dims-1, 1,1,[])*size(hAll,2)));

%// Check
probs==probs2
which gives a logical array marking where the two results agree.

Best solution:
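
The final indexing step in the loop-free version can be hard to read, so here is a minimal sketch of it in isolation. The values of hAll and idx are hypothetical toy numbers chosen to make the arithmetic easy to follow; they are not from the example above.

```matlab
% Hypothetical toy values: 2 histograms with 3 bins each
hAll = [0.1 0.2 0.7;    % histogram for k = 1
        0.5 0.3 0.2];   % histogram for k = 2
dims  = size(hAll, 1);
nBins = size(hAll, 2);

% idx(r,c,k) is the bin chosen for data(r,c,k)
idx = cat(3, [1 3; 2 2], [3 1; 1 2]);

% After transposing, column k of hAllt holds histogram k, so bin b of
% histogram k sits at linear index b + (k-1)*nBins
hAllt   = hAll.';
offsets = reshape(0:dims-1, 1, 1, []) * nBins;
probs   = hAllt(bsxfun(@plus, idx, offsets));
```

Each page of probs now holds the probabilities for the corresponding page of idx, without any explicit loop.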

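Without relying on newer functions (i.e., for all releases before R2014b), the best approach is to leverage both the hist() function and the histc() function.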

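There are two cases to consider:

  • Binning data and then looking up that same data in the histogram
  • Looking up bins in a histogram that was formed from different data

The first case is the simpler one. A nice feature of histc() is that it returns both the histogram and, for every data point, the index of the bin it was placed into. At this point we should be done. Alas, we are not. Because the code behind histc() and hist() bins the data differently, we end up with two different histograms depending on which one we use. The reason seems to be that the two functions treat values on bin boundaries differently: histc() counts x in bin k when edges(k) <= x < edges(k+1) and puts values exactly equal to the last edge into an extra bin of their own, whereas hist() places the maximum in its last regular bin. Hence, the seemingly equivalent function calls: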

    % Using histc
    binEdges = linspace(min(data), max(data), numBins+1);
    [h1, indices] = histc(data, binEdges);
    
    % Using hist (its second output is the bin centers, not per-point indices)
    [h2, centers] = hist(data, numBins);
    
    result in histograms of different lengths:
    length(h1) - length(h2) = 1
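
A small toy case makes the extra bin visible. The data values below are hypothetical and were picked so that none of them falls on an interior bin boundary.

```matlab
% Hypothetical toy data spanning [0, 1]; the maximum equals the last edge
x = [0 0.2 0.6 0.9 1];
numBins = 2;
binEdges = linspace(min(x), max(x), numBins + 1);  % [0 0.5 1]

% histc returns one count per edge; the last entry counts x == binEdges(end)
[h1, bin] = histc(x, binEdges);

% hist returns exactly numBins counts; the maximum lands in the last bin
h2 = hist(x, numBins);

disp(h1);   % counts per edge from histc
disp(h2);   % counts per bin from hist
```

Here histc reports the value 1 in a third bin of its own, while hist folds it into the second bin.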

To deal with this, we can add the count in the last bin of h1 to the count in its second-to-last bin, drop the last bin, and adjust the indices accordingly:

    % Account for "strictly greater than" bug that results in an extra bin
    
    h1(:, numBins) = h1(:, numBins) + h1(:, end); % Combine last two bins
    indices(indices == numBins + 1) = numBins; % Adjust indices to point to right spot
    h1 = h1(:, 1:end-1); % Lop off the extra bin
    
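Now h1 and h2 match, and the indices vector records which bin of h1 each data point went into. You can therefore look up the probabilities with efficient indexing instead of loops.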
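
As a minimal sketch of that indexing idea, the lookup for all histograms can be written with a single sub2ind call. The arrays h and indices below are hypothetical toy inputs in the same layout as described here (one row per histogram), and rowIdx and probsFlat are illustrative names.

```matlab
% Hypothetical toy inputs: 2 normalized histograms with 3 bins each,
% and the bin index of 4 data points per histogram
h = [0.2 0.3 0.5;
     0.6 0.1 0.3];
indices = [1 3 2 3;
           2 2 1 3];
[dims, N] = size(indices);

% Pair every bin index with the row (histogram) it belongs to
rowIdx = repmat((1:dims)', 1, N);

% One indexing operation returns a dims-by-N matrix of probabilities
probsFlat = h(sub2ind(size(h), rowIdx, indices));
```

Row k of probsFlat can then be reshaped to rows-by-cols to recover the layout of the original slice.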

Runnable example code:

    rng(2); % Seed the RNG for repeatability
    
    % Generate some data
    numBins = 6;
    data = repmat(rand(1,5), 3, 1, 2);
    [rows, cols, dims] = size(data);
    N = rows*cols;
    
    % Bin all data into a histogram, keeping track of which bin each data point
    % gets mapped to
    h = zeros(dims, numBins + 1);
    indices = zeros(dims, N);
    for k = 1 : dims
        tmp = data(:,:,k);
        tmp = tmp(:)';
        binEdges = linspace(min(tmp),max(tmp),numBins+1);
        [h(k,:), indices(k,:)] = histc(tmp, binEdges);
    end
    
    % Account for "strictly greater than" bug that results in an extra bin
    h(:, numBins) = h(:, numBins) + h(:, end); % Add count in last bin to the second-to-last bin
    indices(indices == numBins + 1) = numBins; % Adjust indices accordingly
    h = h(:,1:end-1); % Lop off the extra bin
    h = h ./ repmat(sum(h,2), 1, numBins); % Normalize all histograms
    
    % Now we can efficiently look up probabilities by indexing instead of
    % looping
    for k = 1 : dims
        probs(:, :, k) = reshape(h(sub2ind(size(h), repmat(k, 1, size(indices, 2)), indices(k,:))), rows, cols);
    end
    probs
    
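In the second case the lookup is harder, because you do not have the luxury of tracking bin indices while the histogram is being created. However, we can work around this by building a second histogram with the same bins as the first one and tracking the indices during that binning step.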

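For this approach, you first compute the initial histogram with hist() on some training data. You only need to store the minimum and maximum of that training data. With that information, we can generate an identical histogram using linspace() and histc(), adjusting for the extra-bin "bug" that histc() introduces.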

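The key here is handling outliers, that is, values in the new data set that fall outside the range of the pre-computed histogram. Since such values should be assigned a frequency/probability of 0, we simply append an extra bin with value 0 to the pre-computed histogram and map any unbinned new data to that index.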
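
The outlier handling can be sketched on its own with a few values. Everything below is a hypothetical toy setup: hRef stands in for a pre-computed normalized histogram with the zero bin already appended, and the edges cover an assumed training range of [0, 1].

```matlab
numBins  = 2;
binEdges = linspace(0, 1, numBins + 1);   % [0 0.5 1], assumed training range
hRef     = [0.4 0.6 0];                   % reference probabilities plus trailing 0 bin
newVals  = [-1 0.3 0.7 1 2];              % first and last values are out of range

% histc assigns bin 0 to anything outside the edges
[~, bin] = histc(newVals, binEdges);

bin(bin == numBins + 1) = numBins;        % value equal to the last edge joins the last regular bin
bin(bin == 0) = numBins + 1;              % outliers point at the zero-probability bin

p = hRef(bin);                            % look up probabilities by indexing
```

The two out-of-range values end up with probability 0, and the value equal to the upper edge is treated as part of the last regular bin.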

Here is some commented, runnable code for the second approach:

    % PRE-COMPUTE A HISTOGRAM
    rng(2); % Seed the RNG for repeatability
    
    % Build some data
    numBins = 6;
    old_data = repmat(rand(1,5), 3, 1, 2);
    [rows, cols, dims] = size(old_data);
    
    % Store min and max of each histogram (one per page of the 3-D array)
    % for the reconstruction process
    min_val = min(min(old_data, [], 1), [], 2);
    max_val = max(max(old_data, [], 1), [], 2);
    
    % Just use hist() function while specifying number of bins this time
    % No need to track indices because we are going to be using this histogram
    % as a reference for looking up a different set of data
    h = zeros(dims, numBins);
    for k = 1 : dims
        tmp = old_data(:,:,k);
        tmp = tmp(:)';
        h(k,:) = hist(tmp, numBins);
    end
    h = h ./ repmat(sum(h, 2), 1, numBins); % Normalize histograms
    h(:, end + 1) = 0; % Map any data that falls outside the pre-computed histogram to this bin
    
    % NEW DATA
    rng(3); % Seed RNG again for repeatability
    
    % Generate some new data
    new_data = repmat(rand(1,4), 4, 1, 2); % NOTE: Doesn't have to be same size
    [rows, cols, dims] = size(new_data);
    N = rows*cols;
    
    
    % Bin new data with histc() using boundaries from pre-computed histogram
    h_new = zeros(dims, numBins + 1);
    indices_new = zeros(dims, N);
    for k = 1 : dims
        tmp = new_data(:,:,k);
        tmp = tmp(:)';
    
        % Determine bins for new histogram with the same boundaries as
        % pre-computed one. This ensures that our resulting histograms are
        % identical, except for the "greater-than" bug which is accounted for
        % below.
        binEdges = linspace(min_val(k), max_val(k), numBins+1);
        [h_new(k,:), indices_new(k,:)] = histc(tmp, binEdges);
    end
    
    % Adjust for the "greater-than" bug
    % When adjusting this histogram, we are directing outliers that don't
    % fit into the pre-computed histogram to look up probabilities from that 
    % extra bin we added to the pre-computed histogram.
    h_new(:, numBins) = h_new(:, numBins) + h_new(:, end); % Add count in last bin to the second-to-last bin
    indices_new(indices_new == numBins + 1) = numBins; % Adjust indices accordingly
    indices_new(indices_new == 0) = numBins + 1; % Direct any unbinned data to the 0-probability last bin
    h_new = h_new ./ repmat(sum(h_new,2), 1, numBins + 1); % Normalize all histograms
    
    % Now we should have all of the new data binned into a histogram
    % that matches the pre-computed one. The catch is, we now have the indices
    % of the bins the new data was matched to. Thus, we can now use the same
    % efficient indexing-based look-up strategy as before to get probabilities
    % from the pre-computed histogram.
    for k = 1 : dims
        probs(:, :, k) = reshape(h(sub2ind(size(h), repmat(k, 1, size(indices_new, 2)), indices_new(k,:))), rows, cols);
    end
    probs
    

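Can you give a small example showing the desired input and output? Say, with a 3x2 array.

@LuisMendo I added a copy/paste example that should demonstrate exactly what I am looking for. Keep in mind that the speed at these small sizes is not representative of my problem.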