Splitting a vector in MATLAB


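I am trying to split a vector elegantly. For example, take

vec = [1 2 3 4 5 6 7 8 9 10]

and another vector of 0s and 1s of the same length, where a 1 marks a position at which the vector should be split, or rather cut:

cut = [0 0 0 1 0 0 0 0 1 0]

This should give a cell output similar to the following: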

[1 2 3] [5 6 7 8] [10]

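Unfortunately, MATLAB has no "inverse concatenation". If you want to solve a problem like this, you can try the code below. It gives you what you need when there are exactly two split points, which produce three vectors. If you need more splits, you will have to modify the code after the loop.

The results come out as separate vectors. To collect them in a cell array, group them with braces, for example {F, G, H} (num2cell would instead put every single element into its own cell).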

pos_of_one = 0;

% The loop finds the split points and puts their positions into a vector.
for kk = 1 : length(cut)
    if cut(1,kk) == 1
        pos_of_one = pos_of_one + 1;
        A(1,pos_of_one) = kk;
    end
end

F = vec(1 : A(1,1) - 1);
G = vec(A(1,1) + 1 : A(1,2) - 1);
H = vec(A(1,2) + 1 : end);
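The code after the loop can be generalised to any number of split points. The following is only a sketch of one way to do that, not part of the original answer; the names bounds and pieces are introduced here for illustration. Note that a cut at either end, or two adjacent cuts, leaves an empty piece in the result.

```matlab
vec = [1 2 3 4 5 6 7 8 9 10];
cut = [0 0 0 1 0 0 0 0 1 0];

A = find(cut == 1);                 % positions of the split points
bounds = [0, A, numel(vec) + 1];    % virtual split points before and after vec
pieces = cell(1, numel(bounds) - 1);
for kk = 1 : numel(pieces)
    % Take everything strictly between two neighbouring split points.
    pieces{kk} = vec(bounds(kk) + 1 : bounds(kk + 1) - 1);
end
```

For the example input, pieces holds [1 2 3], [5 6 7 8] and [10].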

Here is what you need:

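function spl = Splitting(vec,cut)
% Collect the elements of vec that lie between the 1s of cut into the cell array spl.
n=1;   % index of the current piece
j=1;   % position inside the current piece
for i=1:1:length(cut)
    if cut(i)==0
        spl{n}(j)=vec(i);
        j=j+1;
    else
        % A cut position: skip this element and start the next piece.
        n=n+1;
        j=1;
    end
end
end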
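With a leading cut or with consecutive cuts, the function above leaves empty cells in spl (for instance, spl{1} is [] when cut(1) is 1). A minimal sketch of a variant that skips empty pieces could look like this; it is written as a plain script, and the variable piece is a name assumed here for illustration.

```matlab
vec = 1:10;
cut = [1 0 0 1 1 0 0 0 1 0];

spl = {};
piece = [];                        % elements collected since the last cut
for i = 1:numel(cut)
    if cut(i) == 0
        piece(end+1) = vec(i);     %#ok<SAGROW>
    elseif ~isempty(piece)
        spl{end+1} = piece;        %#ok<SAGROW> close the current piece at a cut
        piece = [];
    end
end
if ~isempty(piece)
    spl{end+1} = piece;            % last piece, if vec does not end on a cut
end
```

For this input, spl holds [2 3], [6 7 8] and [10].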

Although my approach is simple, it comes second in terms of performance:

-------------------- With CUMSUM + ACCUMARRAY
Elapsed time is 0.264428 seconds.
-------------------- With FIND + ARRAYFUN
Elapsed time is 0.407963 seconds.
-------------------- With CUMSUM + ARRAYFUN
Elapsed time is 18.337940 seconds.
-------------------- SIMPLE
Elapsed time is 0.271942 seconds.

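A handy function for this problem is cumsum, which creates a cumulative sum of the cut array. The code that produces the output cell array looks like this: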

vec = [1 2 3 4 5 6 7 8 9 10];
cut = [0 0 0 1 0 0 0 0 1 0];

cutsum = cumsum(cut);
cutsum(cut == 1) = NaN;  %Don't include the cut indices themselves
sumvals = unique(cutsum);      % Find the values to use in indexing vec for the output
sumvals(isnan(sumvals)) = [];  %Remove NaN values from sumvals
output = {};
for i=1:numel(sumvals)
    output{i} = vec(cutsum == sumvals(i)); %#ok<SAGROW>
end

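The loop can also be replaced by arrayfun (the variant used in the CUMSUM + ARRAYFUN benchmark code further down), which is nice because it does not end up growing the output cell array.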

The key to this routine is the variable cutsum, which ends up looking like this:

cutsum =
     0     0     0   NaN     1     1     1     1   NaN     2

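Then all we need to do is use it to build the indices that extract the data from the original vec array. We loop from zero up to the maximum value and extract the matching elements. Note that this routine handles several situations that may arise. For example, it handles 1 values at the very beginning and the very end of the cut array, and it gracefully handles repeated 1s in the cut array without creating empty arrays in the output. This works because unique is used to create the set of values to search for in cutsum, and because we throw out the NaN values in the sumvals array.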

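You could use -1 instead of NaN as the signal flag for the cut positions that are not used. The -1 value would probably be more efficient, since all you would have to do is truncate the first element of the sumvals array, but I prefer NaN for readability.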
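As a rough sketch of the -1 variant just described (not code from the original answer): because unique returns sorted values, the flag ends up first in sumvals and can be dropped by truncating that element. The guard on any(cut == 1) is an addition made here, so that nothing is truncated when cut contains no 1 at all; the sketch assumes cut contains only 0s and 1s.

```matlab
vec = [1 2 3 4 5 6 7 8 9 10];
cut = [0 0 0 1 0 0 0 0 1 0];

cutsum = cumsum(cut);
cutsum(cut == 1) = -1;          % -1 flags the cut positions themselves
sumvals = unique(cutsum);       % sorted, so the -1 flag (if present) comes first
if any(cut == 1)
    sumvals(1) = [];            % drop the flag value
end
output = cell(1, numel(sumvals));
for i = 1:numel(sumvals)
    output{i} = vec(cutsum == sumvals(i));
end
```

For the example input, output holds [1 2 3], [5 6 7 8] and [10].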

The output is a cell array, with the following result:

output{1} =
     1     2     3
output{2} =
     5     6     7     8
output{3} =
    10

We also need to handle some odd cases. Consider this situation:

vec = [1 2 3 4 5 6 7 8 9 10 11 12 13 14];
cut = [1 0 0 1 1 0 0 0 0 1  0  0  0  1];

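Here there are repeated 1s, as well as 1s at the beginning and at the end. The routine handles all of this correctly, without producing any empty sets: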

output{1} = 
     2     3
output{2} =
     6     7     8     9
output{3} = 
    11    12    13

You can do this with a combination of find and arrayfun:
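The listing for this answer appears to be missing at this point. The sketch below is reconstructed from the FIND + ARRAYFUN benchmark code further down, applied to the example input from the question; the order of the first three lines is assumed so that it matches the walkthrough that follows.

```matlab
vec = [1 2 3 4 5 6 7 8 9 10];
N = numel(vec);
cut = [0 0 0 1 0 0 0 0 1 0];
ind = find(cut);                                          % positions of the cuts
ind_before = [ind-1 N]; ind_before(ind_before < 1) = 1;   % where each piece stops
ind_after = [1 ind+1]; ind_after(ind_after > N) = N;      % where each piece starts
out = arrayfun(@(x,y) vec(x:y), ind_after, ind_before, 'uni', 0);
```

For this input, out holds [1 2 3], [5 6 7 8] and [10].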

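So how does this work? The first line defines your input vector, the second line finds how many elements this vector has, and the third line gives your cut vector, which defines where we need to cut the vector. Next, we use find to determine the non-zero locations in cut, which correspond to the split points in the vector. If you look closely, the split points determine where we need to stop collecting elements and where we need to start collecting them again.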

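However, we need to account for the beginning and the end of the vector. ind_after tells us the locations where we need to start collecting values, and ind_before tells us the locations where we need to stop collecting values. To compute these start and end positions, simply take the result of find and add 1 and subtract 1, respectively.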

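Each pair of corresponding positions in ind_after and ind_before tells us where we need to start and stop collecting values. To accommodate the beginning of the vector, ind_after needs the index 1 inserted at the front, because index 1 is where we should start collecting values at the beginning. Similarly, N needs to be inserted at the end of ind_before, because this is where we need to stop collecting values at the end of the array.

Now, for ind_after and ind_before there is a degenerate case in which a cut point lies at the very end or the very beginning of the vector. If that happens, subtracting or adding 1 produces a start or stop position that is out of bounds. We check for this in the lines that build ind_before and ind_after, and simply set such positions to 1 or N, depending on whether we are at the beginning or the end of the array.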

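The last line of code uses arrayfun to iterate over each pair of ind_after and ind_before and slice into the vector. Each result is placed in a cell array; for the example from the question, the output is [1 2 3], [5 6 7 8] and [10].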

We can check the degenerate case by placing a 1 at the beginning and at the end of cut, with some values in between:
vec = [1 2 3 4 5 6 7 8 9 10];
cut = [1 0 0 1 0 0 0 1 0 1];

With this example and the code above, we get:
>> celldisp(out)

out{1} =

     1

out{2} =

     2     3         

out{3} =

     5     6     7

out{4} =

     9         

out{5} =

    10

Solution code
You can use cumsum and accumarray for an efficient solution -
%// Create ID/labels for use with accumarray later on
id = cumsum(cut)+1   

%// Mask to get valid values from cut and vec corresponding to ones in cut
mask = cut==0        

%// Finally get the output with accumarray using masked IDs and vec values 
out = accumarray(id(mask).',vec(mask).',[],@(x) {x})
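To make the role of id and mask concrete, here is the same idea run on the example from the question, with the intermediate values written out as comments. The final cellfun line and the name row_out are additions made here, assuming row vectors in a row cell are wanted as in the question; accumarray itself returns a column cell of column vectors.

```matlab
vec = [1 2 3 4 5 6 7 8 9 10];
cut = [0 0 0 1 0 0 0 0 1 0];

id = cumsum(cut) + 1;     % [1 1 1 2 2 2 2 2 3 3], one label per piece
mask = cut == 0;          % true everywhere except at the cut positions 4 and 9
out = accumarray(id(mask).', vec(mask).', [], @(x) {x});
% out is a 3x1 cell holding the column vectors [1;2;3], [5;6;7;8] and 10

% Optional: transpose each piece and the cell itself to get a 1x3 cell of rows.
row_out = cellfun(@(x) x.', out, 'UniformOutput', false).';
```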


Benchmarking
Here are some performance numbers for the three most popular approaches listed for this problem, run on a large input -
N = 100000;  %// Input Datasize

vec = randi(100,1,N); %// Random inputs
cut = randi(2,1,N)-1;

disp('-------------------- With CUMSUM + ACCUMARRAY')
tic
id = cumsum(cut)+1;
mask = cut==0;
out = accumarray(id(mask).',vec(mask).',[],@(x) {x});
toc

disp('-------------------- With FIND + ARRAYFUN')
tic
N = numel(vec);
ind = find(cut);
ind_before = [ind-1 N]; ind_before(ind_before < 1) = 1;
ind_after = [1 ind+1]; ind_after(ind_after > N) = N;
out = arrayfun(@(x,y) vec(x:y), ind_after, ind_before, 'uni', 0);
toc

disp('-------------------- With CUMSUM + ARRAYFUN')
tic
cutsum = cumsum(cut);
cutsum(cut == 1) = NaN;  %Don't include the cut indices themselves
sumvals = unique(cutsum);      % Find the values to use in indexing vec for the output
sumvals(isnan(sumvals)) = [];  %Remove NaN values from sumvals
output = arrayfun(@(val) vec(cutsum == val), sumvals, 'UniformOutput', 0);
toc


Special case scenario: if you may have runs of 1s, you need to modify a few things, as listed next -
%// Mask to get valid values from cut and vec corresponding to ones in cut
mask = cut==0  

%// Setup IDs differently this time. The idea is to have successive IDs.
id = cumsum(cut)+1
[~,~,id] = unique(id(mask))
      
%// Finally get the output with accumarray using masked IDs and vec values 
out = accumarray(id(:),vec(mask).',[],@(x) {x})

Sample run for this case -
>> vec
vec =
     1     2     3     4     5     6     7     8     9    10
>> cut
cut =
     1     0     0     1     1     0     0     0     1     0
>> celldisp(out)
out{1} =
     2
     3
out{2} =
     6
     7
     8
out{3} =
    10

Here is yet another way, this time without any loops or accumulation:
lengths = diff(find([1 cut 1])) - 1;    % assuming a row vector
lengths = lengths(lengths > 0);
data = vec(~cut);
result = mat2cell(data, 1, lengths);    % also assuming a row vector

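The diff(find(...)) construct gives us the distance from each marker to the next. We append boundary markers with [1 cut 1] to catch any runs of zeros that touch the ends. Each length includes its marker, however, so we subtract 1 to account for that, and we remove any lengths that only cover consecutive markers, so that we do not get unwanted empty cells in the output.
For the data, we mask out any elements that correspond to markers, so that we only keep the valid parts to be partitioned. Finally, with the data ready to be split and the lengths to split it into, this is exactly what mat2cell is for.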
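A small worked example may help here. It uses the input from the special case sample run above (a leading 1 and a run of two 1s), with intermediate values written as comments; the variable markers is a name introduced here for illustration.

```matlab
vec = [1 2 3 4 5 6 7 8 9 10];
cut = [1 0 0 1 1 0 0 0 1 0];

markers = find([1 cut 1]);          % [1 2 5 6 10 12], including both boundary markers
lengths = diff(markers) - 1;        % [0 2 0 3 1], zeros come from adjacent markers
lengths = lengths(lengths > 0);     % [2 3 1]
data = vec(~cut);                   % [2 3 6 7 8 10]
result = mat2cell(data, 1, lengths);
```

Here result holds [2 3], [6 7 8] and [10], the same pieces as in the sample run above, but as row vectors.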
Runtimes reported for the three-way benchmark listed earlier:
-------------------- With CUMSUM + ACCUMARRAY
Elapsed time is 0.068102 seconds.
-------------------- With FIND + ARRAYFUN
Elapsed time is 0.117953 seconds.
-------------------- With CUMSUM + ARRAYFUN
Elapsed time is 12.560973 seconds.


Timings with the mat2cell approach added to the same benchmark:
-------------------- With CUMSUM + ACCUMARRAY
Elapsed time is 0.272810 seconds.
-------------------- With FIND + ARRAYFUN
Elapsed time is 0.436276 seconds.
-------------------- With CUMSUM + ARRAYFUN
Elapsed time is 17.112259 seconds.
-------------------- With mat2cell
Elapsed time is 0.084207 seconds.