accumarray: two columns in the VAL argument (MATLAB)


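I am not very experienced with the accumarray function in MATLAB, although I have started to appreciate how powerful it is! I would like to know whether it is possible to pass two columns in the VAL argument of accumarray. Consider: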

sz = 3  ; % num_rows for each ID
mat1 = [1 20 ; 1 40 ; 1 50 ; 2 10 ; 2 100 ; 2 110] ; % Col1 is ID, Col2 is Value
idx  = [30 1000 ; 30 1200 ; 30 1500 ; 30 1000 ; 30 1200 ; 30 1500 ] ; 
% col1: index ID, col2: value

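mat1 holds the returns for each ID, and idx holds the index returns. For simplicity, the idx returns are repeated so that they line up with mat1. Every ID in mat1 has the same number of rows (sz), and idx has the same number of rows per ID as well.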

[~,~,n] = unique(mat1(:,1), 'rows', 'last') ;
fncovariance = @(x,y) (x.*y)/sz ;
accumarray(n, [x(:,2) y(:,2)], [], fncovariance) % --> FAILS as VAL is not-vector!

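As you can see, I am trying to compute a covariance (cov(x,y,1)), but I cannot use the MATLAB function directly because mat1 contains IDs and I need the covariance of each ID with respect to the index.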

Expected answer matrix (col1: ID, col2: covariance):

    1 2444.4
    2 7888.9

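The short answer is no. In the accumarray() help, the key part is:

"VAL must be a numeric, logical, or character vector with the same length as the number of rows in SUBS. VAL may also be a scalar whose value is repeated for all the rows of SUBS."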

This means you cannot even fake it with cell arrays.
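Although VAL itself has to be a vector, a commonly used workaround is to pass row indices as VAL and let the function handle look up as many columns as it needs. The sketch below applies that idea to the data from the question. The names rows, popcov and cov_by_id, and the population-covariance formula inside the handle, are illustrative assumptions and not part of the original answer.

```matlab
sz   = 3;
mat1 = [1 20; 1 40; 1 50; 2 10; 2 100; 2 110];                  % col1: ID, col2: value
idx  = [30 1000; 30 1200; 30 1500; 30 1000; 30 1200; 30 1500];  % col1: index ID, col2: value

[~, ~, n] = unique(mat1(:,1));   % group number for every row of mat1
rows = (1:size(mat1,1))';        % VAL is a plain vector of row indices

% Population covariance (normalised by N) of the two value columns,
% evaluated only on the rows r that belong to one ID.
popcov = @(r) mean(mat1(r,2) .* idx(r,2)) - mean(mat1(r,2)) * mean(idx(r,2));

cov_by_id = accumarray(n, rows, [], popcov);
disp([unique(mat1(:,1)) cov_by_id])
```

Because the handle only receives row indices, it does not matter in which order accumarray hands them over; the covariance is the same for any ordering of the rows within a group.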

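However, if you put the IDs into their own index variable and then reshape the data so that the data for different IDs sits in different columns, bsxfun() can handle this efficiently. For reference, I have also included a matrix-math approach, a simple for-loop approach using cov(), and a cellfun() approach that uses the custom fncovariance() function (note that I have modified it from the version above).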
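The reshaping step described above can be sketched as follows for the data in the question. This assumes that the rows of mat1 are already sorted by ID and that every ID has exactly sz rows; the variable name idx_col is hypothetical and plays the role of the sz-by-1 index column in the answer's code.

```matlab
sz   = 3;
mat1 = [1 20; 1 40; 1 50; 2 10; 2 100; 2 110];                  % col1: ID, col2: value
idx  = [30 1000; 30 1200; 30 1500; 30 1000; 30 1200; 30 1500];  % col1: index ID, col2: value

IDs     = unique(mat1(:,1))';          % distinct IDs as a row vector
ret     = reshape(mat1(:,2), sz, []);  % sz-by-numel(IDs), one column per ID
idx_col = idx(1:sz, 2);                % a single copy of the index returns

disp(ret)
disp(idx_col)
```

If the rows were not grouped by ID, sorting mat1 with sortrows(mat1, 1) before the reshape would be one way to restore that assumption.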

% bsxfun method
mean(bsxfun(@times, ret, idx)) - bsxfun(@times, mean(ret), mean(idx))

% matrix math
idx'*ret/length(idx) - mean(ret)*mean(idx)

% for loop method
id_cov = zeros(1, length(IDs));
for i = 1:length(IDs)
    tmp = cov(ret(:,i), idx, 1);
    id_cov(i) = tmp(2,1);
end
id_cov

% cellfun method
ret_cell = num2cell(ret, 1);
idx_cell = num2cell(repmat(idx, 1, length(IDs)), 1);
cellfun(fncovariance, ret_cell, idx_cell)
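The modified fncovariance() mentioned in the answer is not shown on this page. As a sanity check, the sketch below assumes it is the population covariance of two column vectors, mean(x.*y) - mean(x)*mean(y), and confirms that the four approaches agree on the small data set from the question. The result names c_bsx, c_mat, c_loop and c_cell are illustrative.

```matlab
% Data from the question, already reshaped: one column of ret per ID
ret = [20 10; 40 100; 50 110];
idx = [1000; 1200; 1500];
IDs = 1:size(ret, 2);

% Assumed definition (not taken from the original answer)
fncovariance = @(x, y) mean(x .* y) - mean(x) * mean(y);

c_bsx = mean(bsxfun(@times, ret, idx)) - bsxfun(@times, mean(ret), mean(idx));
c_mat = idx' * ret / length(idx) - mean(ret) * mean(idx);

c_loop = zeros(1, length(IDs));
for i = 1:length(IDs)
    tmp = cov(ret(:,i), idx, 1);   % 2-by-2 covariance matrix, normalised by N
    c_loop(i) = tmp(2,1);          % off-diagonal entry is cov(ret(:,i), idx)
end

c_cell = cellfun(fncovariance, num2cell(ret, 1), ...
                 num2cell(repmat(idx, 1, length(IDs)), 1));

disp([c_bsx; c_mat; c_loop; c_cell])   % each row should be about 2444.4  7888.9
```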

If you simulate more data and time these different approaches, the bsxfun() approach is the fastest:

sz    = 10;
n_ids = 100;

IDs = 1:n_ids;
ret = randi(1000, sz, n_ids);
idx = randi(1000, sz, 1);

Elapsed time is 0.001292 seconds.
Elapsed time is 0.001523 seconds.
Elapsed time is 0.009625 seconds.
Elapsed time is 0.011454 seconds.

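One last option you may be interested in is the grpstats() function in the Statistics Toolbox, which lets you compute arbitrary statistics by a grouping variable.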

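What are x and y in the last line of code? Do you mean mat1 and idx?

@John Yes. But I was only referring to MATLAB's built-in function. In my case, mat1 has many IDs and y holds the index (for example, NYSE) returns, so a plain cov(x,y,1) is of no use. Thanks.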
