C++: Extracting elements from a matrix by row and column indices with Armadillo

Tags: c++, r, rcpp, armadillo

In R, I can extract matrix elements by their indices as follows:

> m <- matrix(1:6, nrow = 3)
> m
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
> row_index <- c(1, 2)
> col_index <- c(2, 2)
> m[cbind(row_index, col_index)]
[1] 4 5


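Is there a native way to do this in Armadillo / Rcpp::Armadillo? The best I could come up with is a custom function that uses the row and column indices to compute the element indices (see below). My main concern is that the custom function may not perform well.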

#include <RcppArmadillo.h>
using namespace Rcpp;

// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
NumericVector Rsubmatrix(arma::uvec rowInd, arma::uvec colInd, arma::mat m) {
  arma::uvec ind = (colInd - 1) * m.n_rows + (rowInd - 1);
  arma::vec ret = m.elem(ind);
  return wrap(ret);
}

/*** R
Rsubmatrix(row_index, col_index, m)
*/
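The index arithmetic in Rsubmatrix can be checked by hand without Armadillo. The following is a minimal sketch in plain C++, assuming the matrix is stored column-major (as both R and Armadillo do) in a std::vector. The names column_major_offset and extract_pairs are hypothetical helpers made up for this illustration; they are not part of Armadillo or Rcpp.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not an Armadillo function): converts 1-based
// (row, col) subscripts into a 0-based offset into column-major storage.
std::size_t column_major_offset(std::size_t row, std::size_t col,
                                std::size_t n_rows) {
    if (row == 0 || col == 0 || row > n_rows) {
        throw std::out_of_range("subscripts must be 1-based and row <= n_rows");
    }
    // Skip (col - 1) full columns, then move (row - 1) entries down.
    return (col - 1) * n_rows + (row - 1);
}

// Hypothetical helper: gathers the values at paired 1-based subscripts
// from a column-major buffer that has n_rows rows.
std::vector<double> extract_pairs(const std::vector<double>& data,
                                  std::size_t n_rows,
                                  const std::vector<std::size_t>& rows,
                                  const std::vector<std::size_t>& cols) {
    if (rows.size() != cols.size()) {
        throw std::invalid_argument("rows and cols must have the same length");
    }
    std::vector<double> out;
    out.reserve(rows.size());
    for (std::size_t i = 0; i < rows.size(); ++i) {
        // .at() throws if the computed offset falls outside the buffer.
        out.push_back(data.at(column_major_offset(rows[i], cols[i], n_rows)));
    }
    return out;
}
```

For the 3 x 2 matrix in the question, stored column-major as 1, 2, 3, 4, 5, 6, the pairs (1, 2) and (2, 2) map to offsets 3 and 4, which hold 4 and 5, matching the R output.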


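From:

But this appears to return only matrix blocks. For regions that are not simply connected, I think your solution is the best one, but you don't really need a function: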

m.elem((colInd - 1) * m.n_rows + (rowInd - 1));

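returns the vector without any problem. For clarity, you could define a function that handles the conversion from row and column subscripts to element indices: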

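// Convert zero-based column and row subscripts to column-major element indices.
inline arma::uvec arr2ind(const arma::uvec& c, const arma::uvec& r, arma::uword nrow)
{
  return c * nrow + r;
}
// Usage: m.elem(arr2ind(colInd - 1, rowInd - 1, m.n_rows));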

Let's try this...

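In particular, you can subset by rowInd and colInd by writing your own loop that uses the (i, j) subset operator. Otherwise, the only other option is the solution you proposed at the start of the question.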

#include <RcppArmadillo.h>
using namespace Rcpp;

// [[Rcpp::depends(RcppArmadillo)]]

// Optimized OP method
// [[Rcpp::export]]
arma::vec Rsubmatrix(const arma::mat& m, const arma::uvec& rowInd, const arma::uvec& colInd) {
  return  m.elem((colInd - 1) * m.n_rows + (rowInd - 1));
}

// Proposed Alternative
// [[Rcpp::export]]
arma::rowvec get_elements(const arma::mat& m, const arma::uvec& rowInd, const arma::uvec& colInd){

  unsigned int n = rowInd.n_elem;

  arma::rowvec out(n);

  for(unsigned int i = 0; i < n; i++){
    out(i) = m(rowInd[i]-1,colInd[i]-1);
  }

  return out;
}

With these defined, calling:

get_elements(m, row_index, col_index)

gives:

     [,1] [,2]
[1,]    4    5

Edit

Microbenchmark:

microbenchmark(Rsubmatrix(m, row_index, col_index), get_elements(m, row_index, col_index), times = 1e4)

gives:

Unit: microseconds
                                  expr   min    lq     mean median    uq      max neval
   Rsubmatrix(m, row_index, col_index) 2.836 3.111 4.129051  3.281 3.502 5016.652 10000
 get_elements(m, row_index, col_index) 2.699 2.947 3.436844  3.115 3.335  716.742 10000

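The two methods are close in timing. Note that the latter should be better, since it avoids two separate loops (1. computing the indices and 2. subsetting) and an extra temporary vector to store the result.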
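The difference between the two strategies can be made concrete with a small plain C++ sketch. It only mirrors the structure of the two approaches (index vector first, then gather, versus one fused loop) and says nothing about how Armadillo evaluates its expressions internally. The function names gather_two_pass and gather_single_pass are hypothetical, and both assume valid 1-based subscripts of equal length into a column-major buffer.

```cpp
#include <cstddef>
#include <vector>

// Two passes: build a temporary vector of offsets, then gather the values.
// Assumes rows.size() == cols.size() and valid 1-based subscripts
// (no bounds checking is done here).
std::vector<double> gather_two_pass(const std::vector<double>& data,
                                    std::size_t n_rows,
                                    const std::vector<std::size_t>& rows,
                                    const std::vector<std::size_t>& cols) {
    std::vector<std::size_t> offsets(rows.size());  // extra temporary
    for (std::size_t i = 0; i < rows.size(); ++i) {
        offsets[i] = (cols[i] - 1) * n_rows + (rows[i] - 1);
    }
    std::vector<double> out(rows.size());
    for (std::size_t i = 0; i < rows.size(); ++i) {
        out[i] = data[offsets[i]];
    }
    return out;
}

// Single pass: compute each offset and read the value in the same loop,
// so no temporary offset vector is needed. Same assumptions as above.
std::vector<double> gather_single_pass(const std::vector<double>& data,
                                       std::size_t n_rows,
                                       const std::vector<std::size_t>& rows,
                                       const std::vector<std::size_t>& cols) {
    std::vector<double> out(rows.size());
    for (std::size_t i = 0; i < rows.size(); ++i) {
        out[i] = data[(cols[i] - 1) * n_rows + (rows[i] - 1)];
    }
    return out;
}
```

Both functions return the same values; whether the single-pass form is measurably faster depends on the input size, so it is worth timing on matrices much larger than the 3 x 2 example.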

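Edit: As of the Armadillo 7.200.0 release, the sub2ind() function has gained the ability to take matrix notation. This function takes matrix subscripts as a 2 x n matrix, where n is the number of elements to subset, and converts them into element notation.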

#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]

// [[Rcpp::export]]
arma::rowvec matrix_locs(arma::mat M, arma::umat locs) {

    arma::uvec eids = sub2ind( size(M), locs ); // Obtain Element IDs
    arma::vec v  = M.elem( eids );              // Values of the Elements

    return v.t();                               // Transpose to mimic R
}
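For readers without Armadillo at hand, the subscript-to-index conversion described above can be sketched in plain C++. This is an illustration under stated assumptions, not Armadillo's implementation: the name subscripts_to_offsets and the Subscript type are hypothetical, each Subscript stands for one column of the 2 x n matrix (zero-based row, then zero-based column), and the choice of std::out_of_range for invalid subscripts is made here for the sketch only.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>
#include <vector>

// One column of the 2 x n subscript matrix: {row, col}, both zero-based.
using Subscript = std::array<std::size_t, 2>;

// Hypothetical stand-in for the conversion step: maps each (row, col)
// subscript to its zero-based offset in column-major storage.
std::vector<std::size_t> subscripts_to_offsets(std::size_t n_rows,
                                               std::size_t n_cols,
                                               const std::vector<Subscript>& locs) {
    std::vector<std::size_t> offsets;
    offsets.reserve(locs.size());
    for (const Subscript& loc : locs) {
        if (loc[0] >= n_rows || loc[1] >= n_cols) {
            throw std::out_of_range("subscript lies outside the matrix");
        }
        offsets.push_back(loc[1] * n_rows + loc[0]);
    }
    return offsets;
}
```

The subscripts are zero-based here, which is why the R side has to subtract 1 from its indices before passing them to C++.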


Calling from R:

cpp_locs <- locs - 1       # Shift indices from R to C++

(cpp_locs <- t(cpp_locs))  # Transpose matrix for 2 x n form

matrix_locs(M, cpp_locs)   # Subset the matrix

Comments:

Could you just use the .at() member function?

.at(n) requires an element index. My question is whether there is a native Armadillo method that takes row and column indices as input.

From the looks of it, both .at() and operator() take up to three element accessors (i, j, k).

(i, j, k) are scalars. My inputs are vectors. I tried .submat(), which actually pulls every row and column that appears in the indices. Try the following:

// [[Rcpp::export]]
NumericMatrix test(arma::mat m, arma::uvec v1, arma::uvec v2) {
  arma::mat ret = m.submat(v1, v2);
  return wrap(ret);
}
/*** R m

Your question wasn't very specific. For non-contiguous elements, I think your solution is the standard approach (but you may want to pass by reference to avoid copying large objects).

I didn't want to say "non-contiguous" in the question, because that is what the documentation says about your solution. So I expected your solution to retrieve non-contiguous elements as well, but it doesn't. Using: test(m, c(0, 1), c(1))

Is it that bad?

@Heisenberg You should give an example with a non-contiguous set of indices; that would clarify the question a great deal.

I may do some benchmarking to see which is faster. But coming from R, my intuition is that loops are more expensive?

@Heisenberg The initial benchmark has this approach slightly faster (by microseconds). You should probably try it on larger matrices, though.

@Heisenberg See the edit above for how to use the newly implemented vectorized version of sub2ind().