How to use Rcpp to avoid for loops in R


I have data in xts format (data) that looks like this:

                              A
2008-01-14 09:29:59           10 
2008-01-14 09:29:59           0.1
2008-01-14 09:30:00           0.9
2008-01-14 09:30:00           0.1
2008-01-14 09:30:00           0.2
2008-01-14 09:30:00           0.4
2008-01-14 09:30:00           0.6
2008-01-14 09:30:00           0.7
2008-01-14 09:30:02           1.5
2008-01-14 09:30:06           0.1
2008-01-14 09:30:06           0.1
2008-01-14 09:30:07           0.9
2008-01-14 09:30:07           0.2
2008-01-14 09:30:10           0.4
2008-01-14 09:30:10           0.3
2008-01-14 09:30:25           1.5 
There is no pattern in any of the column or row elements.

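The data is indexed by a POSIXct object. I am creating new columns named "1second" and "3second". For the "1second" column, for each row, I want to find the next observation within the following 1 second, according to the xts time index, and record that row's "A" value. If there is no observation within the next second, put NA in data$1second for that row.

Similarly, for the "3second" column, for each row, I want to find the leading observation within the following 3 seconds, according to the xts time index. If several rows within the next 3 seconds share the same timestamp, use only the last observation.

If there is no observation within the next 3 seconds, put NA in data$3second for that row. For example, I expect the following result: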

                              B    1second  3second
2008-01-14 09:29:59           10    0.7      1.5        
2008-01-14 09:29:59           0.1   0.7      1.5
2008-01-14 09:30:00           0.9   NA       1.5
2008-01-14 09:30:00           0.1   NA       1.5
2008-01-14 09:30:00           0.2   NA       1.5
2008-01-14 09:30:00           0.4   NA       1.5
2008-01-14 09:30:00           0.6   NA       1.5
2008-01-14 09:30:00           0.7   NA       1.5
2008-01-14 09:30:02           1.5   NA       NA
2008-01-14 09:30:06           0.1   0.2      0.2
2008-01-14 09:30:06           0.1   0.2      0.2
2008-01-14 09:30:07           0.9   NA       0.3
2008-01-14 09:30:07           0.2   NA       0.3
2008-01-14 09:30:10           0.4   NA       0.3
2008-01-14 09:30:10           0.3   NA       NA
2008-01-14 09:30:25           1.5   NA       NA
Here is my current code. It works, but it is slow:

TimeStmp is the POSIXct object.
      TimeHorizon<-c(1,3)
      for( j in seq_len(nrow(data))){
        # lapply always returns a list, so a[[k]] is the index vector for horizon k
        a<-lapply(TimeHorizon,function(x) which(TimeStmp==TimeStmp[j] +x))
        for( k in seq_along(a)){
          if (length(a[[k]])>0){
            data[j,k+1]<-(data$B)[last(a[[k]])]
          }
        }
      }

I'm not very happy with this code, but it could be one approach:

library(data.table)  # provides shift(), used below
temp1 <- test[! duplicated(test$timestamp, fromLast = T), ]
for (i in c(0,rep(1,3))) {
  temp1$timestamp <- temp1$timestamp - i
  test <- merge(test, temp1, by = "timestamp", all.x = T)
}
colnames(test) <- c("timestamp", "B", "0second", "1second", "2second", "3second")
test$`3second` <- test[-1][cbind(1:nrow(test), max.col(!is.na(test[-1]), "last"))]
test$`3second`[shift(test$timestamp,1,type = "lead") - test$timestamp > 3 | is.na(shift(test$timestamp,1,type = "lead") - test$timestamp)] <- NA
test <- test[c("timestamp", "B", "1second", "3second")]
test
#              timestamp    B 1second 3second
# 1  2008-01-14 09:29:59  0.1     0.7     1.5
# 2  2008-01-14 09:29:59 10.0     0.7     1.5
# 3  2008-01-14 09:30:00  0.9      NA     1.5
# 4  2008-01-14 09:30:00  0.1      NA     1.5
# 5  2008-01-14 09:30:00  0.2      NA     1.5
# 6  2008-01-14 09:30:00  0.4      NA     1.5
# 7  2008-01-14 09:30:00  0.6      NA     1.5
# 8  2008-01-14 09:30:00  0.7      NA     1.5
# 9  2008-01-14 09:30:02  1.5      NA      NA
# 10 2008-01-14 09:30:06  0.1     0.2     0.2
# 11 2008-01-14 09:30:06  0.1     0.2     0.2
# 12 2008-01-14 09:30:07  0.9      NA     0.3
# 13 2008-01-14 09:30:07  0.2      NA     0.3
# 14 2008-01-14 09:30:10  0.3      NA     0.3
# 15 2008-01-14 09:30:10  0.4      NA      NA
# 16 2008-01-14 09:30:25  1.5      NA      NA
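
The idea behind the merge-based code above (keep the last observation per timestamp, then look it up at shifted timestamps) can also be sketched in plain C++ with a hash map. This is only an illustration, not the answer's method: the name lead_by_lookup is hypothetical, and it assumes timestamps are whole seconds stored as integers.

```cpp
#include <cmath>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical helper: for each row, return the last value recorded at the
// latest timestamp in (t, t + horizon], or NaN if there is none.
// Assumption: timestamps are whole seconds stored as integers.
std::vector<double> lead_by_lookup(const std::vector<long>& ts,
                                   const std::vector<double>& b,
                                   long horizon) {
  // Later rows overwrite earlier ones, so each timestamp keeps its last
  // observation, like duplicated(..., fromLast = TRUE) in the R code.
  std::unordered_map<long, double> last_value;
  for (std::size_t i = 0; i < ts.size(); ++i) last_value[ts[i]] = b[i];

  std::vector<double> res(ts.size(), std::nan(""));
  for (std::size_t i = 0; i < ts.size(); ++i) {
    // Try the farthest offset first so that the latest timestamp wins.
    for (long k = horizon; k >= 1; --k) {
      auto it = last_value.find(ts[i] + k);
      if (it != last_value.end()) {
        res[i] = it->second;
        break;
      }
    }
  }
  return res;
}
```

Each row costs at most horizon lookups, so this suits small horizons such as 1 or 3 seconds. For sub-second or irregular timestamps, a search over the sorted index would be a better fit.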

If you want an Rcpp solution, you can use:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector name_me(List df, double nsec) {

  NumericVector TimeStmp = df["TimeStmp"];
  NumericVector B        = df["B"];
  int n = B.size();
  int i, j, k, ndup;
  double time;

  NumericVector res(n);

  for (i = 0; i < n; i++) {

    // get last for same second
    for (ndup = 0; (i+1) < n; i++, ndup++) {
      if (TimeStmp[i+1] != TimeStmp[i]) break;
    }

    // get last value within nsec
    time = TimeStmp[i] + nsec;
    for (j = i+1; j < n; j++) {
      if (TimeStmp[j] > time) break;
    }

    // fill all previous ones with same value
    res[i] = (j == (i+1)) ? NA_REAL : B[j-1];
    for (k = 1; k <= ndup; k++) res[i-k] = res[i];
  }

  return res;
}
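
As a variation on the function above, the inner scan can be swapped for a binary search, because an xts index is sorted. The sketch below is plain standard C++ rather than Rcpp, so it can be compiled and checked outside R. The name lead_by_search is hypothetical, and it assumes the timestamps are sorted ascending and given as seconds.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical variant: for each row, return the value of the last
// observation whose timestamp lies in (ts[i], ts[i] + nsec], or NaN.
// Assumption: ts is sorted in ascending order.
std::vector<double> lead_by_search(const std::vector<double>& ts,
                                   const std::vector<double>& b,
                                   double nsec) {
  std::vector<double> res(ts.size(), std::nan(""));
  for (std::size_t i = 0; i < ts.size(); ++i) {
    // First element strictly later than the end of the window.
    auto hi = std::upper_bound(ts.begin(), ts.end(), ts[i] + nsec);
    if (hi == ts.begin()) continue;  // guard; only reachable if nsec < 0
    std::size_t last = static_cast<std::size_t>(hi - ts.begin()) - 1;
    // Accept it only if it is strictly later than the current timestamp;
    // otherwise the window holds nothing but the current second.
    if (ts[last] > ts[i]) res[i] = b[last];
  }
  return res;
}
```

This does O(log n) work per row regardless of how many observations fall inside the window. To call it from R, the vectors could be converted in a small wrapper, for example with Rcpp::as<std::vector<double> >().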

Note that there is an inconsistency in the 3second value in row (n-2) of your expected output.

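Code-only answers are discouraged because they do not explain how they solve the problem. Consider updating your answer to explain what it does and how it solves the problem. Please review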
name_me(df, 1)
name_me(df, 3)