How to use Rcpp to avoid for loops in R
I have data in xts format (data) that looks like this:
A
2008-01-14 09:29:59 10
2008-01-14 09:29:59 0.1
2008-01-14 09:30:00 0.9
2008-01-14 09:30:00 0.1
2008-01-14 09:30:00 0.2
2008-01-14 09:30:00 0.4
2008-01-14 09:30:00 0.6
2008-01-14 09:30:00 0.7
2008-01-14 09:30:02 1.5
2008-01-14 09:30:06 0.1
2008-01-14 09:30:06 0.1
2008-01-14 09:30:07 0.9
2008-01-14 09:30:07 0.2
2008-01-14 09:30:10 0.4
2008-01-14 09:30:10 0.3
2008-01-14 09:30:25 1.5
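There is no pattern in any of the column or row elements.
The data is indexed by a POSIXct object. I am creating new columns named "1second" and "3second". For the "1second" column, for each row, I want to find the next observation within the following 1 second, based on the xts time index, and record the "A" value of that row. If there is no observation within the next second, put NA in data$1second for that row.
Similarly, for the "3second" column, for each row, I want to find the leading observation within the following 3 seconds, based on the xts time index. If several rows within the following 3 seconds share the same timestamp, use only the last observation.
If there is no observation within the following 3 seconds, put NA in data$3second for that row.
For example, I expect the following result: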
B 1second 3second
2008-01-14 09:29:59 10 0.7 1.5
2008-01-14 09:29:59 0.1 0.7 1.5
2008-01-14 09:30:00 0.9 NA 1.5
2008-01-14 09:30:00 0.1 NA 1.5
2008-01-14 09:30:00 0.2 NA 1.5
2008-01-14 09:30:00 0.4 NA 1.5
2008-01-14 09:30:00 0.6 NA 1.5
2008-01-14 09:30:00 0.7 NA 1.5
2008-01-14 09:30:02 1.5 NA NA
2008-01-14 09:30:06 0.1 0.2 0.2
2008-01-14 09:30:06 0.1 0.2 0.2
2008-01-14 09:30:07 0.9 NA 0.3
2008-01-14 09:30:07 0.2 NA 0.3
2008-01-14 09:30:10 0.4 NA 0.3
2008-01-14 09:30:10 0.3 NA NA
2008-01-14 09:30:25 1.5 NA NA
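One reading of the rule above is: for a row at time t and a horizon of h seconds, take the last observation whose timestamp lies in (t, t + h], and NA if there is none. The following is a minimal standalone C++ sketch of that reading, not the asker's code. It assumes the timestamps are sorted ascending and stored as seconds in a std::vector<double>; the name lead_within is hypothetical, and NaN stands in for NA.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: for each row i, return the value of the last
// observation whose timestamp lies in (times[i], times[i] + horizon],
// or NaN (standing in for NA) if no such observation exists.
// Assumes `times` is sorted ascending and matches `values` in length.
std::vector<double> lead_within(const std::vector<double>& times,
                                const std::vector<double>& values,
                                double horizon) {
    const std::size_t n = times.size();
    std::vector<double> out(n, std::nan(""));
    for (std::size_t i = 0; i < n; ++i) {
        // First position with a timestamp strictly later than row i.
        auto first_later =
            std::upper_bound(times.begin(), times.end(), times[i]);
        // One past the last position with timestamp <= times[i] + horizon.
        auto window_end =
            std::upper_bound(times.begin(), times.end(), times[i] + horizon);
        if (window_end > first_later) {
            const std::size_t last =
                static_cast<std::size_t>(window_end - times.begin()) - 1;
            out[i] = values[last];
        }
    }
    return out;
}
```

With the sample timestamps encoded as seconds (59, 59, 60, ..., 85), a 1-second horizon gives 0.7 for the two 09:29:59 rows, 0.2 for the two 09:30:06 rows, and NaN everywhere else. The two binary searches per row keep the cost at O(n log n), compared with one full scan of the index per row.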
Here is my current code. It works, but it is slow.
TimeStmp is the POSIXct object.
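TimeHorizon <- c(1, 3)
for (j in seq_len(nrow(data))) {
  # for each horizon x, indices of the rows stamped exactly TimeStmp[j] + x
  a <- lapply(TimeHorizon, function(x) which(TimeStmp == TimeStmp[j] + x))
  for (k in seq_along(a)) {
    # only fill the cell when at least one matching row exists
    if (length(a[[k]]) > 0) {
      data[j, k + 1] <- (data$B)[last(a[[k]])]
    }
  }
}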
I'm not very happy with this code, but it may be one way to do it:
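library(data.table)  # provides shift()
# keep only the last row for each timestamp
temp1 <- test[!duplicated(test$timestamp, fromLast = TRUE), ]
# move the lookup timestamps back by 0, 1, 2 and 3 seconds in turn and merge each time
for (i in c(0, rep(1, 3))) {
  temp1$timestamp <- temp1$timestamp - i
  test <- merge(test, temp1, by = "timestamp", all.x = TRUE)
}
colnames(test) <- c("timestamp", "B", "0second", "1second", "2second", "3second")
# per row, take the last non-NA value among the columns after timestamp
test$`3second` <- test[-1][cbind(1:nrow(test), max.col(!is.na(test[-1]), "last"))]
# set NA where the next row is more than 3 seconds later, or where there is no next row
test$`3second`[shift(test$timestamp, 1, type = "lead") - test$timestamp > 3 | is.na(shift(test$timestamp, 1, type = "lead") - test$timestamp)] <- NA
test <- test[c("timestamp", "B", "1second", "3second")]
test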
# timestamp B 1second 3second
# 1 2008-01-14 09:29:59 0.1 0.7 1.5
# 2 2008-01-14 09:29:59 10.0 0.7 1.5
# 3 2008-01-14 09:30:00 0.9 NA 1.5
# 4 2008-01-14 09:30:00 0.1 NA 1.5
# 5 2008-01-14 09:30:00 0.2 NA 1.5
# 6 2008-01-14 09:30:00 0.4 NA 1.5
# 7 2008-01-14 09:30:00 0.6 NA 1.5
# 8 2008-01-14 09:30:00 0.7 NA 1.5
# 9 2008-01-14 09:30:02 1.5 NA NA
# 10 2008-01-14 09:30:06 0.1 0.2 0.2
# 11 2008-01-14 09:30:06 0.1 0.2 0.2
# 12 2008-01-14 09:30:07 0.9 NA 0.3
# 13 2008-01-14 09:30:07 0.2 NA 0.3
# 14 2008-01-14 09:30:10 0.3 NA 0.3
# 15 2008-01-14 09:30:10 0.4 NA NA
# 16 2008-01-14 09:30:25 1.5 NA NA
If you want an Rcpp solution, you can use:
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector name_me(List df, double nsec) {
  NumericVector TimeStmp = df["TimeStmp"];
  NumericVector B = df["B"];
  int n = B.size();
  int i, j, k, ndup;
  double time;
  NumericVector res(n);
  for (i = 0; i < n; i++) {
    // get last for same second
    for (ndup = 0; (i+1) < n; i++, ndup++) {
      if (TimeStmp[i+1] != TimeStmp[i]) break;
    }
    // get last value within nsec
    time = TimeStmp[i] + nsec;
    for (j = i+1; j < n; j++) {
      if (TimeStmp[j] > time) break;
    }
    // fill all previous ones with same value
    res[i] = (j == (i+1)) ? NA_REAL : B[j-1];
    for (k = 1; k <= ndup; k++) res[i-k] = res[i];
  }
  return res;
}
Note that there is an inconsistency in the 3second column of row (n-2) of your expected output.
name_me(df, 1)
name_me(df, 3)
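The scan in name_me works in three steps: advance to the last row of a run of equal timestamps, walk forward while rows still fall within nsec, then copy the value found back to every row of the run. To make those steps checkable without an R session, here is a standalone sketch of the same idea. It assumes plain std::vector inputs in place of the Rcpp List, uses NaN in place of NA_REAL, and the name lead_scan is hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical standalone variant of the scan: `times` must be sorted
// ascending and have the same length as `values`.
std::vector<double> lead_scan(const std::vector<double>& times,
                              const std::vector<double>& values,
                              double nsec) {
    const std::size_t n = times.size();
    std::vector<double> res(n, std::nan(""));
    std::size_t i = 0;
    while (i < n) {
        // Step 1: find the last row of the current run of equal timestamps.
        std::size_t last = i;
        while (last + 1 < n && times[last + 1] == times[i]) ++last;
        // Step 2: walk forward while rows are still within nsec of the run.
        const double limit = times[last] + nsec;
        std::size_t j = last + 1;
        while (j < n && times[j] <= limit) ++j;
        // j == last + 1 means no later row fell inside the window.
        const double found = (j == last + 1) ? std::nan("") : values[j - 1];
        // Step 3: every row of the run receives the same result.
        for (std::size_t k = i; k <= last; ++k) res[k] = found;
        i = last + 1;
    }
    return res;
}
```

On the sample data with nsec = 3, both 09:30:10 rows come out as NaN, whereas the question's expected table lists 0.3 for the first of them.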