R: Rolling regression over multiple columns

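I am having trouble finding the most efficient way to compute rolling linear regressions over an xts object with multiple columns. I searched Stack Overflow and read several earlier questions.

They come close, but in my view not close enough, because I want to compute multiple regressions with the dependent variable unchanged across all of them. I tried to reproduce an example with random data: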

require(xts)
require(RcppArmadillo)  # Load libraries

data <- matrix(sample(1:10000, 1500), 1500, 5, byrow = TRUE)  # Random data
data[1000:1500, 2] <- NA  # insert NAs to make it more similar to true data
data <- xts(data, order.by = as.Date(1:1500, origin = "2000-01-01"))

NR <- nrow(data)  # number of observations
NC <- ncol(data)  # number of factors
obs <- 30  # required number of observations for rolling regression analysis
info.names <- c("res", "coef")

info <- array(NA, dim = c(NR, length(info.names), NC))
colnames(info) <- info.names

If you go down to the math of linear regression, this should be quite fast. If X is the independent variable and Y is the dependent variable, the coefficients are given by:
Beta = inv(t(X) %*% X) %*% (t(X) %*% Y)
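
The closed-form expression above can be checked against lm() on a single window. The snippet below is only a minimal sketch on simulated data: the seed, the window size of 30, the two regressors and the "true" coefficients are assumptions chosen for illustration, and solve() is used on the normal equations instead of forming the inverse explicitly.

```r
# Minimal sketch: OLS coefficients from the normal equations (base R only).
set.seed(1)                                   # arbitrary seed, for reproducibility
n <- 30                                       # one window of 30 observations
X <- cbind(1, matrix(rnorm(n * 2), n, 2))     # intercept column plus two regressors
y <- drop(X %*% c(0.5, 2, -1) + rnorm(n))     # simulated response (assumed model)

# Solve (X'X) beta = X'y directly instead of computing inv(X'X)
beta <- drop(solve(crossprod(X), crossprod(X, y)))

# Reference fit from lm(); "- 1" because X already contains the intercept column
beta_lm <- unname(coef(lm(y ~ X - 1)))

# Compare the two coefficient vectors up to numerical tolerance
all.equal(beta, beta_lm)
```

In a rolling setting the same two cross-products would be recomputed (or updated) for each window, which is why this route tends to be cheaper than calling lm() repeatedly.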

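I am a bit confused about which variable you want as the dependent one and which as the independent one, but hopefully solving a similar problem below will help you as well.
In the example below I use 1000 variables instead of the original 5, and I do not introduce any NAs.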
require(xts)

data <- matrix(sample(1:10000, 1500000, replace=T), 1500, 1000, byrow = TRUE)  # Random data
data <- xts(data, order.by = as.Date(1:1500, origin = "2000-01-01"))

NR <- nrow(data)  # number of observations
NC <- ncol(data)  # number of factors
obs <- 30  # required number of observations for rolling regression analysis

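Here is a faster way using the rollRegres package:
library(xts)
library(RcppArmadillo)
#####
# simulate data
set.seed(50554709)

You can make it about 2x faster by not running the regression twice... I've edited that into your question.

Sure! It is already late in Europe. Thanks, Joshua. That change improved performance by a factor of 2 to 2.5. But do you think this code performs well enough for 2500 daily observations and 1000 factors? Or are you aware of any performance gain from using rollapply compared with the approach above? I would guess that once the data set gets very large, you have to apply a recursive least squares filter or something related. Any thoughts on that?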
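
For the rollapply comparison raised in the comment above, a baseline might look roughly like the following. This is a sketch under assumptions, not the code from the thread: the toy sizes, the column names y, x1, x2 and the helper fit_window are all made up for illustration, and zoo::rollapply is used with by.column = FALSE so that each call sees the whole window.

```r
library(zoo)  # xts builds on zoo, so it is normally installed alongside xts

set.seed(42)                                   # arbitrary seed
n <- 200; width <- 30                          # toy sizes, not the 2500 x 1000 case
dat <- zoo(cbind(y = rnorm(n), x1 = rnorm(n), x2 = rnorm(n)),
           order.by = as.Date(1:n, origin = "2000-01-01"))

# Hypothetical helper: fit one window; column 1 is the response
fit_window <- function(w) {
  w <- as.matrix(w)                            # works for zoo or plain matrix input
  X <- cbind(intercept = 1, w[, -1, drop = FALSE])
  drop(solve(crossprod(X), crossprod(X, w[, 1])))
}

# by.column = FALSE passes all columns of each window to fit_window
roll_coefs <- rollapply(dat, width = width, FUN = fit_window,
                        by.column = FALSE, align = "right")

# Reference: lm() on the last window only
last_win <- as.data.frame(coredata(dat)[(n - width + 1):n, ])
ref <- unname(coef(lm(y ~ x1 + x2, data = last_win)))
```

Timing this against a running-sum or update-based approach on realistic sizes (for example with system.time) would be the way to answer the performance question; the sketch itself makes no claim about which is faster.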
require(xts)

data <- matrix(sample(1:10000, 1500000, replace=T), 1500, 1000, byrow = TRUE)  # Random data
data <- xts(data, order.by = as.Date(1:1500, origin = "2000-01-01"))

NR <- nrow(data)  # number of observations
NC <- ncol(data)  # number of factors
obs <- 30  # required number of observations for rolling regression analysis

library(TTR)

loop.begin.time <- Sys.time()

in.dep.var <- data[,1]
xx <- TTR::runSum(in.dep.var*in.dep.var, obs)
coeffs <- do.call(cbind, lapply(data, function(z) {
    xy <- TTR::runSum(z * in.dep.var, obs)
    xy/xx
}))

loop.end.time <- Sys.time()

print(loop.end.time - loop.begin.time)  # prints the loop runtime

res.array = array(NA, dim=c(NR, NC, obs))  # NR x NC slices, matching the NR x NC matrix assigned below
for(z in seq(obs)) {
  res.array[,,z] = coredata(data - lag.xts(coeffs, z-1) * as.numeric(in.dep.var))
}
res.sd <- apply(res.array, c(1,2), function(z) z / sd(z))

# are you sure you want the residual of the first and not the last
# observation in each window?
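
The running-sum trick used above (rolling sum of x*y divided by rolling sum of x*x) gives the no-intercept slope of each column on the common regressor. A small base R check of that identity is sketched below; roll_sum is a hypothetical cumsum-based stand-in for TTR::runSum written for this example, and the simulated data and sizes are assumptions.

```r
# Hypothetical helper: trailing rolling sum via cumulative sums
roll_sum <- function(v, width) {
  cs <- cumsum(v)
  out <- rep(NA_real_, length(v))              # first width - 1 entries stay NA
  idx <- width:length(v)
  out[idx] <- cs[idx] - c(0, cs[seq_len(length(v) - width)])
  out
}

set.seed(7)                                    # arbitrary seed
n <- 100; width <- 30
x <- rnorm(n)                                  # common regressor (data[, 1] in the code above)
Y <- matrix(rnorm(n * 4), n, 4)                # four other columns

sxx <- roll_sum(x * x, width)
# No-intercept slope per window and column: sum(x * y) / sum(x * x)
slopes <- apply(Y, 2, function(y) roll_sum(x * y, width) / sxx)

# Check the last window of column 2 against lm() without an intercept
win <- (n - width + 1):n
ref <- unname(coef(lm(Y[win, 2] ~ x[win] - 1)))
all.equal(slopes[n, 2], ref)
```

Because there is no intercept term, these slopes generally differ from what lm(y ~ x) would return; adding an intercept would need rolling sums of x and y as well.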