How to avoid a for loop with multiple conditions inside which()


I have a data set spanning 25 years that looks similar to the following:

        date name        value tag
1 2014-12-01    f -0.338578654  12
2 2014-12-01    a  0.323379254   4
3 2014-12-01    f  0.004163806   9
4 2014-12-01    f  1.365219477   2
5 2014-12-01    l -1.225602543   7
6 2014-12-01    d -0.308544089   9
This is how to reproduce it:

set.seed(9)
# 50 rows per month, from January 1990 to January 2015
date <- rep(seq(as.Date("1990-01-01"), as.Date("2015-01-01"), by="months"), each=50)
N <- length(date)
name <- sample(letters, N, replace=TRUE)
value <- rnorm(N)
tag <- sample(c(1:50), N, replace=TRUE)
mydata <- data.frame(date, name, value, tag)
head(mydata)

You can use dcast from the reshape2 package, with a custom aggregation function that sums the values:

library(reshape2)
dcast(mydata, name~tag, value.var='value', fun.aggregate=sum)
Or simply xtabs from base R:

xtabs(value~name+tag, mydata)
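To make the behaviour of xtabs concrete, here is a minimal sketch on a tiny, made-up data frame (the `toy` object and its values are hypothetical, not part of the question's data). It also shows base R's tapply as another option; that comparison is an addition here and assumes you only need the cell sums. The one visible difference is that xtabs fills empty name/tag combinations with 0, while tapply leaves them as NA.

```r
# Hypothetical toy data, small enough to check the sums by hand
toy <- data.frame(
  name  = c("a", "a", "b", "b", "a"),
  tag   = c(1, 2, 1, 1, 2),
  value = c(10, 20, 30, 40, 50)
)

# With a left-hand side, xtabs sums `value` within each name/tag cell
tab <- xtabs(value ~ name + tag, data = toy)

# tapply computes the same sums but leaves empty cells as NA
tap <- tapply(toy$value, list(toy$name, toy$tag), sum)

print(tab)
print(tap)
```

Here name "b" never occurs with tag 2, so that cell is 0 in `tab` and NA in `tap`.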
Some benchmarks:

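# The nested-loop approach: for every tag/name pair, find the matching rows with which() and sum their values
funcPer = function(){
    S <- matrix(data=NA, nrow=length(unique(mydata$tag)), ncol=length(unique(mydata$name)))
    for(i in 1:nrow(S)){
      for (j in 1:ncol(S)){
        foo <- which(mydata$tag == unique(mydata$tag)[i] & mydata$name == unique(mydata$name)[j])
        S[i,j] <- sum(mydata$value[foo])
      }
    }
    S  # return the filled matrix; without this line the function returns NULL
}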

colonel1 = function() dcast(mydata, name~tag, value.var='value', fun.aggregate=sum)

colonel2 = function() xtabs(value~name+tag, mydata)

#> system.time(colonel1())
#  user  system elapsed 
#   0.01    0.00    0.01 
#> system.time(colonel2())
#   user  system elapsed 
#   0.05    0.00    0.05 
#> system.time(funcPer())
#   user  system elapsed 
#   4.67    0.00    4.82 
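The timings only matter if the approaches produce the same sums. The following is a rough sketch of how one might check that on a small simulated data set; the object names (`small`, `loop_sum`, `agree`) and the data sizes are assumptions made for illustration. It sorts the unique tags and names first, because xtabs orders its rows and columns by sorted factor levels, whereas unique() returns values in order of first appearance.

```r
set.seed(1)
# Hypothetical small data set with the same columns as mydata
small <- data.frame(
  name  = sample(letters[1:3], 200, replace = TRUE),
  tag   = sample(1:4, 200, replace = TRUE),
  value = rnorm(200)
)

# Sort the keys so the loop fills cells in the same order xtabs uses
tags <- sort(unique(small$tag))
nms  <- sort(unique(small$name))

# Nested loop: one logical mask per tag/name cell
loop_sum <- matrix(NA_real_, nrow = length(tags), ncol = length(nms))
for (i in seq_along(tags)) {
  for (j in seq_along(nms)) {
    hit <- small$tag == tags[i] & small$name == nms[j]
    loop_sum[i, j] <- sum(small$value[hit])
  }
}

# Same table in one call; strip the table attributes before comparing
tab     <- xtabs(value ~ tag + name, data = small)
tab_mat <- matrix(as.vector(tab), nrow = nrow(tab))

# all.equal tolerates tiny floating-point differences in summation order
agree <- isTRUE(all.equal(loop_sum, tab_mat))
print(agree)
```

The loop does one full scan of the data per cell, while xtabs groups all rows in a single pass, which is consistent with the gap seen in the timings above.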