How to vectorize a for loop in R

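I'm trying to clean up this code and was wondering whether anyone has suggestions for running it in R without loops. I have a dataset called data with 100 variables and 200,000 observations. What I want to do is expand the dataset by multiplying each observation by a specific scalar and then combining the results. In the end, I need a dataset with 800,000 observations (I have four categories to create) and 101 variables. Here is a loop I wrote that does this, but it is very inefficient, and I'd like something faster and more efficient.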

datanew <- c()
for (i in 1:51){
  for (k in 1:6){
    for (m in 1:4){

      sub <- subset(data,data$var1==i & data$var2==k)

      sub[,4:(ncol(sub)-1)] <- filingstat0711[i,k,m]*sub[,4:(ncol(sub)-1)]

      sub$newvar <- m

      datanew <- rbind(datanew,sub)

    }
  }
}

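You can try the following. Note that we replace the first two for loops with a call to `mapply` and the third for loop with a call to `lapply`. We also create two vectors that we combine for vectorized multiplication.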
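As a toy illustration of how those two vectors combine, here is a minimal sketch. The data frame `toy` and the scalar `s` are made up for this example (`s` stands in for a single `filingstat0711[i, k, m]` value), and the 3/rest/1 column split is an assumption taken from the `4:(ncol(sub)-1)` range in the question's loop.

```r
# Toy data frame (hypothetical): 3 leading columns, 2 columns to scale, 1 trailing column
toy <- data.frame(var1 = c(1, 1), var2 = c(2, 2), id = c(10, 11),
                  x = c(1, 2), y = c(3, 4), last = c(5, 6))

# 1 marks a column to scale (columns 4..ncol-1), 0 marks a column to leave alone
multpVec <- c(rep(c(0, 1), times = c(3, ncol(toy) - 3 - 1)), 0)
invVec   <- !multpVec            # TRUE (treated as 1) exactly where multpVec is 0

s <- 2.5                         # stand-in for one filingstat0711[i, k, m] value
colFactor <- multpVec * s + invVec   # s for scaled columns, 1 for the others

# t(toy) has one row per original column, so recycling colFactor down each
# column of t(toy) multiplies every original column by its own factor
scaled <- t(t(toy) * colFactor)
```

Note that `scaled` is a numeric matrix rather than a data frame, because `t()` converts its argument to a matrix. The full code applies the same idea to each (i, k, m) combination.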

# create a table of the i-k index combinations using `expand.grid`
ixk <- expand.grid(i=1:51, k=1:6)

    # Take a look at what expand.grid does
    head(ixk, 60)


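# create two vectors for multiplying against our dataframe subset
# (columns 1-3 and the last column are left alone; columns 4..ncol-1 are scaled,
#  matching the 4:(ncol(sub)-1) range in the question's loop)
multpVec <- c(rep(c(0, 1), times=c(3, ncol(mydf)-3-1)), 0)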
invVec   <- !multpVec

    # example of how we will use the vectors
    (multpVec * filingstat0711[1, 2, 1] + invVec)


# Instead of for loops, we can use mapply. 
newdf <- 
  mapply(function(i, k) 

    # The function that you are `mapply`ing is:
    # rbind'ing a list of dataframes, which were subsetted by matching var1 & var2
    # and then multiplying by a value in filingstat
    do.call(rbind, 
        # iterating over m
        lapply(1:4, function(m)

          # the cbind is for adding the newvar=m, at the end of the subtable
          cbind(

            # we transpose twice: first the subset to multiply our vector. 
            # Then the result, to get back our original form
            t( t(subset(mydf, var1==i & var2==k)) * 
              (multpVec * filingstat0711[i,k,m] + invVec)), 

          # this is an argument to cbind
          "newvar"=m) 
    )), 

    # the two lists you are passing as arguments are the columns of the expanded grid
    ixk$i, ixk$k, SIMPLIFY=FALSE
  )

# flatten the data frame
newdf <- do.call(rbind, newdf)
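For comparison, the lookups can also be done without `mapply` or `lapply` by indexing the 3-D array with a matrix. This is only a sketch of a different technique, not the approach above: `toy` and `fs` are hypothetical stand-ins for `data` and `filingstat0711`, and it assumes every row has valid `var1` and `var2` indices into the array.

```r
set.seed(1)

# Hypothetical stand-ins for `data` and `filingstat0711` from the question
nI <- 3; nK <- 2; nM <- 4
fs  <- array(runif(nI * nK * nM), dim = c(nI, nK, nM))
toy <- data.frame(var1 = sample(nI, 10, replace = TRUE),
                  var2 = sample(nK, 10, replace = TRUE),
                  id   = 1:10,
                  x    = rnorm(10),
                  y    = rnorm(10),
                  last = rnorm(10))

cols <- 4:(ncol(toy) - 1)                   # same column range as the question's loop
m    <- rep(seq_len(nM), each = nrow(toy))  # category for each stacked row

# Stack nM copies of the data, one per category
out <- toy[rep(seq_len(nrow(toy)), times = nM), ]
rownames(out) <- NULL

# A three-column index matrix picks one array element per row: fs[var1, var2, m]
fac <- fs[cbind(out$var1, out$var2, m)]

# Multiplying a data frame by a vector of length nrow() scales each row
out[cols] <- out[cols] * fac
out$newvar <- m
```

The rows come out ordered by `m` and then by original row order, not by `var1` and `var2` as in the loop, so sort afterwards if the order matters.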
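Please help us help you: (1) post some sample data, and (2) explain in words what you are trying to accomplish here. Also note that you don't need to reference the data.frame you are subsetting inside `subset`.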
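A small sketch of that last point, using a made-up data frame `toy`: inside `subset`, column names are looked up in the data frame first, so the `toy$` prefix is redundant.

```r
# Hypothetical toy data
toy <- data.frame(var1 = c(1, 1, 2), var2 = c(1, 2, 2), x = c(10, 20, 30))

# Column names are resolved inside `toy`, so no prefix is needed
a <- subset(toy, var1 == 1 & var2 == 2)

# Same rows with explicit references to the data frame
b <- subset(toy, toy$var1 == 1 & toy$var2 == 2)
```

One caveat: a variable such as `i` or `k` used in the condition is taken from the calling environment only if the data frame has no column of that name.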