Is there a simpler and faster way than looping within a for loop? (R)


I have the following dataset, which consists of about 64,000 rows:

    Trial.time Recording.time X.center Y.center  Area    Areachange Elongation   Distance.moved Movement.Moving...Center.point.
2      300.030          0.000 -49.1651  31.9676 0.917085    0.65113   0.851349              -                               -
22     300.696          0.666 -48.4404  31.9945 0.816206   0.715326   0.831207       0.725139                               1
24     300.763          0.733  -47.996  32.0696 0.834547   0.412688   0.856234       0.450784                               1
33     301.063          1.033 -47.6583  32.0598  0.75201   0.137563   0.716028       0.337775                               1
41     301.330          1.299 -47.3385  32.0139 0.843718   0.302638   0.838526       0.323117                               1
98     303.230          3.199 -47.3914  31.6981 0.944598    1.26558   0.847969        0.32022                               1
113    303.730          3.699 -47.3807  31.0614  0.86206    1.24724   0.761099       0.636771                               1
114    303.763          3.733 -47.1308  30.3858  1.00879     1.1005   0.809162        0.72036                               1
116    303.830          3.799 -47.1914  30.0551  1.01796   0.440201   0.831924       0.336155                               1

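In general, it describes the movement of an object (Distance.moved) at particular points in time. If the difference in Recording.time between two consecutive rows is at most 0.035, both rows belong to the same movement. If it is larger, the two time points belong to two separate movements. My task is to determine the length of each movement, that is, how many consecutive rows make up one movement and the total distance covered within it. I wrote the code below; it works, but it is very slow, so I would like to ask whether you know how to speed it up.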

time <- c()
j.final <- c()

#Go through all rows of the data.frame
for(i in 1:length(data2[,1])){
  i <- 1
  j <- 1
  if (!is.na(data2$Recording.time[i+1])){

    # As long as the distance between two consecutive time points is smaller than 0.035, increase the counter by one
    while (data2$Recording.time[i+1]-data2$Recording.time[i] <= 0.035){
      j <- j+1
      i <- i+1
    }
    # Save the number of consecutive time points
    j.final <- rbind(j.final,j)
    # Save the time of the last movement frame 
    time <- rbind(time,data2$Recording.time[j])
    # Delete the amount of rows that gave one single movement 
    data2 <- data2[-(1:j),]
  }
}   
final <- cbind(j.final,time)

# Same as above: continuously delete rows from the data.frame
distance <- c()  # initialize the accumulator before the loop appends to it
data2 <- data1
for (i in 1:length(j.final)){
  Dtotal <- sum(data2$Distance.moved[1:j.final[i]])
  distance <- rbind(distance, Dtotal)
  data2 <- data2[-(1:j.final[i]),]
}
final <- cbind(final,distance)
dimnames(final) <- list(NULL,c("Frames","Time","Distance"))
epicfinal <- as.data.frame(final)

As zx8754 pointed out, this is easily done with the lag function (or better, its fast implementation in data.table: shift) and the cumsum function.
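The idea of combining a lag with a running sum can be shown in base R before any package is involved. The following is only a minimal sketch: the vector rec_time holds made-up values for illustration (it is not the asker's data), and diff() stands in for the lag computation.

```r
# Toy recording times (hypothetical values, not taken from the question's data)
rec_time <- c(0.000, 0.033, 0.666, 0.699, 0.733, 1.033)

# Gap to the previous observation; the first row has no predecessor, so use 0
gap <- c(0, diff(rec_time))

# A new movement starts wherever the gap exceeds the 0.035 threshold
new_movement <- gap > 0.035

# The running sum of the start flags gives every row a movement index
movement_index <- cumsum(new_movement)

# Number of rows (frames) in each movement
frames <- as.vector(table(movement_index))
```

Because every step operates on whole vectors, no row is visited more than once and the data never has to be shrunk inside a loop.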

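I use the data.table package for speed (note that the syntax differs considerably between classic data.frames and data.table: when subsetting the table you can put an expression in the j argument, rather than simply selecting columns as with data.frames).

library(data.table)
## VARIABLE CREATION:
# create a column that indicates the lag between two observations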
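data$lag ...

Check out the lead, lag, and cumsum functions.

Thank you very much for your detailed reply. However, to solve my problem you have to use shift instead of lag (data$lag ...).

@Urumpel You are quite right: shift is indeed a faster implementation of lead or lag. So even though using it is not mandatory, I see only advantages to using it here, since I am already using data.table. I have edited my answer based on your suggestion.

@hellter I have another question. Is there a simple way to include the end time point of each movement (Recording.time) in the final table?

Yes. Once movement_index has been created, you can simply group the data with by = movement_index and keep the last row (.N) of the desired column. To do this, you need to use .SD in the j argument. For Recording.time, it should look like this: data[, .SD[.N, .(last_Recording_time = Recording.time)], movement_index]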
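To make the .SD[.N] suggestion concrete, here is a small sketch on toy data. The values are invented for illustration, and movement_index is assumed to have been computed already.

```r
library(data.table)

# Toy table (hypothetical values); movement_index is assumed to exist already
data <- data.table(
  Recording.time = c(0.000, 0.033, 0.666, 0.699, 0.733, 1.033),
  movement_index = c(0L, 0L, 1L, 1L, 1L, 2L)
)

# For each movement, take the last row (.N) of the group's subset (.SD)
# and return its Recording.time under a new column name
last_times <- data[, .SD[.N, .(last_Recording_time = Recording.time)],
                   by = movement_index]
```

The same result should also be obtainable without .SD, for example with data[, .(last_Recording_time = Recording.time[.N]), by = movement_index], which may be cheaper when there are many groups.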
  Frames  Time  Distance    velocity
1      1 0.033 0.0407652 0.001386017
2     18 0.666 1.4887506 0.911115367
3      3 0.799 0.0912680 0.009309336
4      7 1.066 0.3703880 0.088152344
5      2 1.166 0.0371303 0.002524860
6      3 1.299 0.1013617 0.010338893 

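## VARIABLE CREATION:
# lag to the previous observation (0 for the first row, which has no predecessor)
data[,lag:=Recording.time-shift(Recording.time)][1,lag:=0L]
# a row starts a new movement when the lag exceeds the 0.035 threshold
data[,newmovement:=lag>0.035]
# running count of movement starts = index of the movement each row belongs to
data[,movement_index:=cumsum(newmovement)]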
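With a movement index in place, the per-movement summary the question asks for (number of frames and total distance) becomes a single grouped aggregation. The sketch below runs the whole pipeline on toy data. The values are invented, the output column names Frames and Distance are borrowed from the question's final table, and the comparison lag > 0.035 follows the rule stated in the question.

```r
library(data.table)

# Toy data (hypothetical values, not the asker's dataset)
data <- data.table(
  Recording.time = c(0.000, 0.033, 0.666, 0.699, 0.733, 1.033),
  Distance.moved = c(0.10, 0.20, 0.70, 0.40, 0.30, 0.50)
)

# Gap to the previous row; the first row has no predecessor, so set it to 0
data[, lag := Recording.time - shift(Recording.time)][1, lag := 0]

# Flag rows that start a new movement (gap above the 0.035 threshold)
data[, newmovement := lag > 0.035]

# Running sum of the flags: the movement each row belongs to
data[, movement_index := cumsum(newmovement)]

# One row per movement: frame count and summed distance
result <- data[, .(Frames = .N, Distance = sum(Distance.moved)),
               by = movement_index]
```

In the question's real data the first Distance.moved entry is "-", so that column would presumably need converting to numeric before sum() can be applied.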