R: Filter/subset a data frame based on a change threshold


I have the following data frame, which contains many rows of angle changes in degrees:

'data.frame':   712801 obs. of  4 variables:
 $ time_passed: int  1 2 3 4 5 6 7 8 9 10 ...
 $ dRoll      : num  0.9798 -0.5099 -0.0974 -0.4985 0.1719 ...
 $ dPitch     : num  -0.175 -0.0655 0.0653 0.8907 -1.0893 ...
 $ dYaw       : num  0.33232 0.06875 -0.00573 0.59588 -0.55577 ...

> myData[1:20,]
time_passed       dRoll       dPitch      dYaw
       1          0.97975783 -0.17498131  0.332315521
       2         -0.50993244 -0.06548908  0.068754935
       3         -0.09740283  0.06531719 -0.005729578
       4         -0.49847328  0.89072019  0.595876107
       5          0.17188734 -1.08930736 -0.555769061
       6          0.68181978  0.36852645  0.492743704
       7          1.07143108  0.15206300 -0.635983153
       8         -1.43812407 -0.76638835 -0.509932438
       9          0.43544792  0.41241502  0.767763445
      10          0.25210143  0.61375239  0.509932438
      11          0.38961130  0.01203211 -0.360963411
      12          0.03437747 -0.29633377 -0.315126787
      13         -0.33804510 -0.40639896 -0.177616916
      14          0.68181978  0.32446600  0.435447924
      15         -1.12872686 -0.37752189 -0.275019742
      16          0.75057471  0.33907642  0.464095814
      17         -0.25783101  0.11310187  0.309397209
      18         -0.01718873 -0.13435860 -0.521391594
      19          0.12605071  0.12817066 -0.085943669
      20          0.02291831 -0.59856901 -0.120321137
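How can I write something like

"if the sum of consecutive negative (or positive) values is smaller than my threshold (e.g. a 5° change), then remove them from the dataset"

in R code?

I want to be able to apply this criterion to any of the columns dRoll, dPitch and dYaw.

In this case, applied to the dRoll column, the output would be: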

time_passed       dRoll       dPitch      dYaw
       1          0.97975783 -0.17498131  0.332315521
       5          0.17188734 -1.08930736 -0.555769061
       6          0.68181978  0.36852645  0.492743704
       7          1.07143108  0.15206300 -0.635983153
       9          0.43544792  0.41241502  0.767763445
      10          0.25210143  0.61375239  0.509932438
      11          0.38961130  0.01203211 -0.360963411
      12          0.03437747 -0.29633377 -0.315126787
      14          0.68181978  0.32446600  0.435447924
      16          0.75057471  0.33907642  0.464095814
      19          0.12605071  0.12817066 -0.085943669
      20          0.02291831 -0.59856901 -0.120321137
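All negative runs in dRoll are dropped, because the sum of the consecutive negative values in each run is smaller than 5 degrees in magnitude:

  • First negative run in dRoll: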
    sum(myData[2:4,2])
    =
    -1.105809
  • The second, third and fourth negative runs are each a single value:
    -1.43812
    -0.33804
    -1.12872
  • Last negative run in dRoll:
    sum(myData[17:18,2])
    =
    -0.2750197
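
The run totals listed above can be reproduced with base R's rle(). This is only a sketch on the first 20 dRoll values shown earlier; the names run_id and run_sum are introduced here for illustration and are not part of the original data.

```r
# First 20 dRoll values, copied from the printed data frame
dRoll <- c(0.97975783, -0.50993244, -0.09740283, -0.49847328, 0.17188734,
           0.68181978, 1.07143108, -1.43812407, 0.43544792, 0.25210143,
           0.38961130, 0.03437747, -0.33804510, 0.68181978, -1.12872686,
           0.75057471, -0.25783101, -0.01718873, 0.12605071, 0.02291831)

# rle() finds runs of consecutive TRUE/FALSE values of 'dRoll > 0'
r <- rle(dRoll > 0)

# Expand the run lengths into one run identifier per row (illustrative name)
run_id <- rep(seq_along(r$lengths), r$lengths)

# Sum the values within each run (illustrative name)
run_sum <- tapply(dRoll, run_id, sum)

# Totals of the negative runs only
run_sum[run_sum < 0]
```

Note that a value of exactly 0 would be grouped with the negative values here, since the split is on dRoll > 0.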

How can this be done in R?

My suggestion is to first melt the data frame into long format. After that, grouped operations are easier to perform.

Using the data.table package (we need its melt and rleid functions):

# load the package
library(data.table)

# convert the data frame from the question to a data.table
DT <- as.data.table(myData)

# melt into long format
DT2 <- melt(DT, id.vars = 'time_passed')

# compute the total of each run
# 'rleid(value > 0)' creates a grouping variable for runs of consecutive positive/negative values
# adding '[.N]' to 'cumsum(value)' takes the last cumulative value, i.e. the run total,
# and assigns it to every row of that run in 'csum', which we can use to filter the data
DT2[, csum := cumsum(value)[.N], by = .(variable, rleid(value > 0))]

# filter the data according to a rule
# in this case the runs whose total lies between -1.2 and -0.2 are filtered out
DT2[csum < -1.2 | csum > -0.2]
    time_passed variable        value         csum
 1:           1    dRoll  0.979757830  0.979757830
 2:           5    dRoll  0.171887340  1.925138200
 3:           6    dRoll  0.681819780  1.925138200
 4:           7    dRoll  1.071431080  1.925138200
 5:           8    dRoll -1.438124070 -1.438124070
 6:           9    dRoll  0.435447920  1.111538120
....
....
14:           3   dPitch  0.065317190  0.956037380
15:           4   dPitch  0.890720190  0.956037380
16:           6   dPitch  0.368526450  0.520589450
17:           7   dPitch  0.152063000  0.520589450
18:           9   dPitch  0.412415020  1.038199520
19:          10   dPitch  0.613752390  1.038199520
....
....
26:           1     dYaw  0.332315521  0.401070456
27:           2     dYaw  0.068754935  0.401070456
28:           3     dYaw -0.005729578 -0.005729578
29:           4     dYaw  0.595876107  0.595876107
30:           6     dYaw  0.492743704  0.492743704
31:           9     dYaw  0.767763445  1.277695883

Can you post your desired output? You have just filtered out the rows with negative values in dRoll. Maybe you could elaborate, for example with a step-by-step calculation?

@M.D, I tried to do that; I hope it is clearer now what I want to do. The point is that if the sum of one of the negative runs exceeds my threshold, it has to stay in the data frame.

Thanks, that's what I need!
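
For completeness, here is a base R sketch of the rule as the question describes it for a single column: keep every positive run, and keep a negative run only if its total reaches the threshold in magnitude. This is one reading of the question, not the answer's method. The function name drop_small_negative_runs and its arguments are hypothetical, and the example data is just the first 20 dRoll values from the question.

```r
# Hypothetical helper: drop rows that belong to negative runs of column 'col'
# whose total is smaller in magnitude than 'threshold'.
drop_small_negative_runs <- function(df, col, threshold = 5) {
  x <- df[[col]]
  # Runs of consecutive positive / non-positive values
  r <- rle(x > 0)
  run_id <- rep(seq_along(r$lengths), r$lengths)
  # Total of each run, repeated for every row of that run
  run_sum <- ave(x, run_id, FUN = sum)
  # Assumption: positive runs are always kept, as in the desired output
  keep <- run_sum > 0 | abs(run_sum) >= threshold
  df[keep, , drop = FALSE]
}

# First 20 rows of the question's data (dRoll only, for brevity)
myData <- data.frame(
  time_passed = 1:20,
  dRoll = c(0.97975783, -0.50993244, -0.09740283, -0.49847328, 0.17188734,
            0.68181978, 1.07143108, -1.43812407, 0.43544792, 0.25210143,
            0.38961130, 0.03437747, -0.33804510, 0.68181978, -1.12872686,
            0.75057471, -0.25783101, -0.01718873, 0.12605071, 0.02291831)
)

result <- drop_small_negative_runs(myData, "dRoll", threshold = 5)
result$time_passed
```

On these 20 rows no negative run reaches 5 degrees, so only the rows from positive runs remain, which matches the desired output in the question. To filter on another column, pass "dPitch" or "dYaw" as col, provided that column is present in the data frame.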