Loop to sum observations greater than subject in R


I have a dataset like this:

set.seed(100)
da <- data.frame(exp = c(rep("A", 4), rep("B", 4)), diam = runif(8, 10, 30))

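However, my actual dataset is much larger (>40,000 rows and >100 levels of exp), so the loop is very slow. I am hoping some function can make the calculation simpler.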

If you don't need to keep the original row order in the result, you can do it efficiently like this:

library(data.table)
setorder(setDT(da), exp, -diam)
da[, d2 := cumsum(diam) - diam, by = exp]

da
#   exp     diam       d2
#1:   A 21.04645  0.00000
#2:   A 16.15532 21.04645
#3:   A 15.15345 37.20177
#4:   A 11.12766 52.35522
#5:   B 26.24805  0.00000
#6:   B 19.67541 26.24805
#7:   B 19.37099 45.92347
#8:   B 17.40641 65.29445
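
To see why cumsum(diam) - diam works here, a small base R sketch may help. Once the values are sorted in decreasing order, the running total up to a row, minus the row's own value, is the sum of all larger values before it. The vector x below is made up for illustration and is not from the question.

```r
# Hypothetical values, already sorted in decreasing order
x <- c(5, 3, 2)

# Running total minus the value itself = sum of the strictly larger values
res <- cumsum(x) - x
print(res)
# 0 5 8
```

Nothing is larger than 5, only 5 is larger than 3, and 5 + 3 is larger than 2. Note that this assumes there are no tied values; with ties, equal values sorted earlier would be included in the total.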
With dplyr, this would be:

library(dplyr)
da %>%
  arrange(exp, desc(diam)) %>%
  group_by(exp) %>%
  mutate(d2 = cumsum(diam) - diam)
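
If the original row order does matter, one possible base R sketch (not taken from the answers here) is to compute the same quantity per group with ave(), which returns values in the positions of the input rows. The helper name sum_larger is made up for illustration, and the sketch assumes no tied diam values within a group, which holds for the continuous runif() values in this example.

```r
set.seed(100)
da <- data.frame(exp = c(rep("A", 4), rep("B", 4)), diam = runif(8, 10, 30))

# Hypothetical helper: for each element of x, the sum of all larger elements.
# It sorts in decreasing order, applies the cumsum trick, then writes each
# result back to the position the element had before sorting.
sum_larger <- function(x) {
  ord <- order(x, decreasing = TRUE)
  out <- numeric(length(x))
  out[ord] <- cumsum(x[ord]) - x[ord]
  out
}

# ave() applies the helper within each level of exp and keeps the row order
da$d2 <- ave(da$diam, da$exp, FUN = sum_larger)
print(da)
```

The sorting cost is paid once per group rather than once per row, so this should scale much better than a row-by-row loop, although it has not been benchmarked against the data.table or dplyr versions.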

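Very nice solution. You could probably skip ordering by exp, but then the output would not look as nice. Or you could use keyby=exp instead.

@DavidArenburg, thanks, I considered that but then left it as it is; I don't think it makes a big difference.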