R data.table: apply by row / shift processing (retain or dynamic shift processing)
I want to process the data row by row. Suppose we have collected "mpg" values for two "cyl" groups over 4 days. I want to derive the minimum mpg value relative to the day, that is, the lowest mpg seen so far within each cyl.

Raw data **day, cyl, mpg**
- 1,4,34.4
- 2,4,21.3
- 3,4,23.3
- 4,4,25.0
- 1,3,23.0
- 2,3,27.0
- 3,3,18.3
- 4,3,17.3

Expected output **day, cyl, mpg, min_mpg**
- 1,4,34.4,34.4
- 2,4,21.3,21.3
- 3,4,23.3,21.3
- 4,4,25.0,21.3
- 1,3,23.0,23.0
- 2,3,27.0,23.0
- 3,3,18.3,18.3
- 4,3,17.3,17.3
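Is there a "vectorized" option available, or is a traditional for loop the only choice? I would prefer to use base R (data frame or data.table).

We can use cummin, which returns the cumulative (running) minimum of a vector, applied within each cyl group: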
library(dplyr)
df1 %>%
group_by(cyl) %>%
mutate(min_mpg = cummin(mpg))
# A tibble: 8 x 4
# Groups: cyl [2]
# day cyl mpg min_mpg
# <int> <int> <dbl> <dbl>
#1 1 4 34.4 34.4
#2 2 4 21.3 21.3
#3 3 4 23.3 21.3
#4 4 4 25 21.3
#5 1 3 23 23
#6 2 3 27 23
#7 3 3 18.3 18.3
#8 4 3 17.3 17.3
Or use data.table:
library(data.table)
setDT(df1)[, min_mpg := cummin(mpg), by = cyl][]
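For intuition about what cummin computes inside one group, here is a minimal sketch comparing it with the explicit for loop the question mentions. The vector holds the cyl == 4 values from the question; the names running_min and current are made up for illustration.

```r
# mpg values for cyl == 4, in day order (taken from the question)
mpg <- c(34.4, 21.3, 23.3, 25.0)

# Explicit loop: carry forward the smallest value seen so far
running_min <- numeric(length(mpg))
current <- Inf
for (i in seq_along(mpg)) {
  current <- min(current, mpg[i])
  running_min[i] <- current
}

# Vectorized base R equivalent
vectorized_min <- cummin(mpg)

# Compare the two results
all.equal(running_min, vectorized_min)
```

Both approaches walk the vector once, but cummin does so in compiled code and needs no bookkeeping, so the loop is not required here.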
Data (df1):
df1 <- structure(list(day = c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L), cyl = c(4L,
4L, 4L, 4L, 3L, 3L, 3L, 3L), mpg = c(34.4, 21.3, 23.3, 25, 23,
27, 18.3, 17.3)), class = "data.frame", row.names = c(NA, -8L
))
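Since the question asks for base R, the same grouped running minimum can probably be obtained without any packages by using ave(), which applies a function within groups and returns a vector aligned with the original rows. This is a sketch under that assumption, not part of the answer above; the data frame is rebuilt here only so the block runs on its own.

```r
# Same sample data as df1 above, rebuilt so this block is self-contained
df1 <- data.frame(
  day = c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L),
  cyl = c(4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L),
  mpg = c(34.4, 21.3, 23.3, 25, 23, 27, 18.3, 17.3)
)

# ave() splits mpg by cyl, applies cummin within each group,
# and puts the results back in the original row positions.
# FUN must be passed by name because of the `...` grouping arguments.
df1$min_mpg <- ave(df1$mpg, df1$cyl, FUN = cummin)

df1
```

Note that cummin depends on row order. If the rows are not already sorted by day within each cyl, sort first, for example with df1 <- df1[order(df1$cyl, df1$day), ].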