R: collapse multiple rows into one row per group, using a specific function for each column
Consider the following dataset:
id <- c(1,1,1,2,2,2)
col_a <- c(123,56,87,987,1003,10)
col_b <- c(17,234,20,88,765,69)
col_c <- c(45,90,543,NA,1,543)
df <- data.frame(id,col_a,col_b,col_c)
library(data.table)
setDT(df)
The solution should take roughly this form, with a different function applied to each column:
df[, lapply(.SD, ???), by=id]
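We can use Map to apply each function to its corresponding column, grouped by 'id' (here min for col_a, median for col_b, and max for col_c):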
# Pair each function name with a column of .SD; get() looks the function up by name
df[, Map(function(x, y) get(x)(y, na.rm = TRUE),
         setNames(c('min', 'median', 'max'), names(.SD)), .SD), by = id]
# id col_a col_b col_c
#1: 1 56 20 543
#2: 2 10 88 543
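As a variation on the Map approach above, the mapping can be written as a named list of function objects, so that each function is tied to its column by name rather than by position. This is only a sketch: the names `dt`, `funs` and `res` are hypothetical and not part of the original question, and it assumes every function in the list accepts `na.rm`.

```r
library(data.table)

# Same data as in the question, built directly as a data.table
dt <- data.table(
  id    = c(1, 1, 1, 2, 2, 2),
  col_a = c(123, 56, 87, 987, 1003, 10),
  col_b = c(17, 234, 20, 88, 765, 69),
  col_c = c(45, 90, 543, NA, 1, 543)
)

# Hypothetical lookup: column name -> summary function
funs <- list(col_a = min, col_b = median, col_c = max)

# .SDcols restricts .SD to the listed columns, in the same order as funs,
# so Map pairs each function with the column of the same name
res <- dt[, Map(function(f, col) f(col, na.rm = TRUE), funs, .SD),
          by = id, .SDcols = names(funs)]

res
# one row per id: (56, 20, 543) for id 1 and (10, 88, 543) for id 2
```

Passing function objects avoids the get() lookup by string, and naming the list makes it harder to mismatch a function and a column if the column order changes.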
With the tidyverse, you can do the following:
library(tidyverse)
df %>%
  group_by(id) %>%
  mutate(col_a = min(col_a),
         col_b = median(col_b),
         col_c = max(col_c, na.rm = TRUE)) %>%
  distinct()
which gives:
# A tibble: 2 x 4
# Groups: id [2]
id col_a col_b col_c
<dbl> <dbl> <dbl> <dbl>
1 1 56 20 543
2 2 10 88 543
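The mutate() plus distinct() combination works because every row in a group ends up with identical values. A more direct way to get one row per group is summarise(), sketched below. This assumes dplyr 1.0.0 or later for the `.groups` argument; the name `res` is hypothetical.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2),
  col_a = c(123, 56, 87, 987, 1003, 10),
  col_b = c(17, 234, 20, 88, 765, 69),
  col_c = c(45, 90, 543, NA, 1, 543)
)

res <- df %>%
  group_by(id) %>%
  summarise(col_a = min(col_a),
            col_b = median(col_b),
            col_c = max(col_c, na.rm = TRUE),
            .groups = "drop")  # return an ungrouped tibble

res
```

Unlike the mutate() version, the result is no longer grouped by id, so later steps operate on the whole table.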