R data.table: group by a created column
Tags: r, data.table

I'm new to the awesome data.table package and have run into a problem that I hope has a simple solution. I want to filter a data.table, add some columns to it, and group by some of its columns, including a column that I create in the j clause.

If I were using dplyr, it would look like this:
library(dplyr)

mtcars %>%
  filter(vs == 1) %>%
  mutate(trans = ifelse(am == 1, "Manual", "Auto")) %>%
  group_by(gear, carb, trans) %>%
  summarise(num_cars = n(),
            avg_qsec = mean(qsec))
# A tibble: 6 x 5
# Groups:   gear, carb [?]
   gear  carb trans  num_cars avg_qsec
  <dbl> <dbl> <chr>     <int>    <dbl>
1     3     1 Auto          3     19.9
2     4     1 Manual        4     19.2
3     4     2 Auto          2     21.4
4     4     2 Manual        2     18.6
5     4     4 Auto          2     18.6
6     5     2 Manual        1     16.9
So a column that I create in the j clause cannot be used in by? If I don't try to transform the am column, it works fine:
# dtmt is assumed here to be as.data.table(mtcars)
dtmt[vs == 1,
     .(num_cars = .N,
       avg_qsec = mean(qsec)),
     by = list(gear, carb, am)]
   gear carb am num_cars avg_qsec
1:    4    1  1        4    19.22
2:    3    1  0        3    19.89
3:    4    2  0        2    21.45
4:    4    4  0        2    18.60
5:    4    2  1        2    18.56
6:    5    2  1        1    16.90
Thanks!

We create the column 'trans' after filtering the rows where 'vs' is 1, and then use it as a grouping variable for the summary:
dtmt[vs == 1                                           # subset the rows
  ][, trans := c("Auto", "Manual")[(am == 1) + 1]     # create trans
  ][, .(num_cars = .N, avg_qsec = mean(qsec)), by = .(gear, carb, trans)]
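A small check of the chained form, as a sketch only: it assumes dtmt was built with as.data.table(mtcars), which the question does not show. Because dtmt[vs == 1] returns a new data.table, the := in the second link should add trans to that subset only and leave dtmt itself unchanged.

```r
library(data.table)

# Assumption: the question's dtmt is mtcars converted to a data.table.
dtmt <- as.data.table(mtcars)

res <- dtmt[vs == 1                                    # subset the rows (new table)
  ][, trans := c("Auto", "Manual")[(am == 1) + 1]     # add trans to the subset
  ][, .(num_cars = .N, avg_qsec = mean(qsec)), by = .(gear, carb, trans)]

print(res)

# The original table has no trans column, since := acted on the subset.
print("trans" %in% names(dtmt))
```

The index trick c("Auto", "Manual")[(am == 1) + 1] picks element 1 when am is 0 and element 2 when am is 1, so it gives the same labels as the ifelse() call in the dplyr version.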
You can do everything in a single []:
as.data.table(mtcars)[
  vs == 1,
  .(num_cars = .N, avg_qsec = mean(qsec)),
  by = .(gear, carb, trans = ifelse(am == 1, "Manual", "Auto"))]
#    gear carb  trans num_cars avg_qsec
# 1:    4    1 Manual        4    19.22
# 2:    3    1   Auto        3    19.89
# 3:    4    2   Auto        2    21.45
# 4:    4    4   Auto        2    18.60
# 5:    4    2 Manual        2    18.56
# 6:    5    2 Manual        1    16.90
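If the row order should match the dplyr result in the question, one option is keyby in place of by, which also sorts the result by the grouping columns. This is a sketch and not part of the original answer:

```r
library(data.table)

# keyby accepts the same named expressions as by, and additionally
# sorts (and keys) the result by the grouping columns.
res <- as.data.table(mtcars)[
  vs == 1,
  .(num_cars = .N, avg_qsec = mean(qsec)),
  keyby = .(gear, carb, trans = ifelse(am == 1, "Manual", "Auto"))]

print(res)
```

With plain by, groups appear in order of first occurrence in the data, which is why the output above starts with gear 4 rather than gear 3.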
You need dtmt[vs == 1][, trans := c("Auto", "Manual")[(am == 1) + 1]][, .(num_cars = .N, avg_qsec = mean(qsec)), .(gear, carb, trans)]