R data.table: group by a created column

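I'm new to the data.table package and have run into a problem that I hope has a simple solution. I want to filter a data.table, add some columns to it, and group by some of its columns, including a column that I create in the j clause.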

If I were using dplyr, it would look like this:

library(dplyr)

mtcars %>% 
    filter(vs == 1) %>% 
    mutate(trans = ifelse(am == 1, "Manual", "Auto")) %>% 
    group_by(gear, carb, trans) %>% 
    summarise(num_cars = n(),
              avg_qsec = mean(qsec))

# A tibble: 6 x 5
# Groups:   gear, carb [?]
   gear  carb trans  num_cars avg_qsec
  <dbl> <dbl> <chr>     <int>    <dbl>
1     3     1 Auto          3     19.9
2     4     1 Manual        4     19.2
3     4     2 Auto          2     21.4
4     4     2 Manual        2     18.6
5     4     4 Auto          2     18.6
6     5     2 Manual        1     16.9

So a column I create in the j clause can't be used in by? If I don't try to transform the am column, it works fine:

dtmt[vs == 1, 
     .(num_cars = .N, 
       avg_qsec = mean(qsec)),
     by = list(gear, carb, am)]

   gear carb am num_cars avg_qsec
1:    4    1  1        4    19.22
2:    3    1  0        3    19.89
3:    4    2  0        2    21.45
4:    4    4  0        2    18.60
5:    4    2  1        2    18.56
6:    5    2  1        1    16.90

Thanks.

We create the column "trans" after filtering for the rows where "vs" is 1, then use it as the grouping variable for the summary:

dtmt[vs==1 # subset the rows
    ][, trans := c("Auto", "Manual")[(am==1)+1] # create trans
     ][, .(num_cars = .N, avg_qsec = mean(qsec)), by = .(gear, carb, trans)]
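
As a minimal sketch of what this chain does to the original table: the first [] returns a new, filtered data.table, so := adds trans to that copy and dtmt itself is left unchanged. The question never shows how dtmt was built, so the as.data.table(mtcars) line is an assumption, and res is just an illustrative name.

```r
library(data.table)

# Assumption: dtmt is mtcars converted to a data.table
dtmt <- as.data.table(mtcars)

res <- dtmt[vs == 1                                   # subset rows (returns a new table)
    ][, trans := c("Auto", "Manual")[(am == 1) + 1]   # add trans to the subset by reference
    ][, .(num_cars = .N, avg_qsec = mean(qsec)), by = .(gear, carb, trans)]

"trans" %in% names(dtmt)  # the original table did not gain the column
"trans" %in% names(res)   # the summary is grouped by it
```

If trans should persist on dtmt, it would have to be assigned on dtmt directly rather than on the filtered result.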

It can all be done in a single []:

as.data.table(mtcars)[
    vs == 1,
    .(num_cars = .N, avg_qsec = mean(qsec)),
    by = .(gear, carb, trans = ifelse(am == 1, "Manual", "Auto"))]

#    gear carb  trans num_cars avg_qsec
# 1:    4    1 Manual        4    19.22
# 2:    3    1   Auto        3    19.89
# 3:    4    2   Auto        2    21.45
# 4:    4    4   Auto        2    18.60
# 5:    4    2 Manual        2    18.56
# 6:    5    2 Manual        1    16.90
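
If the row order of the dplyr result matters, one possible variation (a sketch, not part of the original answer) is to swap by for keyby, which sorts the result by the grouping columns and sets them as the key. It also uses fifelse, data.table's own ifelse replacement, which assumes a reasonably recent data.table version.

```r
library(data.table)

res <- as.data.table(mtcars)[
    vs == 1,
    .(num_cars = .N, avg_qsec = mean(qsec)),
    # keyby evaluates the expressions, groups by them, and sorts the result
    keyby = .(gear, carb, trans = fifelse(am == 1, "Manual", "Auto"))]

res
key(res)  # the grouping columns become the key
```

The grouping expression is evaluated before j here as well, which is why trans can be defined inside by/keyby but not picked up from j.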

You need
dtmt[vs == 1][, trans := c("Auto", "Manual")[(am == 1) + 1]][, .(num_cars = .N, avg_qsec = mean(qsec)), .(gear, carb, trans)]