Adding observation counts to an aggregation with data.table in R
I have a table like this:
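library(data.table)
set.seed(1234)
sportset <- data.table(sport = rep("football", 50),
                       position = rep(c("f", "m", "d", "gk", "w"), c(10, 7, 11, 13, 9)),
                       # rnorm() draws only 10 values here; they are recycled across the 50 rows
                       height = rnorm(10, 180, 13))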
I would also like to add an extra column containing the number of observations used for each aggregate, like this:
sport position height obs
1: football f 182.5153 10
2: football m 186.0845 7
3: football d 181.3569 13
4: football gk 181.5860 11
5: football w 182.4974 9
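Do I need to chain this onto the original expression, or can it be integrated into the aggregation itself? How would I do that?
Thanks.

We can use c to concatenate the list elements: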
sportset[,c(lapply(.SD,mean),list(obs = .N)),by=.(sport,position),.SDcols= "height"]
# sport position height obs
#1: football f 175.0190 10
#2: football m 176.6006 7
#3: football d 174.8258 11
#4: football gk 173.5069 13
#5: football w 176.2090 9
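The same pattern extends to several summarised columns. The following is only a minimal sketch: the table dt and its weight column are hypothetical and not part of the question's data.

```r
library(data.table)

set.seed(1234)
# Hypothetical data: 'weight' is an assumed extra column, not in the question
dt <- data.table(
  sport    = "football",
  position = rep(c("f", "m", "d", "gk", "w"), c(10, 7, 11, 13, 9)),
  height   = rnorm(50, 180, 13),
  weight   = rnorm(50, 78, 8)
)

# lapply(.SD, mean) returns a named list of per-group means;
# c() appends the group size .N as one more list element named 'obs'
res <- dt[, c(lapply(.SD, mean), list(obs = .N)),
          by = .(sport, position),
          .SDcols = c("height", "weight")]
print(res)
```

Because j only needs to evaluate to a list, any number of per-group summaries can be combined this way without chaining a second expression.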
Also, if there is only one column to summarise, as in the example, there is no need to specify .SDcols and loop with lapply:
sportset[, .(height = mean(height), obs = .N), .(sport, position)]
@Kauber You may want to look at the examples in ?data.table. A very similar usage is shown there: DT[, c(.N, lapply(.SD, sum)), by=x].
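To make that help-page pattern concrete, here is a small sketch on a toy table. The table DT and the column name n are assumptions for illustration; wrapping .N in a named list gives the count column an explicit name.

```r
library(data.table)

# Toy table, assumed for illustration only
DT <- data.table(x = c("a", "a", "b", "b", "b"),
                 v = 1:5,
                 y = c(10, 20, 30, 40, 50))

# Count rows and sum every non-grouping column per group of x;
# list(n = .N) names the count column explicitly
out <- DT[, c(list(n = .N), lapply(.SD, sum)), by = x]
print(out)
```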