R: Summarizing rows in a data frame

Tags: r, plyr

Is it possible to get the same result as ret below in a more readable way, using aggregate or ddply?

time <- c("2013-08-05 15:44:19", "2013-08-05 15:44:24", "2013-08-05 15:45:19", "2013-08-05 15:45:28")

df <- data.frame(time = as.POSIXct(time), col2 = c(1, 2, 2, 2), col3 = LETTERS[1:4])
# Split by col2, keep the row(s) with the latest time in each group, then recombine.
mm <- split(df, df[, "col2"])
ret <- lapply(mm, function(x) {
  mt <- max(x[, "time"])
  idx <- x[, "time"] == mt
  x[idx, ]
})
do.call("rbind", ret)
Using aggregate: [code missing in the source]

With ddply:

> ddply(df, .(col2), summarise, time=max(time))[, c(2,1)]
                 time col2
1 2013-08-05 15:44:19    1
2 2013-08-05 15:45:28    2

Just for fun, another base solution using lapply and split:

> do.call(rbind, lapply(with(df, split(df, col2)),
+                       function(x) x[which.max(x$time), ]))
                 time col2
1 2013-08-05 15:44:19    1
2 2013-08-05 15:45:28    2

Update

The last solution also works for your update:

> do.call(rbind, lapply(with(df, split(df, col2)),
+                       function(x) x[which.max(x$time), ]))
                 time col2 col3
1 2013-08-05 15:44:19    1    A
2 2013-08-05 15:45:28    2    D

plyr:

R> ddply(df, "col2", summarize, time=max(time))
  col2                time
1    1 2013-08-05 15:44:19
2    2 2013-08-05 15:45:28

Using data.table:

R> dt <- data.table(df, key="col2")
R> dt[,list(time=max(time)),by=col2]
   col2                time
1:    1 2013-08-05 15:44:19
2:    2 2013-08-05 15:45:28

Maybe you could explain in plain English what you are trying to achieve?
Thx. If I want more of the columns in df, how do I select them as return values? And how can I return the result in chronological order?
Can you update your question with the information from your comments and make it reproducible?
But now I am not sure whether max is computed on the time column in the aggregate call: ~col2,FUN=max,data=df.
+1 for the data.table solution. In my benchmarks on large data, plyr appears to be the slowest and data.table the fastest.
I knew the data.table solution would make you happy :-)
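
The two follow-up questions in the comments (keeping additional columns such as col3, and returning the result in chronological order) can be sketched in base R. This is a minimal sketch and not one of the original answers. It rebuilds the df from the question with the time zone fixed to UTC (an assumption, made only for reproducibility), and the names secs, group_max, latest_rows and latest_sorted are hypothetical. It uses ave to compute each row's group maximum, so whole rows can be filtered without a join.

```r
# Data from the question; tz = "UTC" is an assumption added for reproducibility.
time <- c("2013-08-05 15:44:19", "2013-08-05 15:44:24",
          "2013-08-05 15:45:19", "2013-08-05 15:45:28")
df <- data.frame(time = as.POSIXct(time, tz = "UTC"),
                 col2 = c(1, 2, 2, 2),
                 col3 = LETTERS[1:4])

# Work on the numeric representation (seconds since the epoch) of the time column.
secs <- as.numeric(df$time)

# For every row, the latest time within its col2 group.
group_max <- ave(secs, df$col2, FUN = max)

# Keep every column of the rows that hold their group's latest time.
# Ties are kept, as in the question's original split/lapply code.
latest_rows <- df[secs == group_max, ]

# Return the result in chronological order.
latest_sorted <- latest_rows[order(latest_rows$time), ]
print(latest_sorted)
```

To return only some of the columns, index the result by name, for example latest_sorted[, c("time", "col3")].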