Error when mixing ddply and interval
I have been trying to compute time intervals for individuals, but I ran into a strange error. Specifically, in this code:
library(lubridate)
library(tidyverse)
library(plyr)
df <- tibble(dates = mdy(c("2/20/20", "2/25/20", "3/1/20", "3/11/20", "3/20/20")), recips = c("x", "x", "a", "a", "a"), treatment = c("T", "P", "T", "P", "P"), eventtype = c("a", "real", "y", "z", "real"))
# Runs without error: one two-week window per row
df %>% mutate(window = interval(start = dates, end = dates + weeks(2)))
# Throws an error: the same expression evaluated per group by ddply
ddply(df, .(recips), mutate, window = interval(start = dates, end = dates + weeks(2)))
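The last line throws an error that the second-to-last line does not. Any hints?

The problem is the class of the output of interval, which is not compatible with ddply. One option is to convert it to character with as.character: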
plyr::ddply(df, c("recips"), plyr::mutate,
window = as.character(interval(start = dates, end = dates + weeks(2))))
-output
# dates recips treatment eventtype window
#1 2020-03-01 a T y 2020-03-01 UTC--2020-03-15 UTC
#2 2020-03-11 a P z 2020-03-11 UTC--2020-03-25 UTC
#3 2020-03-20 a P real 2020-03-20 UTC--2020-04-03 UTC
#4 2020-02-20 x T a 2020-02-20 UTC--2020-03-05 UTC
#5 2020-02-25 x P real 2020-02-25 UTC--2020-03-10 UTC
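
Converting to character makes ddply work, but the result can no longer be used with interval functions. As a rough sketch of an alternative, the end point can be kept as an ordinary Date column inside ddply (Date columns pass through, as the dates column does in the output above) and the interval rebuilt afterwards. The column name window_end and the use of a plain data.frame are assumptions made for illustration, not part of the original answer.

```r
library(lubridate)

df <- data.frame(
  dates  = mdy(c("2/20/20", "2/25/20", "3/1/20", "3/11/20", "3/20/20")),
  recips = c("x", "x", "a", "a", "a")
)

# interval() returns an S4 object of class "Interval", not a plain vector
w <- interval(start = df$dates, end = df$dates + weeks(2))
class(w)
isS4(w)

# Inside ddply, store only the end point as a Date column
# (window_end is a hypothetical name chosen for this sketch)
out <- plyr::ddply(df, "recips", plyr::mutate, window_end = dates + weeks(2))

# Rebuild the interval outside ddply, once the groups are recombined
out$window <- interval(start = out$dates, end = out$window_end)
out
```

Calling plyr::ddply and plyr::mutate by namespace avoids attaching plyr at all, which also sidesteps any masking of dplyr functions.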
Based on the data shown, we are creating an interval for each element of 'dates', so a group_by operation is not needed:
library(dplyr)
df %>%
mutate(window = interval(start=dates,end=dates+weeks(2)))
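
One thing worth noting: in the question, plyr is attached after tidyverse, so a bare mutate may resolve to plyr::mutate rather than the dplyr version. The sketch below, which assumes a reduced copy of the example data in a plain data.frame, calls dplyr::mutate explicitly and checks that each window really spans two weeks.

```r
library(lubridate)

df <- data.frame(
  dates  = mdy(c("2/20/20", "2/25/20", "3/1/20", "3/11/20", "3/20/20")),
  recips = c("x", "x", "a", "a", "a")
)

# interval() is vectorised: one interval per element of dates, no grouping needed
res <- dplyr::mutate(df, window = interval(start = dates, end = dates + weeks(2)))

# int_length() returns the length of each interval in seconds
window_days <- int_length(res$window) / (24 * 60 * 60)
window_days
```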
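Why do you need ddply when mutate from dplyr works? That is, df %>% group_by(recips) %>% mutate(window = interval(start = dates, end = dates + weeks(2))) works for me.

@akrun My impression was that ddply is faster than group_by() %>% mutate(). I am going to scale this up to 1 million rows, and group_by() is slow at that size.

More development is happening in the tidyverse, so I would expect the tidyverse methods to be better optimized. I could be wrong. I am also a bit curious here: based on the data shown, you want the interval for each element of 'dates', so why do you need to group at all?

plyr was retired long ago. All data manipulation tasks should be done with dplyr.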