R: Given different start and end dates, find the daily average of each variable
I have data with different start and end dates:
mydata <- data.frame(id=c(1,2,3), start=c("2010/01/01","2010/01/01","2010/01/02"), end=c("2010/01/01","2010/01/05","2010/01/03"), a=c(140,750,56),b=c(48,25,36))
mydata
id start end a b
1 1 2010-01-01 2010-01-01 140 48
2 2 2010-01-01 2010-01-05 750 25
3 3 2010-01-02 2010-01-03 56 36
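Here is an option with tidyverse. We convert the 'start' and 'end' columns to Date class with ymd (from lubridate), use map2 to create a sequence of dates from 'start' to 'end' for each pair of corresponding elements, mutate 'a' and 'b' by dividing them by the lengths of the list column 'date', unnest 'date', group by 'date', and get the sum of 'a' and 'b'.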
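library(dplyr)
library(tidyr)
library(lubridate)
library(purrr)
mydata %>%
  # parse the character columns into Date class
  mutate(across(c(start, end), ymd)) %>%
  # build one daily sequence per row as a list column
  transmute(id, date = map2(start, end, seq, by = 'day'), a, b) %>%
  # spread each total evenly over the days it covers
  mutate(across(c(a, b), ~ ./lengths(date))) %>%
  # one row per id per day
  unnest(date) %>%
  group_by(date) %>%
  # add up the per-day shares across ids
  summarise(across(c(a, b), ~ sum(.x, na.rm = TRUE)))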
# A tibble: 5 x 3
# date a b
# <date> <dbl> <dbl>
#1 2010-01-01 290 53
#2 2010-01-02 178 23
#3 2010-01-03 178 23
#4 2010-01-04 150 5
#5 2010-01-05 150 5
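For readers who cannot use across(), the same spread-and-sum idea can be sketched in base R. This is only an illustrative alternative, not part of the answer above; the names days, n_days, expanded, and daily are made up for the example.

```r
mydata <- data.frame(id = c(1, 2, 3),
                     start = c("2010/01/01", "2010/01/01", "2010/01/02"),
                     end = c("2010/01/01", "2010/01/05", "2010/01/03"),
                     a = c(140, 750, 56), b = c(48, 25, 36))

# Parse the character columns into Date class
start <- as.Date(mydata$start, format = "%Y/%m/%d")
end <- as.Date(mydata$end, format = "%Y/%m/%d")

# One daily sequence per row, and the number of days each row covers
days <- lapply(seq_along(start), function(i) seq(start[i], end[i], by = "day"))
n_days <- lengths(days)

# One row per id per day, with each total spread evenly over its days
expanded <- data.frame(
  date = do.call(c, days),  # c() keeps the Date class, unlist() would drop it
  a = rep(mydata$a / n_days, times = n_days),
  b = rep(mydata$b / n_days, times = n_days)
)

# Add up the per-day shares across ids
daily <- aggregate(cbind(a, b) ~ date, data = expanded, FUN = sum)
daily
```

With the sample data this should give the same five daily totals as the tidyverse pipeline.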
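I got this error: Error in across(c(start, end), ymd) : could not find function "across". Any reason?
@mazbata Can you show the error? I am using the latest version of dplyr. across was introduced in dplyr 1.0.0, so if your installed version is older than that, please update your dplyr package.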