dplyr / tidyr: widening and summing specific columns
I am struggling to use dplyr and tidyr to reshape a df of this form:
myDf <- data.frame(id = c(1, 1, 1, 1, 2, 2),
                   event = c('a', 'b', 'a', 'b', 'a', 'b'),
                   a_property = c(1, NA, 2, NA, 3, NA),
                   b_property = c(NA, 2, NA, 3, NA, 4))
> myDf
id event a_property b_property
1 a 1 NA
1 b NA 2
1 a 2 NA
1 b NA 3
2 a 3 NA
2 b NA 4
There is a typo in your example. The answer should be:
# A tibble: 2 × 5
id count_event_a count_event_b sum_property_a sum_property_b
<dbl> <int> <int> <dbl> <dbl>
1 1 2 2 3 5
2 2 1 1 3 4
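For reference, one way to arrive at a table of this shape is a plain grouped `summarise()` with no reshaping at all. This is only a sketch and not necessarily the code this answer used; the name `result` is illustrative, and the column names are copied from the output above.

```r
library(dplyr)

myDf <- data.frame(id = c(1, 1, 1, 1, 2, 2),
                   event = c('a', 'b', 'a', 'b', 'a', 'b'),
                   a_property = c(1, NA, 2, NA, 3, NA),
                   b_property = c(NA, 2, NA, 3, NA, 4))

result <- myDf %>%
  group_by(id) %>%
  summarise(
    # summing a logical vector counts the TRUE entries (integer result)
    count_event_a = sum(event == 'a'),
    count_event_b = sum(event == 'b'),
    # na.rm = TRUE skips the rows that belong to the other event
    sum_property_a = sum(a_property, na.rm = TRUE),
    sum_property_b = sum(b_property, na.rm = TRUE)
  )

result
```

The drawback is that every event/property pair has to be written out by hand, so this only scales to a handful of columns.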
A bit more general:
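library(dplyr)
library(tidyr)

myDf %>%
  # long form: one row per id / event / property column
  gather(key, value, -id, -event) %>%
  filter(!is.na(value)) %>%
  group_by(id, event) %>%
  summarise(count = n(),
            sum = sum(value)) %>%
  # stack the two statistics, then build names such as count_a, sum_b
  gather(key, value, -id, -event) %>%
  unite(measure, key, event) %>%
  spread(measure, value)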
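`gather()` and `spread()` are superseded in current tidyr. A rough equivalent of the same pipeline with `pivot_longer()` and `pivot_wider()` might look like the sketch below, assuming tidyr >= 1.0.0 and dplyr >= 1.0.0 (for the `.groups` argument); `result` is an illustrative name.

```r
library(dplyr)
library(tidyr)

myDf <- data.frame(id = c(1, 1, 1, 1, 2, 2),
                   event = c('a', 'b', 'a', 'b', 'a', 'b'),
                   a_property = c(1, NA, 2, NA, 3, NA),
                   b_property = c(NA, 2, NA, 3, NA, 4))

result <- myDf %>%
  # long form; values_drop_na = TRUE replaces the filter(!is.na(value)) step
  pivot_longer(c(a_property, b_property),
               names_to = "key", values_to = "value",
               values_drop_na = TRUE) %>%
  group_by(id, event) %>%
  summarise(count = n(), sum = sum(value), .groups = "drop") %>%
  # widen both statistics at once; columns are named <statistic>_<event>
  pivot_wider(names_from = event, values_from = c(count, sum))

result
```

Because `pivot_wider()` accepts several `values_from` columns, the second gather/unite round trip is not needed, and `count` stays an integer instead of being coerced to double. The resulting columns are `count_a`, `count_b`, `sum_a` and `sum_b`.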
Do it in two steps. Reshape as in the following question: then use summarise() to get the counts/sums.