R: Calculating deltas between multiple variables grouped by user ID


How can I calculate the deltas between multiple variables, grouped by user ID, in a "long" data frame?

Data format:

d1 <- data.frame(
    id = rep(c(1, 2, 3, 4, 5), each = 2),
    purchased = c(rep(c(T, F), 3), F, T, T, F), 
    product = rep(c("A", "B"), 5), 
    grade = c(1, 2, 1, 2, 2, 3, 7, 5, 1, 2),
    rate = c(10, 12, 10, 12, 12, 14, 22, 18, 10, 12),
    fee = rep(c(1, 2), 5))

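We can do this with gather/spread. Use gather to reshape the data from "wide" to "long", group by "id" and "Var", pick the "product" flagged by the logical column "purchased", take the difference in "Val" between products "B" and "A", and then spread the result from "long" back to "wide" format.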

library(dplyr)
library(tidyr)
gather(d1, Var, Val, grade:fee) %>% 
           group_by(id, Var) %>% 
           summarise(purchased = product[purchased], 
                     Val = Val[product == 'B'] - Val[product == 'A'])%>% 
           spread(Var, Val)
#     id purchased   fee grade  rate
#   <dbl>    <fctr> <dbl> <dbl> <dbl>
#1     1         A     1     1     2
#2     2         A     1     1     2
#3     3         A     1     1     2
#4     4         B     1    -2    -4
#5     5         A     1     1     2
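
In current tidyr releases, gather and spread are superseded by pivot_longer and pivot_wider. The following is a minimal sketch of the same steps with those functions, assuming tidyr 1.0.0 or later and dplyr 1.0.0 or later (for the .groups argument). The name res is only illustrative and does not appear in the original answer.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question
d1 <- data.frame(
    id = rep(c(1, 2, 3, 4, 5), each = 2),
    purchased = c(rep(c(TRUE, FALSE), 3), FALSE, TRUE, TRUE, FALSE),
    product = rep(c("A", "B"), 5),
    grade = c(1, 2, 1, 2, 2, 3, 7, 5, 1, 2),
    rate = c(10, 12, 10, 12, 12, 14, 22, 18, 10, 12),
    fee = rep(c(1, 2), 5))

res <- d1 %>%
    # wide -> long: one row per id, product and variable
    pivot_longer(cols = grade:fee, names_to = "Var", values_to = "Val") %>%
    group_by(id, Var) %>%
    # keep the purchased product and the B minus A difference per group
    summarise(purchased = product[purchased],
              Val = Val[product == "B"] - Val[product == "A"],
              .groups = "drop") %>%
    # long -> wide: one column per variable again
    pivot_wider(names_from = Var, values_from = Val)

res
```

Like the gather/spread version, this assumes every id has exactly one "A" row, one "B" row, and exactly one row with purchased set to TRUE. If that does not hold, summarise will return more than one row per group or raise an error, depending on the dplyr version.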

Desired output (d3):
#  id purchased dGrade dRate dFee
#1  1         A      1     2    1
#2  2         A      1     2    1
#3  3         A      1     2    1
#4  4         B     -2    -4    1
#5  5         A      1     2    1