How can I aggregate data in R using the sum, mean, and count of each column?
I have a dataset named "order_product" that looks like this:
order_id product order_sequence reorder
1 egg 1 1
1 meat 2 0
1 fruit 3 1
1 meat 4 1
2 egg 1 1
2 egg 2 1
2 fruit 3 0
3 egg 1 0
3 fruit 2 1
3 fruit 3 1
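I want to aggregate the data into a new data frame called "product", grouped by product. The variables in the new aggregated dataset show each product's total frequency, reorder rate, and mean sequence. Each variable is calculated as follows: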
frequency: product count
reorder_rate: sum of reorder/frequency
mean_sequence: sum of order_sequence/frequency
So the result should look like this:
product frequency reorder_rate mean_sequence
egg 4 3/4 5/4
meat 2 1/2 3
fruit 4 3/4 11/4
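Can anyone help me with this? I tried the melt() function from the data.table package, but I don't know how to write it.

This kind of calculation is straightforward with dplyr (df below stands for the order_product data):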
library(dplyr)
df %>%
  group_by(product) %>%
  summarise(frequency = n(),
            reorder_rate = sum(reorder)/frequency,
            mean_sequence = sum(order_sequence)/frequency)
# A tibble: 3 x 4
# product frequency reorder_rate mean_sequence
# <fct> <int> <dbl> <dbl>
#1 egg 4 0.75 1.25
#2 fruit 4 0.75 2.75
#3 meat 2 0.5 3
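
If you would rather not load a package, the same three statistics can be sketched in base R with tapply(). This is only a minimal sketch: the data frame below is retyped by hand from the table in the question (that construction is my assumption about how the data is stored), and it relies on mean(reorder) being the same as sum(reorder)/frequency.

```r
# Rebuild the example data from the question (retyped by hand, an assumption
# about how the real order_product data frame is stored).
order_product <- data.frame(
  order_id       = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3),
  product        = c("egg", "meat", "fruit", "meat", "egg",
                     "egg", "fruit", "egg", "fruit", "fruit"),
  order_sequence = c(1, 2, 3, 4, 1, 2, 3, 1, 2, 3),
  reorder        = c(1, 0, 1, 1, 1, 1, 0, 0, 1, 1),
  stringsAsFactors = FALSE
)

# One tapply() call per statistic; each returns one value per product,
# with groups in alphabetical order.
frequency     <- tapply(order_product$reorder, order_product$product, length)
reorder_rate  <- tapply(order_product$reorder, order_product$product, mean)
mean_sequence <- tapply(order_product$order_sequence, order_product$product, mean)

# Combine the per-product values into the aggregated data frame.
product <- data.frame(
  product       = names(frequency),
  frequency     = as.vector(frequency),
  reorder_rate  = as.vector(reorder_rate),
  mean_sequence = as.vector(mean_sequence),
  stringsAsFactors = FALSE
)

print(product)
```

Since you mentioned data.table: melt() is for reshaping, not for grouped summaries. The grouped form there would be something like dt[, .(frequency = .N, reorder_rate = mean(reorder), mean_sequence = mean(order_sequence)), by = product], assuming dt is a data.table holding the same columns.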