R: How to combine multiple groups of rows in a data frame
I have a data frame like this:
product_id view_count purchase_count
1 11 1
2 20 3
3 5 2
...
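I would like to convert it into a table that groups the rows into ranges of view count and sums the purchase count within each range, for example: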
view_count_range total_purchase_count
0-10 45
10-20 65
The size of these view count ranges is fixed. I would appreciate any suggestions on how to group by ranges like this.
cut is a handy tool for this. Here is one approach:
#First make some data to work with
#I suggest you do this in the future as it makes it
#easier to provide you with assistance.
set.seed(10)
dat <- data.frame(product_id=1:15, view_count=sample(1:20, 15, replace=TRUE),
    purchase_count=sample(1:8, 15, replace=TRUE))
dat #look at the data
#now we can use cut and aggregate by this new variable we just created
dat$view_count_range <- with(dat, cut(view_count, c(0, 10, 20)))
aggregate(purchase_count~view_count_range, dat, sum)
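Since the ranges have a fixed width, the breaks can be generated with seq rather than typed by hand. A minimal sketch, assuming a bin width of 10 and the "0-10" style labels from the question; width, upper, breaks and labs are illustrative names, and the small data frame is just the three rows shown in the question:

```r
# Three example rows taken from the question
dat <- data.frame(product_id     = 1:3,
                  view_count     = c(11, 20, 5),
                  purchase_count = c(1, 3, 2))

width  <- 10                                   # assumed fixed bin width
# Round the upper limit up to a multiple of width so every row falls in a bin
upper  <- ceiling(max(dat$view_count) / width) * width
breaks <- seq(0, upper, by = width)            # 0, 10, 20 for this data
# Build labels in the "0-10" style used in the question
labs   <- paste(head(breaks, -1), tail(breaks, -1), sep = "-")

dat$view_count_range <- cut(dat$view_count, breaks = breaks, labels = labs)
res <- aggregate(purchase_count ~ view_count_range, data = dat, FUN = sum)
res
```

The intervals are right-closed by default, so a view_count of exactly 10 lands in 0-10, and a view_count of 0 would become NA unless include.lowest = TRUE is passed to cut.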
Expanding on Tyler's answer and starting from his example dat, you may find queries like this easier and faster to write in data.table:
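A minimal sketch of such a query, assuming DT is simply dat converted with as.data.table. The conversion step is an assumption, and the by form is inferred from the keyby variants that appear later in this answer:

```r
library(data.table)

# Rebuild Tyler's example data so the snippet is self-contained
set.seed(10)
dat <- data.frame(product_id     = 1:15,
                  view_count     = sample(1:20, 15, replace = TRUE),
                  purchase_count = sample(1:8, 15, replace = TRUE))

DT <- as.data.table(dat)   # assumed conversion step

# Sum purchase_count within each view_count range
res <- DT[, sum(purchase_count), by = cut(view_count, c(0, 10, 20))]
res
```

The exact sums depend on the R version, because the behaviour of sample under a given seed changed in R 3.6.0.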
That's it. Just one line. Easy to write, easy to read.
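Note that it puts the (10,20] group first. That is because, by default, it keeps the groups in the order in which each first appears in the data (the first view_count in this data set is 11). To sort the groups, change by to keyby: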
> DT[, sum(purchase_count), keyby=cut(view_count,c(0,10,20))]
cut V1
[1,] (0,10] 39
[2,] (10,20] 31
To name the result columns:
> DT[,list( purchase_count = sum(purchase_count) ),
keyby=list( view_count_range = cut(view_count,c(0,10,20) ))]
view_count_range purchase_count
[1,] (0,10] 39
[2,] (10,20] 31