R: How to combine multiple groups of rows in a data frame

I have a data frame like this:

product_id view_count purchase_count
1           11         1   
2           20         3
3           5          2
...
I want to convert it into a table that groups the rows by ranges of view_count and sums purchase_count within each range, for example:

view_count_range total_purchase_count
0-10                 45
10-20                65

The view_count ranges all have the same fixed width. I would appreciate any advice on how to group by ranges like these.

cut is a handy tool for this. Here is one approach:

#First make some data to work with 
#I suggest you do this in the future as it makes it 
#easier to provide you with assistance.
set.seed(10)
dat <- data.frame(product_id=1:15, view_count=sample(1:20, 15, replace=TRUE), 
    purchase_count=sample(1:8, 15, replace=TRUE))
dat   #look at the data

#now we can use cut and aggregate by this new variable we just created
dat$view_count_range <- with(dat, cut(view_count, c(0, 10, 20)))
aggregate(purchase_count~view_count_range, dat, sum)
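To make the interval boundaries easier to see, here is a minimal sketch on a small hand-made vector. The names views, breaks and bins are made up for illustration, and the bin width of 10 is taken from the question's example. It shows that cut closes intervals on the right by default, and that the labels argument can produce the "0-10" style asked for in the question.

```r
# Small hand-made vector so the interval boundaries are easy to follow
views <- c(5, 10, 11, 20)

# Fixed-width breaks: 0, 10, 20 (the width of 10 comes from the question)
breaks <- seq(0, 20, by = 10)

# Default labels use interval notation; intervals are closed on the right,
# so 10 falls in (0,10] and 20 falls in (10,20]
default_bins <- cut(views, breaks)
as.character(default_bins)
# -> "(0,10]" "(0,10]" "(10,20]" "(10,20]"

# Custom labels matching the layout requested in the question
bins <- cut(views, breaks, labels = c("0-10", "10-20"))
as.character(bins)
# -> "0-10" "0-10" "10-20" "10-20"
```

A view_count of exactly 0 would become NA with these breaks unless include.lowest=TRUE is passed to cut.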

Extending Tyler's answer and starting from his example dat, you may find queries like this easier and faster to write in data.table:
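A minimal sketch of such a one-line query, modelled on the keyby variant shown later in this answer. The as.data.table() conversion step and the name res are assumptions, since the text does not show how DT was created. No output is listed because sample() changed its default algorithm in R 3.6.0, so a current R session may draw different values from the ones the answer's printed sums are based on.

```r
library(data.table)

# Recreate Tyler's example data (same construction as in his answer)
set.seed(10)
dat <- data.frame(product_id = 1:15,
                  view_count = sample(1:20, 15, replace = TRUE),
                  purchase_count = sample(1:8, 15, replace = TRUE))

# Assumed conversion step: turn the data.frame into a data.table
DT <- as.data.table(dat)

# Group by the view_count interval and sum purchase_count, in one line
res <- DT[, sum(purchase_count), by = cut(view_count, c(0, 10, 20))]
print(res)
```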

That's it. Just one line. Easy to write, easy to read.

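Note that it puts the (10,20] group first. That is because, by default, each group keeps the position at which it first appears in the data (the first view_count in this dataset is 11). To sort the groups, change by to keyby: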

> DT[, sum(purchase_count), keyby=cut(view_count,c(0,10,20))]
         cut V1
[1,]  (0,10] 39
[2,] (10,20] 31
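The ordering difference can also be checked without random data. The sketch below uses the three rows shown in the question; the names DT3, by_res and keyby_res are made up for illustration.

```r
library(data.table)

# The three example rows from the question
DT3 <- data.table(product_id = 1:3,
                  view_count = c(11, 20, 5),
                  purchase_count = c(1, 3, 2))

# by: groups appear in order of first appearance, so (10,20] comes first
by_res <- DT3[, sum(purchase_count), by = cut(view_count, c(0, 10, 20))]
print(by_res)

# keyby: same sums, but the groups are sorted, so (0,10] comes first
keyby_res <- DT3[, sum(purchase_count), keyby = cut(view_count, c(0, 10, 20))]
print(keyby_res)
```

Here the (10,20] group sums to 4 (rows 1 and 2) and the (0,10] group sums to 2 (row 3); only the row order differs between the two results.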
To name the result columns:

> DT[,list( purchase_count = sum(purchase_count) ),
     keyby=list( view_count_range = cut(view_count,c(0,10,20) ))]
     view_count_range purchase_count
[1,]           (0,10]             39
[2,]          (10,20]             31