Grouped count aggregation in R data.table


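I have a table containing dates, buy values, and sell values. I want to count how many buys and sells there were on each day, and also the overall number of buys and sells. I find this a bit tricky to do in data.table.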

   date   buy sell      
2011-01-01  1   0
2011-01-02  0   0
2011-01-03  0   2
2011-01-04  3   0
2011-01-05  0   0
2011-01-06  0   0
2011-01-01  0   0
2011-01-02  0   1
2011-01-03  4   0
2011-01-04  0   0
2011-01-05  0   0
2011-01-06  0   0
2011-01-01  0   0
2011-01-02  0   8
2011-01-03  2   0
2011-01-04  0   0
2011-01-05  0   0
2011-01-06  0   5
The data.table above can be created with the following code:

 library(data.table)
 DT = data.table(
          date = rep(as.Date('2011-01-01') + 0:5, 3),
          buy  = c(1,0,0,3,0,0,0,0,4,0,0,0,0,0,2,0,0,0),
          sell = c(0,0,2,0,0,0,0,1,0,0,0,0,0,8,0,0,0,5))
So what I want is:

   date   total_buys   total_sells
2011-01-01    1            0
2011-01-02    0            2
                and so on  
In addition, I would also like to know the overall number of buys and sells:

 total_buys   total_sells
     4            4
I tried:

 length(DT[sell > 0 | buy > 0])
 # [1] 3

which is a strange answer (I wonder why).
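A likely explanation for the 3: a data.table is a list of columns, so length() reports the number of columns (date, buy, sell) rather than the number of rows. The sketch below rebuilds the question's DT and compares length() with nrow(); the variable names active, n_cols and n_rows are only illustrative.

```r
library(data.table)

# Same data as in the question
DT <- data.table(
  date = rep(as.Date("2011-01-01") + 0:5, 3),
  buy  = c(1,0,0,3,0,0,0,0,4,0,0,0,0,0,2,0,0,0),
  sell = c(0,0,2,0,0,0,0,1,0,0,0,0,0,8,0,0,0,5)
)

# Rows with at least one buy or sell
active <- DT[sell > 0 | buy > 0]

n_cols <- length(active)  # counts columns: 3
n_rows <- nrow(active)    # counts matching rows: 8
```

Note that nrow() here counts rows with any activity, which is still not the per-column count the question asks for.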

In addition to @Jake's answer, there is also the typical melt + dcast routine, something like:

library(data.table)  # recent data.table versions provide melt() and dcast() themselves
dtL <- melt(DT, id.vars = "date")
dcast(dtL, date ~ variable, value.var = "value",
      fun.aggregate = function(x) sum(x > 0))
#         date buy sell
# 1 2011-01-01   1    0
# 2 2011-01-02   0    2
# 3 2011-01-03   2    1
# 4 2011-01-04   1    0
# 5 2011-01-05   0    0
# 6 2011-01-06   0    1
To get the other table, try:

dtL[, list(count = sum(value > 0)), by = variable]
#    variable count
# 1:      buy     4
# 2:     sell     4
Or, without melting:

DT[, lapply(.SD, function(x) sum(x > 0)), .SDcols = c("buy", "sell")]
#    buy sell
# 1:   4    4
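
The same .SD idiom should also give the per-day table if a by = date argument is added. This is a minimal sketch under that assumption, rebuilding the question's DT; the name per_day is illustrative.

```r
library(data.table)

# Same data as in the question
DT <- data.table(
  date = rep(as.Date("2011-01-01") + 0:5, 3),
  buy  = c(1,0,0,3,0,0,0,0,4,0,0,0,0,0,2,0,0,0),
  sell = c(0,0,2,0,0,0,0,1,0,0,0,0,0,8,0,0,0,5)
)

# For each date, count the positive entries in every column named in .SDcols
per_day <- DT[, lapply(.SD, function(x) sum(x > 0)),
              by = date, .SDcols = c("buy", "sell")]
print(per_day)
```

Adding further columns only requires extending .SDcols, which is why this form scales better than naming each column by hand.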

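sum adds up the buy values; I want to count them. The total number of buys and the total number of sells are 4 each.

Thanks, Jake. Can you explain how this works? It's a very neat approach.

@user1480926 Which part are you confused about? buy > 0 and sell > 0 each return a logical vector, so its sum is the number of values greater than zero. Using by in data.table lets you easily group by some variable.

@user1480926, I thought I'd share it because it is more convenient if you have more columns than just 2.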
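
The idea described in the comments (summing a logical vector, grouped with by) can be sketched as follows. This is an illustration of that idea and not necessarily the code from @Jake's answer, which is not reproduced on this page; the names x, n_positive and counts are illustrative, and the column names total_buys and total_sells are taken from the question.

```r
library(data.table)

# Summing a logical vector counts its TRUE values
x <- c(1, 0, 0, 3)
n_positive <- sum(x > 0)  # TRUE counts as 1, FALSE as 0

# Same data as in the question
DT <- data.table(
  date = rep(as.Date("2011-01-01") + 0:5, 3),
  buy  = c(1,0,0,3,0,0,0,0,4,0,0,0,0,0,2,0,0,0),
  sell = c(0,0,2,0,0,0,0,1,0,0,0,0,0,8,0,0,0,5)
)

# Per-day counts: apply the same trick within each date group
counts <- DT[, .(total_buys  = sum(buy > 0),
                 total_sells = sum(sell > 0)),
             by = date]
print(counts)
```

Dropping by = date from the last call collapses the result to a single row with the overall counts.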