Using R to compute a "number of past visits" column showing the count of appointments before the current visit

Tags: r, count, data.table

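I have a large set of historical customer appointment records (more than 1 million rows). Each row records the guest's ID, the appointment date, and the appointment status (1 = showed up, 0 = no-show). A sample (test) is shown below.

I need to compute, for each guest, the number of appointments made before the current one. The data is sorted by date in ascending order.

I tried doing the calculation with data.table(). A sample of the result, testWithVisit, is shown below. The approach works well for small tables, but it becomes very slow once the table exceeds about 10,000 rows, and I cannot finish the calculation for all 1 million rows. Does anyone have an elegant solution? Thanks in advance.

The code I used to generate the test data and compute testWithVisit is at the bottom.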

> test
      ID                Date Status Index
 1: 1002 2012-01-11 03:46:27      1     1
 2: 1001 2012-02-17 10:15:59      1     2
 3: 1002 2012-02-26 13:18:42      1     3
 4: 1001 2012-02-27 18:48:00      1     4
 5: 1004 2012-03-11 05:40:36      1     5
 6: 1004 2012-03-17 06:06:05      0     6
 7: 1008 2012-03-17 14:41:53      0     7
 8: 1008 2012-03-21 13:55:51      1     8
 9: 1008 2012-03-22 22:30:42      0     9
10: 1005 2012-03-29 09:00:39      1    10
11: 1005 2012-04-04 02:46:54      1    11
12: 1004 2012-04-05 22:53:05      1    12
13: 1006 2012-04-11 19:53:10      0    13
14: 1007 2012-04-14 17:19:07      1    14
15: 1003 2012-04-16 08:28:26      1    15
16: 1007 2012-04-16 19:26:57      1    16
17: 1001 2012-04-17 15:43:26      1    17
18: 1008 2012-04-21 07:12:20      0    18
19: 1004 2012-04-26 06:44:01      0    19
20: 1001 2012-05-10 13:17:56      1    20
21: 1005 2012-05-10 18:56:17      1    21
22: 1008 2012-05-11 08:58:28      1    22
23: 1001 2012-05-16 08:20:22      1    23
24: 1003 2012-06-06 04:15:58      1    24
25: 1006 2012-06-11 12:01:15      1    25
26: 1008 2012-06-20 14:06:22      1    26
27: 1002 2012-06-21 05:18:20      1    27
28: 1008 2012-06-29 16:07:28      0    28
29: 1002 2012-07-02 09:42:15      1    29
30: 1005 2012-07-06 22:45:24      1    30
31: 1007 2012-07-08 01:51:51      1    31
32: 1001 2012-08-12 07:04:49      1    32
33: 1006 2012-08-29 04:09:09      1    33
34: 1006 2012-09-25 19:37:58      0    34
35: 1003 2012-10-07 06:20:29      0    35
36: 1002 2012-10-08 19:16:35      0    36
37: 1001 2012-10-11 07:38:40      0    37
38: 1001 2012-10-24 10:58:16      0    38
39: 1005 2012-10-28 16:28:39      0    39
40: 1008 2012-10-30 01:57:52      1    40
41: 1006 2012-11-04 09:14:35      1    41
42: 1007 2012-11-11 10:56:59      0    42
43: 1008 2012-11-13 17:05:58      0    43
44: 1001 2012-11-17 08:38:36      1    44
45: 1005 2012-11-26 02:49:51      1    45
46: 1008 2012-11-26 06:12:53      0    46
47: 1005 2012-11-29 17:34:43      1    47
48: 1001 2012-11-29 23:25:36      0    48
49: 1006 2012-12-14 17:35:57      0    49
50: 1002 2012-12-19 08:36:07      1    50
      ID                Date Status Index
> testWithVisit
    Index   ID                Date Status Num_Visit Num_Show
 1:     1 1002 2012-01-11 03:46:27      1         0        0
 2:     2 1001 2012-02-17 10:15:59      1         0        0
 3:     3 1002 2012-02-26 13:18:42      1         1        1
 4:     4 1001 2012-02-27 18:48:00      1         1        1
 5:     5 1004 2012-03-11 05:40:36      1         0        0
 6:     6 1004 2012-03-17 06:06:05      0         1        1
 7:     7 1008 2012-03-17 14:41:53      0         0        0
 8:     8 1008 2012-03-21 13:55:51      1         1        0
 9:     9 1008 2012-03-22 22:30:42      0         2        1
10:    10 1005 2012-03-29 09:00:39      1         0        0
11:    11 1005 2012-04-04 02:46:54      1         1        1
12:    12 1004 2012-04-05 22:53:05      1         2        1
13:    13 1006 2012-04-11 19:53:10      0         0        0
14:    14 1007 2012-04-14 17:19:07      1         0        0
15:    15 1003 2012-04-16 08:28:26      1         0        0
16:    16 1007 2012-04-16 19:26:57      1         1        1
17:    17 1001 2012-04-17 15:43:26      1         2        2
18:    18 1008 2012-04-21 07:12:20      0         3        1
19:    19 1004 2012-04-26 06:44:01      0         3        2
20:    20 1001 2012-05-10 13:17:56      1         3        3
21:    21 1005 2012-05-10 18:56:17      1         2        2
22:    22 1008 2012-05-11 08:58:28      1         4        1
23:    23 1001 2012-05-16 08:20:22      1         4        4
24:    24 1003 2012-06-06 04:15:58      1         1        1
25:    25 1006 2012-06-11 12:01:15      1         1        0
26:    26 1008 2012-06-20 14:06:22      1         5        2
27:    27 1002 2012-06-21 05:18:20      1         2        2
28:    28 1008 2012-06-29 16:07:28      0         6        3
29:    29 1002 2012-07-02 09:42:15      1         3        3
30:    30 1005 2012-07-06 22:45:24      1         3        3
31:    31 1007 2012-07-08 01:51:51      1         2        2
32:    32 1001 2012-08-12 07:04:49      1         5        5
33:    33 1006 2012-08-29 04:09:09      1         2        1
34:    34 1006 2012-09-25 19:37:58      0         3        2
35:    35 1003 2012-10-07 06:20:29      0         2        2
36:    36 1002 2012-10-08 19:16:35      0         4        4
37:    37 1001 2012-10-11 07:38:40      0         6        6
38:    38 1001 2012-10-24 10:58:16      0         7        6
39:    39 1005 2012-10-28 16:28:39      0         4        4
40:    40 1008 2012-10-30 01:57:52      1         7        3
41:    41 1006 2012-11-04 09:14:35      1         4        2
42:    42 1007 2012-11-11 10:56:59      0         3        3
43:    43 1008 2012-11-13 17:05:58      0         8        4
44:    44 1001 2012-11-17 08:38:36      1         8        6
45:    45 1005 2012-11-26 02:49:51      1         5        4
46:    46 1008 2012-11-26 06:12:53      0         9        4
47:    47 1005 2012-11-29 17:34:43      1         6        5
48:    48 1001 2012-11-29 23:25:36      0         9        7
49:    49 1006 2012-12-14 17:35:57      0         5        3
50:    50 1002 2012-12-19 08:36:07      1         5        4
    Index   ID                Date Status Num_Visit Num_Show

library(data.table)

# Generate test data: 50 appointments for 8 guests at random times in 2012.
test = data.frame(list(ID = sample(1001:1008, size = 50, replace = TRUE)))
test$Date = as.POSIXct(sample(as.POSIXct("2012-01-01"):as.POSIXct("2012-12-31"), size = 50, replace = FALSE), origin = as.POSIXct("1970-01-01"))
test$Status = sample(0:1, size = nrow(test), replace = TRUE, prob = c(0.4, 0.6))
test = test[order(test$Date), ]   # ascending date order
test$Index = c(1:nrow(test))      # row position after sorting

# Compute Num_Visit and Num_Show.
# Every group is a single row (by = Index), and each group scans the whole table.
test = data.table(test)
setkey(test, "Index")
counts = test[, list(Num_Visit = length(Index[test$Index < Index & test$ID == ID]),
                       Num_Show = length(Index[test$Index < Index & test$ID == ID 
                                               & test$Status == 1])), 
           by = key(test)]

# Join the counts back onto the original rows by Index.
testWithVisit = test[counts, ]
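
The code above scans the whole table once per row, so the work grows roughly with the square of the row count. Because the rows are already in date order, the same two columns can be read off as running counts within each guest. Below is a minimal base R sketch of that idea, using only the first nine rows of the test table; the name `visits` is introduced here for illustration and is not part of the original code.

```r
# First nine rows of the `test` table shown above (ID and Status only).
visits <- data.frame(
  ID     = c(1002, 1001, 1002, 1001, 1004, 1004, 1008, 1008, 1008),
  Status = c(1, 1, 1, 1, 1, 0, 0, 1, 0)
)

# Assumption: rows are already in ascending date order.
# Prior appointments = position within the guest's own rows, minus one.
visits$Num_Visit <- ave(visits$Status, visits$ID,
                        FUN = function(s) seq_along(s) - 1)

# Prior shows = running total of Status, excluding the current row.
visits$Num_Show <- ave(visits$Status, visits$ID,
                       FUN = function(s) cumsum(s) - s)

print(visits)
```

The resulting Num_Visit and Num_Show values can be compared against rows 1 to 9 of testWithVisit.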
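# The expression from the question that compares against the full table for every row:
#   Num_Visit = length(Index[test$Index < Index & test$ID == ID])

# Grouping by ID instead gives running counts within each guest.
test[, list(Date = Date,
            Status = Status,
            Num_Visit = seq_along(.I) - 1,          # appointments before this row
            Num_Show  = cumsum(Status) - Status),   # shows before this row
     by = ID]

# Keying by guest and date sorts the table so each guest's rows are chronological.
setkey(test, ID, Date)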
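
As a hedged check of the grouped approach, the sketch below adds both columns by reference with `:=` (so the original row order is kept) and compares them with a brute-force per-row count on a small random table. The table `demo`, the seed, and the size `n` are assumptions made for this illustration and do not come from the original post.

```r
library(data.table)

set.seed(42)  # arbitrary seed, only to make the sketch repeatable
n <- 200      # small enough for the brute-force check below

# Hypothetical data in the same shape as `test`, already in ascending date order.
demo <- data.table(
  ID     = sample(1001:1008, n, replace = TRUE),
  Date   = as.POSIXct("2012-01-01", tz = "UTC") + sort(sample.int(366 * 86400, n)),
  Status = sample(0:1, n, replace = TRUE, prob = c(0.4, 0.6))
)
demo[, Index := .I]

# Running counts per guest, added in place without reordering rows.
demo[, `:=`(Num_Visit = seq_len(.N) - 1L,
            Num_Show  = cumsum(Status) - Status),
     by = ID]

# Brute-force reference: one full scan per row, as in the question.
brute_visit <- sapply(demo$Index, function(i)
  sum(demo$Index < i & demo$ID == demo$ID[i]))
brute_show <- sapply(demo$Index, function(i)
  sum(demo$Index < i & demo$ID == demo$ID[i] & demo$Status == 1))

stopifnot(all(demo$Num_Visit == brute_visit),
          all(demo$Num_Show == brute_show))
```

Subtracting `Status` rather than a constant 1 keeps Num_Show at 0 for a guest whose first appointment was a no-show.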