R data.table: field from the row of first occurrence


I am looking for an elegant one-liner for a grouped operation in data.table.

I have a data table as follows:

library(data.table)
library(lubridate)

dt.master <- data.table(user = c(1000, 1002, 2008, 3005, 1000, 1002, 1002),
                    target = c(50000, 50004, 50501, 50001, 50000, 50000, 50004),
                    channel = c("A", "B", "C", "A", "B", "A", "C"),
                    date = c(dmy("10/02/2018"), dmy("11/04/2018"), dmy("14/03/2018"), dmy("02/03/2018"), dmy("05/01/2018"), dmy("08/05/2018"), dmy("05/03/2018")))
dt.master[, first_channel := channel[which.min(date)], keyby=.(user, target)]
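For each (user, target) group, I want to find the channel of the first occurrence (the row with the earliest date) and add it to dt.master as a new column. That is: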

   user target channel       date first_channel
1: 1000  50000       A 2018-02-10             B
2: 1000  50000       B 2018-01-05             B
3: 1002  50000       A 2018-05-08             A
4: 1002  50004       B 2018-04-11             C
5: 1002  50004       C 2018-03-05             C
6: 2008  50501       C 2018-03-14             C
7: 3005  50001       A 2018-03-02             A
Currently, I do this in two steps:

First, I extract the rows of first occurrence:

dt.result <- dt.master[dt.master[, .(first_interest = .I[which.min(date)]), by = c("user", "target")]$first_interest,]
Is there a way to do this without a merge? I am sure there must be a way to modify the first line, but I cannot find it.

Thanks a lot.
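For reference, the two-step approach described above might look like the sketch below. The second step is not shown in the question, so the update join used here is only an assumption about what the merge could be; `dt` and `first_rows` are illustrative names, and `as.Date` is used instead of lubridate to keep the example self-contained.

```r
library(data.table)

# Same sample data as in the question, built with base as.Date
dt <- data.table(
  user    = c(1000, 1002, 2008, 3005, 1000, 1002, 1002),
  target  = c(50000, 50004, 50501, 50001, 50000, 50000, 50004),
  channel = c("A", "B", "C", "A", "B", "A", "C"),
  date    = as.Date(c("2018-02-10", "2018-04-11", "2018-03-14", "2018-03-02",
                      "2018-01-05", "2018-05-08", "2018-03-05"))
)

# Step 1: .I holds row numbers, so this selects the earliest row of each group
idx <- dt[, .(first_interest = .I[which.min(date)]), by = .(user, target)]$first_interest
first_rows <- dt[idx]

# Step 2 (assumed): update join that copies the channel back onto every row
dt[first_rows, on = .(user, target), first_channel := i.channel]
```

This is the merge the question wants to avoid; it is shown only to make the comparison with a grouped one-liner concrete.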

You can update by reference, by group, as follows:

library(data.table)
library(lubridate)

dt.master <- data.table(user = c(1000, 1002, 2008, 3005, 1000, 1002, 1002),
                    target = c(50000, 50004, 50501, 50001, 50000, 50000, 50004),
                    channel = c("A", "B", "C", "A", "B", "A", "C"),
                    date = c(dmy("10/02/2018"), dmy("11/04/2018"), dmy("14/03/2018"), dmy("02/03/2018"), dmy("05/01/2018"), dmy("08/05/2018"), dmy("05/03/2018")))
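# Date columns already sort chronologically, so no conversion is needed
dt.master <- dt.master[order(user, date)]
# After sorting, the first row of each (user, target) group is the earliest one
dt.master[, first_occ := channel[1], by = .(user, target)]
# Same values without relying on row order
dt.master[, first_channel := channel[which.min(date)], keyby = .(user, target)]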
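
As a quick check of the `which.min` idiom, here is a minimal self-contained sketch. It assumes base `as.Date` in place of lubridate and uses `by` rather than `keyby` so that the original row order is kept; `dt` is an illustrative name.

```r
library(data.table)

dt <- data.table(
  user    = c(1000, 1002, 2008, 3005, 1000, 1002, 1002),
  target  = c(50000, 50004, 50501, 50001, 50000, 50000, 50004),
  channel = c("A", "B", "C", "A", "B", "A", "C"),
  date    = as.Date(c("2018-02-10", "2018-04-11", "2018-03-14", "2018-03-02",
                      "2018-01-05", "2018-05-08", "2018-03-05"))
)

# which.min(date) is the position of the earliest date within each group;
# the single matching channel is recycled to every row of that group
dt[, first_channel := channel[which.min(date)], by = .(user, target)]
```

Because `:=` modifies `dt` in place, no intermediate table and no merge are needed.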

There is a setorder function in data.table for sorting in place.
@DavidArenburg Thanks, I know. I did not think it mattered for this example!
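
The `setorder` suggestion from the comment could be applied roughly as follows. This is a sketch under the same assumptions as before (base `as.Date`, illustrative name `dt`); sorting by `target` as well is a choice made here so that the rows come out grouped as in the expected output.

```r
library(data.table)

dt <- data.table(
  user    = c(1000, 1002, 2008, 3005, 1000, 1002, 1002),
  target  = c(50000, 50004, 50501, 50001, 50000, 50000, 50004),
  channel = c("A", "B", "C", "A", "B", "A", "C"),
  date    = as.Date(c("2018-02-10", "2018-04-11", "2018-03-14", "2018-03-02",
                      "2018-01-05", "2018-05-08", "2018-03-05"))
)

# setorder reorders the rows by reference, so no reassignment is needed
setorder(dt, user, target, date)

# With rows sorted by date inside each group, the first channel is the earliest
dt[, first_channel := channel[1L], by = .(user, target)]
```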