R data.table: determining whether a person is new or existing

Tags: r, data.table

I have the following data table:

year      Person     Number_of_visits
2012      1          0
2013      1          4
2014      1          0
2015      1          1
2012      2          1
2013      2          5 
...
For each person, I want to determine the year of their first visit. So the desired output is:

year      Person     Number_of_visits    New?
2012      1          0                   NA
2013      1          4                   Yes
2014      1          0                   No
2015      1          1                   No
2012      2          1                   NA
2013      2          5                   No
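I thought I could perhaps use the shift function in data.table, but I don't know how. Once a person has visited, he/she is no longer new, even if a later year has no visits. If the first visit took place in 2012, there should be an NA or similar entry.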
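
For anyone who wants to reproduce the problem, here is a minimal sketch of the sample data. It contains only the six rows shown above (the rows hidden behind the "..." are not reconstructed), and the object name `dt` is an assumption chosen to match the answer further down.

```r
library(data.table)

# Only the six rows visible in the question; the elided rows are unknown.
dt <- data.table(
  year = c(2012, 2013, 2014, 2015, 2012, 2013),
  Person = c(1, 1, 1, 1, 2, 2),
  Number_of_visits = c(0, 4, 0, 1, 1, 5)
)
```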

I tried:

test <- DT[ , NEW := c(0, (2:1)[(Number_of_visits== shift(Number_of_visits)) + 1][-1]), by = Person]

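But this naturally gives me every change, whereas I only want to register the first change from 0 to more than 0 visits.

I would break it into the following steps; I'm sure the solution could be shortened.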

setorder(dt, Person, year) # Make sure the order is correct
dt[, New := "No"] # Set No as default
dt[dt[, .I[which.max(Number_of_visits > 0)], by = Person]$V1, New := "Yes"] # find first visits
dt[year == 2012, New := NA_character_] # Set NAs to 2012
dt
#    year Person Number_of_visits New
# 1: 2012      1                0  NA
# 2: 2013      1                4 Yes
# 3: 2014      1                0  No
# 4: 2015      1                1  No
# 5: 2012      2                1  NA
# 6: 2013      2                5  No
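
One caveat, offered as a hedged aside rather than a correction: `which.max()` returns 1 when a logical vector is all `FALSE`, so the steps above would mark the first row of a person who never visits as "Yes". The sketch below is one possible variant that avoids this with `cumsum()`. Person 3 is a hypothetical addition, not part of the original data, included only to show the difference.

```r
library(data.table)

dt <- data.table(
  year = c(2012, 2013, 2014, 2015, 2012, 2013, 2013, 2014),
  Person = c(1, 1, 1, 1, 2, 2, 3, 3),
  # Person 3 is hypothetical: present in two years but never visits
  Number_of_visits = c(0, 4, 0, 1, 1, 5, 0, 0)
)

setorder(dt, Person, year) # running counts below rely on this order

# A row is the first visit when it has visits and the running count
# of years with visits, within that person, is exactly 1
dt[, New := ifelse(Number_of_visits > 0 & cumsum(Number_of_visits > 0) == 1,
                   "Yes", "No"),
   by = Person]

# As in the steps above, 2012 is treated as unknown
dt[year == 2012, New := NA_character_]
```

With this variant both rows for Person 3 stay "No", while the rows for Persons 1 and 2 get the same values as in the output above.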

Please share your attempt using data.table.

Very good answers again from David and akrun. Thank you very much!