
R: Generate a new variable by group based on values in the preceding and following rows


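I am working with panel data on multiple subjects (id), in which an event (first_occurrence) happens on a different day for each subject. My goal is to create a new variable (result) that equals 1 on the two days before first_occurrence, on the day of first_occurrence, and on the two days after it, and 0 otherwise.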

Here is an example with sample data and the desired output:

data <- structure(list(id = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 
2, 3, 3, 3, 3, 3, 3, 3), day = c(0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 
2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 6), first_occurrence = c(0, 0, 
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1), desired_output = c(1, 
1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1)), .Names = c("id", 
"day", "first_occurrence", "desired_output"), row.names = c(NA, 
-21L), class = "data.frame")

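Here is one way. You can use ave to work group by group, then which.max to find the position of the first occurrence within each group, and then compute how far every other row is from that position.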

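# Per id: distance (in rows) of each row from the first row where
# first_occurrence is 1; rows at most 2 away get 1, all others get 0
data$close <- (with(data, ave(first_occurrence, id, FUN = function(x)
    abs(seq_along(x) - which.max(x))
)) <= 2) + 0
data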
This gives

   id day first_occurrence desired_output close
1   1   0                0              1     1
2   1   1                0              1     1
3   1   2                1              1     1
4   1   3                0              1     1
5   1   4                0              1     1
6   1   5                0              0     0
7   1   6                0              0     0
8   1   7                0              0     0
9   2   0                0              0     0
10  2   1                0              0     0
11  2   2                0              1     1
12  2   3                0              1     1
13  2   4                1              1     1
14  2   5                0              1     1
15  3   0                0              0     0
16  3   1                0              0     0
17  3   2                0              0     0
18  3   3                0              0     0
19  3   4                0              1     1
20  3   5                0              1     1
21  3   6                1              1     1

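as desired. Note that this method assumes the data are sorted by day. Because it measures distance in rows rather than in days, it also assumes one row per day within each id.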
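To see what the function passed to ave does, the per-group steps can be traced on a single vector. The sketch below uses the first_occurrence values for id 2 from the sample data; the names pos, dist and flag are only illustrative and do not appear in the answer.

```r
# first_occurrence values for id 2 in the sample data
x <- c(0, 0, 0, 0, 1, 0)

pos  <- which.max(x)              # index of the first maximum, i.e. the row holding the 1
dist <- abs(seq_along(x) - pos)   # distance in rows of every row from that row
flag <- (dist <= 2) + 0           # logical converted to 1/0

flag
```

One caveat worth keeping in mind: which.max returns 1 for a vector that contains only zeros, so a group with no event at all would have its first three rows flagged. If such groups can occur, they need separate handling.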

Here is another approach, using the dplyr package:

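library(dplyr)        # load the package; install it first with install.packages("dplyr")

# %>% replaces the older %.% operator, which current dplyr releases no longer provide
data %>%
  arrange(id, day) %>%    # sort by id and day; drop this line if the data are already sorted
  group_by(id) %>%
  mutate(row = row_number(),    # row position within each id
         result = ifelse(abs(row - row[first_occurrence == 1]) <= 2, 1, 0)) %>%
  select(-row)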

#   id day first_occurrence desired_output result
#1   1   0                0              1      1
#2   1   1                0              1      1
#3   1   2                1              1      1
#4   1   3                0              1      1
#5   1   4                0              1      1
#6   1   5                0              0      0
#7   1   6                0              0      0
#8   1   7                0              0      0
#9   2   0                0              0      0
#10  2   1                0              0      0
#11  2   2                0              1      1
#12  2   3                0              1      1
#13  2   4                1              1      1
#14  2   5                0              1      1
#15  3   0                0              0      0
#16  3   1                0              0      0
#17  3   2                0              0      0
#18  3   3                0              0      0
#19  3   4                0              1      1
#20  3   5                0              1      1
#21  3   6                1              1      1
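
Both answers measure distance in rows, so they rely on every id having one row per consecutive day. As a hedged sketch of a variant that compares the day values themselves, the base R code below works on a small made-up data frame (gaps, which is not part of the question) where id 1 has no row for day 3. The names gaps and event_day are assumptions introduced for illustration.

```r
# Hypothetical data: id 1 is missing day 3, id 2 is complete
gaps <- data.frame(
  id               = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  day              = c(0, 1, 2, 4, 5, 0, 1, 2, 3),
  first_occurrence = c(0, 0, 1, 0, 0, 0, 0, 0, 1)
)

# Day of the event within each id, repeated on every row of that id
event_day <- with(gaps, ave(
  ifelse(first_occurrence == 1, day, NA), id,
  FUN = function(d) min(d, na.rm = TRUE)
))

# 1 when the row's day is within 2 days of the event day, else 0
gaps$result <- (abs(gaps$day - event_day) <= 2) + 0
gaps
```

In this sketch day 5 of id 1 is not flagged, because it lies three days after the event even though it is only two rows away. A group without any event would produce a warning from min and end up with result 0 on all of its rows.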

Very interesting approach. Thanks for the thorough description and the neat use of the new dplyr package.
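
# Variant: number the rows once over the whole sorted data set, then group.
# Within an id the row differences are the same, so the result is unchanged.
data %>%
  arrange(id, day) %>%    # sort by id and day; drop this line if the data are already sorted
  mutate(row = row_number()) %>%
  group_by(id) %>%
  mutate(result = ifelse(abs(row - row[first_occurrence == 1]) <= 2, 1, 0)) %>%
  select(-row)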