R: Assigning a value to a group based on a subgroup

In R, I have a df that looks something like this:

structure(
  list(
    `Family ID` = c("1", "1", "1", "2", "2", "2", "3", "3", "3", "3", "4", "4", "4", "4"),
    `Subject ID` = c("1", "2", "4", "1", "2", "4", "1", "2", "4", "5", "1", "2", "4", "5"),
    X = c("1", "2", "1", "1", "2", "2", "2", "1", "2", "1", "1", "2", "2", "2"),
    Y = c("1", "2", "2", "1", "2", "2", "1", "1", "2", "2", "2", "1", "2", "2")
  ), row.names = 2:15, class = "data.frame"
)
#>    Family ID Subject ID X Y
#> 2          1          1 1 1
#> 3          1          2 2 2
#> 4          1          4 1 2
#> 5          2          1 1 1
#> 6          2          2 2 2
#> 7          2          4 2 2
#> 8          3          1 2 1
#> 9          3          2 1 1
#> 10         3          4 2 2
#> 11         3          5 1 2
#> 12         4          1 1 2
#> 13         4          2 2 1
#> 14         4          4 2 2
#> 15         4          5 2 2

Created on 2021-04-15 by the reprex package (v0.3.0)

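My goal is to create a new column Z that holds the value 1 for everyone with the same Family ID if and only if a person in that family with Subject ID 4 or 5 has the value 1 in column X or column Y; otherwise Z should be 0. The result for this example would therefore look like this:

#>    Family ID Subject ID X Y Z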
#> 2          1          1 1 1 1
#> 3          1          2 2 2 1
#> 4          1          4 1 2 1
#> 5          2          1 1 1 0
#> 6          2          2 2 2 0
#> 7          2          4 2 2 0
#> 8          3          1 2 1 1
#> 9          3          2 1 1 1
#> 10         3          4 2 2 1
#> 11         3          5 1 2 1
#> 12         4          1 1 2 0
#> 13         4          2 2 1 0
#> 14         4          4 2 2 0
#> 15         4          5 2 2 0

Any help here is appreciated. Apologies in advance, as I am still new to this.

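After grouping by 'FamilyID', subset the 'X' and 'Y' columns to the rows where 'SubjectID' is 4 or 5, check whether any of those values equals 1, and join the two logical expressions with the OR (|) operator.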
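
The code for this answer did not survive on this page, so the following is only a minimal sketch of the grouped approach described above, assuming dplyr is installed and using the 13-row df1 listed elsewhere on this page. The intermediate name res and the use of as.integer() are assumptions for illustration, not necessarily what the answer's author wrote.

```r
library(dplyr)

# Data as listed elsewhere on this page (13 rows, integer columns).
df1 <- structure(list(
  FamilyID  = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L),
  SubjectID = c(1L, 2L, 4L, 1L, 2L, 1L, 2L, 4L, 5L, 1L, 2L, 4L, 5L),
  X = c(1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L),
  Y = c(1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L)
), class = "data.frame", row.names = c(NA, -13L))

res <- df1 %>%
  group_by(FamilyID) %>%
  # Within each family, look only at subjects 4 and 5 and test X and Y for a 1.
  # any() on an empty subset is FALSE, so families without subject 4 or 5 get 0.
  mutate(Z = as.integer(
    any(X[SubjectID %in% 4:5] == 1) | any(Y[SubjectID %in% 4:5] == 1)
  )) %>%
  ungroup()

print(res)
```

Because the expression inside mutate() evaluates to a single value per group, dplyr recycles it to every row of that family, which is what puts the same Z on all family members.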
-output
# A tibble: 13 x 5
#   FamilyID SubjectID     X     Y     Z
#      <int>     <int> <int> <int> <int>
# 1        1         1     1     1     1
# 2        1         2     2     2     1
# 3        1         4     1     2     1
# 4        2         1     1     1     0
# 5        2         2     2     2     0
# 6        3         1     2     1     1
# 7        3         2     1     1     1
# 8        3         4     2     2     1
# 9        3         5     1     2     1
#10        4         1     2     2     0
#11        4         2     2     2     0
#12        4         4     2     2     0
#13        4         5     2     2     0

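Data

df1 is the 13-row data frame built with the structure() call listed near the end of this page.

Special thanks to dear @akrun for the helpful suggestion:
You can also use the following solution. I used the data provided by dear @akrun: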
library(dplyr)
library(purrr)

df1 %>%
  mutate(Z = pmap_dbl(list(SubjectID, X, Y), ~ if_else(..1 %in% c(4, 5) & any(c(..2, ..3) == 1), 1, 0))) %>%
  group_by(FamilyID) %>%
  mutate(Z = if_else(any(Z == 1), 1, 0))

# A tibble: 13 x 5
# Groups:   FamilyID [4]
   FamilyID SubjectID     X     Y     Z
      <int>     <int> <int> <int> <dbl>
 1        1         1     1     1     1
 2        1         2     2     2     1
 3        1         4     1     2     1
 4        2         1     1     1     0
 5        2         2     2     2     0
 6        3         1     2     1     1
 7        3         2     1     1     1
 8        3         4     2     2     1
 9        3         5     1     2     1
10        4         1     2     2     0
11        4         2     2     2     0
12        4         4     2     2     0
13        4         5     2     2     0


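@akrun Thank you very much. Dear Arun, after our recent discussions I have been thinking about row-wise solutions using pmap, and this is what I came up with.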
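# Base R alternative: flag every family in which a subject with ID 4 or 5
# has X == 1 or Y == 1. drop = FALSE keeps the subset a matrix even when
# only one row matches, so rowSums() still works.
df1$Z <- with(df1, +(FamilyID %in% FamilyID[SubjectID %in% 
       4:5][rowSums(cbind(X, Y)[SubjectID %in% 4:5, , drop = FALSE] == 1) > 0]))
df1$Z
#[1] 1 1 1 0 0 1 1 1 1 0 0 0 0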
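
The base R one-liner above is compact but dense. As a hedged alternative, the same rule can be split into two steps: a per-row flag, then a per-family maximum with ave(). This is only a sketch; the helper name hit is an assumption introduced here, and df1 is the data listed on this page.

```r
# Data as listed elsewhere on this page (13 rows, integer columns).
df1 <- structure(list(
  FamilyID  = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L),
  SubjectID = c(1L, 2L, 4L, 1L, 2L, 1L, 2L, 4L, 5L, 1L, 2L, 4L, 5L),
  X = c(1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L),
  Y = c(1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L)
), class = "data.frame", row.names = c(NA, -13L))

# Step 1: per-row flag, TRUE when the subject is 4 or 5 and X or Y equals 1.
hit <- with(df1, SubjectID %in% 4:5 & (X == 1 | Y == 1))

# Step 2: ave() applies max() within each FamilyID and returns a vector
# aligned with the original rows, so one hit marks the whole family.
df1$Z <- ave(as.integer(hit), df1$FamilyID, FUN = max)

print(df1)
```

Splitting the logic this way keeps the row-level condition readable and avoids the nested indexing, at the cost of one temporary vector.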

df1 <- structure(list(FamilyID = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 
4L, 4L, 4L, 4L), SubjectID = c(1L, 2L, 4L, 1L, 2L, 1L, 2L, 4L, 
5L, 1L, 2L, 4L, 5L), X = c(1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 
2L, 2L, 2L, 2L), Y = c(1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 
2L, 2L, 2L)), class = "data.frame", row.names = c(NA, -13L))
