R: How to select specific observations from columns based on a partial string match of the column names


Tags: r, regex, dplyr


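My dataset contains a large number of columns whose names start with "dis_…".

The values in these columns are 0 (no disease) or 1 (disease). I want to create a data frame of observations that have 1 for one specific disease and 0 for all other diseases.

I tried the following: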

istroke <- filter(onlyCRP, dis_ep0009 == 1 & grep("dis_" == 0))

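I need an additional column, e.g. IS, based on conditions across these 3 columns (I actually have 29 "dis_…" columns):

If dis_ep0009 == 1, then IS == 1 (regardless of whether any other "dis_…" column is 0 or 1).

If dis_ep0009 == 0 and dis_epxxx == 1, I want to drop these observations.

If dis_ep0009 == 0 and dis_epxxx == 0, I want to code IS == 0.

The resulting table should therefore look like this: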

dis_ep0009  dis_epxxx   dis_epxxx    IS
 0            0             0        0
 0            1             0        drop
 0            0             1        drop
 1            0             1        1
 0            0             0        0
 0            0             0        0
 1            1             1        1

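I have tried pairing filter (dplyr) with grep and ifelse statements, but could not make sense of it. Essentially, it should be something as simple as this (not working code):

istroke <- filter(df, ifelse(dis_ep0009 == 1, 1, ifelse(dis_ep0009 == 0 & grep("dis_", names(df)) == 0, 0, ifelse(dis_ep0009 == 0 & grep("dis_", names(df)) == 1, drop())))

See the comments in the code and tell me whether this is what you want: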
specific_disease <- "dis_ep0009"
disease_cols <- grep("^dis_", names(onlyCRP), value = TRUE)  # all columns whose names start with "dis_"
disease_cols <- setdiff(disease_cols, specific_disease)      # all of these columns except your specific disease
# Boolean column: TRUE if the row has any disease other than the specific one
onlyCRP$any_other_disease <- apply(onlyCRP[, disease_cols, drop = FALSE] == 1, 1, any)
# The subset with only your specific disease and no other. Note [[specific_disease]]:
# onlyCRP$specific_disease would look for a column literally named "specific_disease" and return NULL.
onlyCRP[onlyCRP[[specific_disease]] == 1 & !onlyCRP$any_other_disease, ]
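The subset above keeps only rows that have the specific disease and nothing else. If the goal is instead the table from the question (IS == 1 whenever dis_ep0009 == 1, rows with only another disease dropped, IS == 0 when every "dis_" column is 0), a minimal base R sketch might look like the following. The toy data frame is copied from that table, and the column names dis_epAAAA and dis_epBBBB are hypothetical stand-ins for the elided "dis_epxxx" columns.

```r
# Toy data taken from the table in the question.
# dis_epAAAA / dis_epBBBB are hypothetical names for the "dis_epxxx" columns.
onlyCRP <- data.frame(
  dis_ep0009 = c(0, 0, 0, 1, 0, 0, 1),
  dis_epAAAA = c(0, 1, 0, 0, 0, 0, 1),
  dis_epBBBB = c(0, 0, 1, 1, 0, 0, 1)
)

specific_disease <- "dis_ep0009"
other_cols <- setdiff(grep("^dis_", names(onlyCRP), value = TRUE), specific_disease)

# TRUE where the row has the specific disease
has_specific <- onlyCRP[[specific_disease]] == 1
# TRUE where the row has at least one of the other diseases
has_other <- rowSums(onlyCRP[, other_cols, drop = FALSE] == 1) > 0

# Keep rows with the specific disease, plus rows with no disease at all
istroke <- onlyCRP[has_specific | !has_other, ]
istroke$IS <- as.integer(istroke[[specific_disease]] == 1)
istroke
```

Using rowSums on the logical matrix avoids the row-wise apply call, and drop = FALSE keeps the selection a data frame even if only one other "dis_" column exists.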


I think I need to clarify further: I want all the observations coded for dis_ep0009 and 0 in the other "dis.." columns. The extra Boolean column does not seem to serve that purpose. Nevertheless, I ran the sequence and the resulting df has 0 observations. I was also hoping for something simpler, preferably dplyr-based code. Thanks.

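Since the follow-up comment asks for something dplyr-based, here is one possible sketch of the three rules from the question in a single pipeline. It assumes dplyr 1.0.4 or later (for if_any), and the column names dis_epAAAA and dis_epBBBB are again hypothetical placeholders for the real "dis_epxxx" columns. It is not the answerer's method, just an illustration.

```r
library(dplyr)

# Toy data taken from the table in the question.
# dis_epAAAA / dis_epBBBB are hypothetical names for the "dis_epxxx" columns.
onlyCRP <- tibble(
  dis_ep0009 = c(0, 0, 0, 1, 0, 0, 1),
  dis_epAAAA = c(0, 1, 0, 0, 0, 0, 1),
  dis_epBBBB = c(0, 0, 1, 1, 0, 0, 1)
)

istroke <- onlyCRP %>%
  # Keep every row with dis_ep0009 == 1, plus rows where none of the
  # other "dis_" columns equals 1 (this drops "other disease only" rows).
  filter(
    dis_ep0009 == 1 |
      !if_any(c(starts_with("dis_"), -dis_ep0009), ~ .x == 1)
  ) %>%
  # After filtering, IS is simply the specific-disease indicator.
  mutate(IS = as.integer(dis_ep0009 == 1))

istroke
```

The selection c(starts_with("dis_"), -dis_ep0009) picks up all 29 "dis_" columns by prefix without listing them, so nothing has to change when more disease columns are added.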