Replace certain sequences with NA in R?


I have a data frame of the following form:

Set_1     Set_2     Set_3     Set_4     Set_5     Set_6     Set_7
abc89     abc62     4:5       abc513    abc512    abc81     abc10
abc6      pop       abc11     abc4      giant     1:3       abc15
abc90     abc16     abc123    abc33     abc22     abc08     9
11:1      abc15     abc72     abc36     abc57     abc9      abc55

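I want to change any cell that starts with "abc" to NA. I also want to change any cell that contains a colon to NA. I would like my output to be a data.frame. How can this be done easily in R?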

You can use grep to get the indices of the elements that start with abc, and replace those elements with NA by looping over the columns with lapply:

df1[] <- lapply(df1, function(x) replace(x, grep('^abc', x), NA))
df1
#  Row1 Row2 Row3 Row4  Row5 Row6 Row7
#1 <NA> <NA>   45 <NA>  <NA> <NA> <NA>
#2 <NA>  pop <NA> <NA> giant   13 <NA>
#3 <NA> <NA> <NA> <NA>  <NA> <NA>    9
#4  111 <NA> <NA> <NA>  <NA> <NA> <NA>
By using [], we keep the same structure as the original dataset "df1" (a data.frame) while replacing the elements in its columns.
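To make the role of [] concrete, here is a minimal sketch on a small made-up data frame. The names toy, blank_abc, as_list and kept are hypothetical and not part of the original answer. It compares plain assignment of the lapply result with assignment through [].

```r
# Hypothetical toy data (not the question's data), character columns only
toy <- data.frame(a = c("abc1", "x"), b = c("y", "abc2"),
                  stringsAsFactors = FALSE)

# Helper: set entries that start with "abc" to NA
blank_abc <- function(x) replace(x, grep("^abc", x), NA)

# Without []: lapply() returns a plain list, so the data.frame class is lost
as_list <- lapply(toy, blank_abc)
class(as_list)   # "list"

# With []: the list elements are assigned into the existing columns,
# so the object stays a data.frame
kept <- toy
kept[] <- lapply(kept, blank_abc)
class(kept)      # "data.frame"
```

The design point is that lapply always returns a list; assigning into kept[] reuses the existing data frame's attributes (class, row names, column names) rather than replacing the object.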

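df1 also has some sequences like "87:55" and "7:132"; basically I want to turn any cell that contains a colon into NA as well. Any ideas?

@Evan Just update the regular expression so that it also searches for a colon: "^abc|:"

@Evan Updated based on Moix's comment. @Moix Thanks for the comment.

Thanks. Is the output a matrix?

 df2[] <- lapply(df2, function(x) replace(x, grep('^abc|:', x), NA))
 is.data.frame(df2)
 #[1] TRUE

Data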
df1 <- structure(list(
  Row1 = c("abc89", "abc6", "abc90", "111"),
  Row2 = c("abc62", "pop", "abc16", "abc15"),
  Row3 = c("45", "abc11", "abc123", "abc72"),
  Row4 = c("abc513", "abc4", "abc33", "abc36"),
  Row5 = c("abc512", "giant", "abc22", "abc57"),
  Row6 = c("abc81", "13", "abc08", "abc9"),
  Row7 = c("abc10", "abc15", "9", "abc55")),
  .Names = c("Row1", "Row2", "Row3", "Row4", "Row5", "Row6", "Row7"),
  class = "data.frame", row.names = c(NA, -4L))

df2 <- structure(list(
  Set_1 = c("abc89", "abc6", "abc90", "11:1"),
  Set_2 = c("abc62", "pop", "abc16", "abc15"),
  Set_3 = c("4:5", "abc11", "abc123", "abc72"),
  Set_4 = c("abc513", "abc4", "abc33", "abc36"),
  Set_5 = c("abc512", "giant", "abc22", "abc57"),
  Set_6 = c("abc81", "1:3", "abc08", "abc9"),
  Set_7 = c("abc10", "abc15", "9", "abc55")),
  .Names = c("Set_1", "Set_2", "Set_3", "Set_4", "Set_5", "Set_6", "Set_7"),
  class = "data.frame", row.names = c(NA, -4L))
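
As a possible alternative to looping with replace, the same pattern can be applied through a logical mask. This is only a sketch on made-up data; toy and mask are hypothetical names, and it assumes all columns are character and the data frame has more than one row (so that sapply returns a matrix).

```r
# Hypothetical toy data (not the question's df2), character columns only
toy <- data.frame(
  Set_1 = c("abc89", "11:1", "pop"),
  Set_2 = c("4:5", "giant", "abc4"),
  stringsAsFactors = FALSE
)

# Logical matrix: TRUE where a cell starts with "abc" or contains a colon
mask <- sapply(toy, function(x) grepl("^abc|:", x))

# Assigning through the mask sets the matching cells to NA in place
toy[mask] <- NA

is.data.frame(toy)   # TRUE
```

grepl returns a logical vector of the same length as its input, which is why it suits a mask, whereas grep returns the positions of the matches.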