Search and encode strings using a list in R


I have a list of character vectors and a data.frame of strings, as shown below:

lst <- list( c("key", "parking", "velvet"), c("sumatra", "cap"), c("sled", "card"), c("notice", "piece", "page"))

df <-  c("key", "sumatra", "band", "cattle", "camp", "sled", "page", "wire", "key", "card", "cap", "page")
df <- data.frame(df, stringsAsFactors=FALSE)

How can I code each string in df according to the list element that contains it, in R?

I assume you are starting with the following codes:

MyCode <- c("G1", "G2", "G3", "G4", "G1", "G3", "G2", "G4")

But you need to know which rows they belong in. Try this:

df$code<-NA
df[df$df %in% unlist(lst),]$code<-MyCode

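The unlist part flattens the list into a single vector. The %in% part selects every row where df$df matches an entry in lst. Rows without a match keep NA in df$code. Note that MyCode must supply exactly one code per matching row, in row order (here there are 8 matching rows and 8 codes).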
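As a rough sketch of the same idea without typing MyCode by hand, the codes can be derived from the position of each string in the list. The names flat, hit and grp are made up for this illustration, and the "G" prefix simply follows the codes used above:

```r
lst <- list(c("key", "parking", "velvet"), c("sumatra", "cap"),
            c("sled", "card"), c("notice", "piece", "page"))
df <- data.frame(df = c("key", "sumatra", "band", "cattle", "camp", "sled",
                        "page", "wire", "key", "card", "cap", "page"),
                 stringsAsFactors = FALSE)

flat <- unlist(lst)                            # all strings in one vector
grp  <- rep(seq_along(lst), lengths(lst))      # list element each string came from
hit  <- df$df %in% flat                        # TRUE for rows found in lst

df$code <- NA_character_                       # unmatched rows stay NA
df$code[hit] <- paste0("G", grp[match(df$df[hit], flat)])
df
```

This assumes each string appears in at most one list element; if a string appeared in several, match() would pick the first one.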


Here's one way:

names(lst) <- paste0('G', seq_along(lst))
transform(df, code=with(stack(lst), ind[match(df, values)]))
#         df code
# 1      key   G1
# 2  sumatra   G2
# 3     band <NA>
# 4   cattle <NA>
# 5     camp <NA>
# 6     sled   G3
# 7     page   G4
# 8     wire <NA>
# 9      key   G1
# 10    card   G3
# 11     cap   G2
# 12    page   G4
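To see why this works, it may help to look at the intermediate result: stack() turns the named list into a two-column data frame (values, ind), and match() finds the row of each string in it. The sketch below spells that out and uses a named-vector lookup as an alternative to match(); the names long and lookup are made up for this illustration:

```r
lst <- list(c("key", "parking", "velvet"), c("sumatra", "cap"),
            c("sled", "card"), c("notice", "piece", "page"))
df <- data.frame(df = c("key", "sumatra", "band", "cattle", "camp", "sled",
                        "page", "wire", "key", "card", "cap", "page"),
                 stringsAsFactors = FALSE)

names(lst) <- paste0("G", seq_along(lst))

long <- stack(lst)   # one row per string: 'values' is the string, 'ind' its group
# Named character vector: names are the strings, values are the group labels
lookup <- setNames(as.character(long$ind), long$values)

# Indexing by name returns NA for strings that are not in the lookup
df$code <- unname(lookup[df$df])
df
```

Unlike the transform() call above, which returns ind as a factor, this variant stores code as a plain character column.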

Here is an approach using a package:




And one more for good measure:

lst <- list(c("key", "parking", "velvet"), c("sumatra", "cap"), 
            c("sled", "card"), c("notice", "piece", "page"))
d <- c("key", "sumatra", "band", "cattle", "camp", 
        "sled", "page", "wire", "key", "card", "cap", "page")
DF <- data.frame(d, stringsAsFactors=FALSE)
