str_extract problem: missing patterns
I have a list of colors that I am trying to extract from a sample data set. It seems to miss some colors while finding others.
color_list <- c("gray", "brown", "green", "plum", "mist", "forest", "sienna", "grape", "ruby", "emerald", "copper",
"silver", "gold", "blue")
str_extract(df, fixed(color_list, ignore_case = TRUE))
[1] "GRAY" NA NA NA NA NA NA NA NA NA NA "silver" "GOLD" "blue"
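One plausible reading of the output above: it has 14 entries, one per color in `color_list`, rather than one per row, which suggests each pattern was tested on its own instead of every color being tried against every row. `str_extract()` is vectorised over both `string` and `pattern` and pairs them element by element. The sketch below reproduces that pairing on a small hypothetical vector (`rows` and `colors` are illustrative names, not taken from the post) and compares it with a single alternation pattern.

```r
library(stringr)

# Hypothetical three-row sample, all lowercase to keep case out of the picture
rows   <- c("E:~ ## blue flash ##", "Ssilver mirror", "Tsilver flash mirror")
colors <- c("silver", "gold", "blue")

# Pairwise matching: rows[1] vs "silver", rows[2] vs "gold", rows[3] vs "blue".
# None of the pairs line up, so every result is NA.
pairwise <- str_extract(rows, fixed(colors))

# One combined pattern "silver|gold|blue" is tried against every row instead.
per_row <- str_extract(rows, paste(colors, collapse = "|"))

print(pairwise)  # NA NA NA
print(per_row)   # "blue" "silver" "silver"
```

With the combined pattern the result has one entry per row, which is usually what is wanted when the goal is a new column in a data frame.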
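Is it also possible to do "fuzzy" matching with str_extract? There are some misspelled colors in the data.

The code below outputs a data frame with an added column holding the extracted color. I included the tolower() function to convert the example text to all lowercase before matching. If you need "fuzzy" matching, you may need to look into regular expressions.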
For example: str_extract_all(df, paste(color_list, collapse = "|"))
@M-M This code returns only (3) results, although there are at least (5) in this sample data. That is because it is case sensitive.
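The case-sensitivity point in the comment above can be checked with a small sketch. It assumes the matching rows are copied into a plain character vector (`rows` is an illustrative name) and wraps the combined pattern in stringr's `regex(..., ignore_case = TRUE)`, which keeps the original casing of each match.

```r
library(stringr)

# Rows copied from the sample data that mention a color in some casing
rows <- c("Tsilver flash mirror", "E:~ ## blue flash ##", "E:~ ## Silver Mirror #",
          "Ssilver mirror", "E:~ ## FORREST GRAY Xp", "ESILVER", "EGOLD")

color_list <- c("gray", "brown", "green", "plum", "mist", "forest", "sienna",
                "grape", "ruby", "emerald", "copper", "silver", "gold", "blue")
alternation <- paste(color_list, collapse = "|")

# Default regex matching is case sensitive, so only lowercase hits are found
case_sensitive <- str_extract_all(rows, alternation)

# ignore_case = TRUE also finds "Silver", "GRAY", "SILVER" and "GOLD"
case_insensitive <- str_extract_all(rows, regex(alternation, ignore_case = TRUE))

# str_extract_all() returns a list with one element per row; count all matches
n_sensitive   <- sum(lengths(case_sensitive))    # 3
n_insensitive <- sum(lengths(case_insensitive))  # 7
```

Note that the misspelled "FORREST" is still not matched by "forest" here; ignoring case does not help with spelling errors.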
structure(list(df = c("Tsilver flash mirror", "E:~ ADD FLASH FRONT MI",
"E:~", "E##T Color: G 15#3; MC", "E:~ ## PLEASE USE 8 BA", "E:~ ## blue flash ##",
"E:~ ## Silver Mirror #", "Ssilver mirror", "E:~ ## Treatment: Fee-",
"E:~Further Instruction", "E:~ ## FORREST GRAY Xp", "ESILVER",
"EGOLD")), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13"))
# dplyr supplies %>% and mutate(); stringr supplies str_extract()
library(dplyr)
library(stringr)

example <- structure(list(df = c("Tsilver flash mirror", "E:~ ADD FLASH FRONT MI",
"E:~", "E##T Color: G 15#3; MC", "E:~ ## PLEASE USE 8 BA", "E:~ ## blue flash ##",
"E:~ ## Silver Mirror #", "Ssilver mirror", "E:~ ## Treatment: Fee-",
"E:~Further Instruction", "E:~ ## FORREST GRAY Xp", "ESILVER",
"EGOLD")), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13"))

color_list <- c("gray", "brown", "green", "plum", "mist", "forest", "sienna", "grape", "ruby", "emerald", "copper",
                "silver", "gold", "blue")

# Lowercase each row, then extract the first color matched by the combined pattern
example %>%
  mutate(extract = str_extract(tolower(df), paste(color_list, collapse = "|")))
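For the misspelled colors, one possible approach (swapped in here as an illustration, not part of the answer above and not a stringr feature) is base R's approximate matcher `agrepl()`, which accepts a maximum edit distance. The sketch below is a minimal example under two assumptions: the names `rows`, `long_colors` and `hits` are hypothetical, and only longer color names are tested, since short names such as "gold" or "mist" are more prone to false positives at the same edit budget.

```r
# Three rows copied from the sample data; row 1 contains the misspelling "FORREST"
rows <- c("E:~ ## FORREST GRAY Xp", "E:~ ## blue flash ##", "ESILVER")

# Assumption: restrict fuzzy matching to longer names to limit accidental hits
long_colors <- c("forest", "sienna", "emerald", "copper", "silver")

# For each color, test every row while allowing at most one edit
# (insertion, deletion or substitution), ignoring case.
# The result is a logical matrix: one row per input string, one column per color.
hits <- sapply(long_colors, agrepl, x = rows, max.distance = 1, ignore.case = TRUE)

print(hits)
# "forest" matches row 1 ("FORREST" is one extra letter away);
# "silver" matches row 3 exactly; every other cell is FALSE.
```

The matrix can then be turned into a per-row label, for example by taking the first matching column for each row. The edit budget needs tuning against real data before relying on the results.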