R: Remove non-alphanumeric symbols from a string


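I have a string, and I want to remove all non-alphanumeric symbols from it and then put the words into a vector. So:

"This is a string.  In addition, this is a string!"

would become:

> stringVector1

"This","is","a","string","In","addition","this","is","a","string"

I have looked at grep(), but could not find an example that matches. Any suggestions?

Here is an example: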

> str <- "This is a string. In addition, this is a string!"
> str
[1] "This is a string. In addition, this is a string!"
> strsplit(gsub("[^[:alnum:] ]", "", str), " +")[[1]]
 [1] "This"     "is"       "a"        "string"   "In"       "addition" "this"     "is"       "a"       
[10] "string"  

Another way to handle this:

library(stringr)
text =  c("This is a string.  In addition, this is a string!")
str_split(str_squish((str_replace_all(text, regex("\\W+"), " "))), " ")
#[1] "This"     "is"       "a"        "string"   "In"       "addition" "this"     "is"       "a"        "string"  
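  • str_replace_all(text, regex("\\W+"), " "): finds runs of non-word characters and replaces each run with a space
  • str_squish(): trims the ends of the string and reduces repeated whitespace inside it
  • str_split(): splits the string into pieces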
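
To make the three steps above concrete, here is a minimal sketch that runs them one at a time and keeps the intermediate results. It assumes the stringr package is installed; the names step1, step2 and words are only illustrative and do not come from the answer.

```r
library(stringr)

text <- "This is a string.  In addition, this is a string!"

# Step 1: replace every run of non-word characters with a single space.
# Punctuation at the end of the string leaves a trailing space behind.
step1 <- str_replace_all(text, regex("\\W+"), " ")

# Step 2: trim the ends and collapse any repeated whitespace.
step2 <- str_squish(step1)

# Step 3: split on single spaces. str_split() returns a list,
# so take its first element to get a plain character vector.
words <- str_split(step2, " ")[[1]]

print(step1)
print(step2)
print(words)
```

One difference from the base R answer is worth keeping in mind: \W treats the underscore as a word character, so an underscore would survive this approach, whereas [^[:alnum:] ] would remove it.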

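I noticed that there is a space between the closing square brackets of the regular expression. What is it for?

@B.M.W. It keeps the spaces in the string so that there is something to split on afterwards.

Finally, I am not ashamed to use a regex in R: gsub("[^[:alnum:]=\\.]", "", "Oh, blah blah blah. Please be quiet!=0.42") is much better than calling gsub() several times to replace each punctuation mark separately.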
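
The role of that space is easiest to see by running the pattern with and without it. The following is a small base R sketch, not part of the original answers; the variable names kept, fused and chained are made up for illustration, and the chained version only handles the three punctuation marks that occur in this particular string.

```r
str <- "This is a string.  In addition, this is a string!"

# With the space inside the brackets, spaces survive the cleanup,
# so strsplit() still has something to split on.
kept <- gsub("[^[:alnum:] ]", "", str)
words <- strsplit(kept, " +")[[1]]

# Without the space, the spaces are removed too and the words fuse together.
fused <- gsub("[^[:alnum:]]", "", str)

# The alternative the last comment argues against: one gsub() call per
# punctuation mark. It gives the same result here, but every new symbol
# needs another call.
chained <- gsub("!", "", gsub(",", "", gsub(".", "", str, fixed = TRUE),
                              fixed = TRUE), fixed = TRUE)

print(words)
print(fused)
print(identical(kept, chained))
```

The " +" pattern in strsplit() matters as well: it treats the double space left behind after "string." as a single separator, so no empty strings end up in the vector.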