Regex: Splitting a string into multiple words_Regex_String_R_Split - Fatal编程技术网

Regex: Splitting a string into multiple words


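I have the following string:

library(stringr)
endvotes <- "Yes106No85EH2NT6ES0P1"
I know how to split out each of the values, for example:

# Keep everything before "No", then take what follows "Yes"
yes <- unlist(str_split(endvotes, "No"))[1]
yes <- as.integer(unlist(str_split(yes, "Yes"))[2])

yes
[1] 106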

You can use a regular expression like this one; each match contains the text in the first capture group and the value in the second:

([a-zA-Z]+)([0-9]+)
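Basically, this matches a run of letters followed by a run of digits. The parentheses are capture groups, which let you retrieve the values you need easily.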
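
The answer above does not say how to apply the pattern in R, so the following is only a minimal sketch, assuming base R's `regmatches()`, `gregexpr()` and `sub()`. The variable names `pairs`, `labels` and `counts` are illustrative and not part of the original answer.

```r
endvotes <- "Yes106No85EH2NT6ES0P1"
pattern <- "([a-zA-Z]+)([0-9]+)"

# Each full match is one label/count pair, e.g. "Yes106"
pairs <- regmatches(endvotes, gregexpr(pattern, endvotes))[[1]]

# Pull each capture group out of every pair with a backreference
labels <- sub(pattern, "\\1", pairs)
counts <- as.integer(sub(pattern, "\\2", pairs))

# Named integer vector: one count per label
votes <- setNames(counts, labels)
votes
```

Keeping the pairs together before extracting the groups means a label can never drift out of step with its count, which is the main advantage over splitting labels and counts separately.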



You can also try this regex:

strsplit(endvotes, split = "(?<=[A-Za-z])(?=[0-9])|(?<=[0-9])(?=[A-Za-z])", perl = TRUE)
## [[1]]
##  [1] "Yes" "106" "No"  "85"  "EH"  "2"   "NT"  "6"   "ES"  "0"   "P"   "1"

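There is no need to use a regex at all. Try the stri_split_charclass function from the stringi package, which splits a character vector by character class (such as digits, letters, or punctuation).

In that call, str is just the vector, and \p{N} and \p{L} are the classes to split on (N for numbers, L for letters). Set omit_empty to drop the "" empty strings.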

endvotes <- "Yes106No85EH2NT6ES0P1"

names <- strsplit(endvotes, "[[:digit:]]+")[[1]]
numbers <- strsplit(endvotes, "[[:alpha:]]+")[[1]][-1]

setNames(as.data.frame(t(as.numeric(numbers))), names)
#  Yes No EH NT ES P
#1 106 85  2  6  0 1
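
The two `strsplit()` calls above can be wrapped so the same logic is reusable for other strings of this shape. This is only a sketch: `parse_votes()` is a hypothetical helper name, not something from the original answers, and it assumes the string starts with a letter and strictly alternates letter runs and digit runs.

```r
# Hypothetical helper (not from the original answers): parse one
# "LabelCountLabelCount..." string into a named integer vector.
parse_votes <- function(x) {
  # Splitting on digit runs leaves the labels
  labels <- strsplit(x, "[[:digit:]]+")[[1]]
  # Splitting on letter runs leaves the counts; the first piece is ""
  # because the string starts with a letter, so drop it
  counts <- strsplit(x, "[[:alpha:]]+")[[1]][-1]
  # Guard against input that does not alternate labels and counts
  stopifnot(length(labels) == length(counts))
  setNames(as.integer(counts), labels)
}

result <- parse_votes("Yes106No85EH2NT6ES0P1")
result
```

Returning integers rather than strings keeps the counts usable for arithmetic, and the length check fails early on malformed input instead of silently misaligning labels and counts.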
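# Split at every letter/digit boundary, then pair up the pieces
S <- strsplit(endvotes, split = "(?<=[A-Za-z])(?=[0-9])|(?<=[0-9])(?=[A-Za-z])", perl = TRUE)[[1]]
# Even positions hold the counts, odd positions hold the labels
res <- data.frame(t(S[seq_along(S) %% 2 == 0]))
names(res) <- S[seq_along(S) %% 2 == 1]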
res
##   Yes No EH NT ES P
## 1 106 85  2  6  0 1  
# Extract the digit runs as values and the letter runs as column names
res <- data.frame(t(regmatches(endvotes, gregexpr("[0-9]+", endvotes))[[1]]))
names(res) <- regmatches(endvotes, gregexpr("[A-Za-z]+", endvotes))[[1]]
res
##   Yes No EH NT ES P
## 1 106 85  2  6  0 1
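library(stringi)
# Splitting on digits (\p{N}) leaves the labels
stri_split_charclass(str = endvotes, "\\p{N}", omit_empty = TRUE)[[1]]
## [1] "Yes" "No"  "EH"  "NT"  "ES"  "P"
# Splitting on letters (\p{L}) leaves the counts
stri_split_charclass(str = endvotes, "\\p{L}", omit_empty = TRUE)[[1]]
## [1] "106" "85"  "2"   "6"   "0"   "1"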