Non-alphanumeric characters in R
For uppercase letters, lowercase letters, and the 10 digits, I can generate vectors containing all the letters or all 10 digits, as follows:
A <- LETTERS[1:26]  # R indexes from 1; an index of 0 is silently dropped
B <- letters[1:26]
C <- seq(0, 9)
This may be useful: the ASCII character set is arranged so that characters of a similar type (letters, etc.) are grouped together.
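Because the similar character types sit in contiguous blocks, the non-alphanumeric printable characters can be picked out by code range alone. The sketch below assumes plain 7-bit ASCII and leaves out the space (code 32); the variable names are only illustrative.

```r
# Printable ASCII layout: digits are 48-57, uppercase 65-90, lowercase 97-122.
# Everything else between 33 and 126 is a symbol or punctuation mark.
symbol_codes <- c(33:47, 58:64, 91:96, 123:126)

# multiple = TRUE returns one single-character string per byte
symbols <- rawToChar(as.raw(symbol_codes), multiple = TRUE)

length(symbols)   # 32 characters in total
utf8ToInt("A")    # 65, the first code of the uppercase block
```

Adding 32 to `symbol_codes` would include the space as well.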
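This is a bit verbose, and there may be a better website to use (and better ways to get the same result), but the characters can be scraped from a web page with the XML and RCurl packages, as in the library(XML); library(RCurl) snippet further down.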
This answer is just for fun: list the characters you want, then use strsplit to generate your vector.
> D <- strsplit('!"#$%&\'()*+,-./\\:;<=>?@[]^_`{|}~', '(?=.)', perl=T)[[1]]
## [1] "!" "\"" "#" "$" "%" "&" "'" "(" ")" "*" "+" "," "-" "." "/"
## [16] "\\" ":" ";" "<" "=" ">" "?" "@" "[" "]" "^" "_" "`" "{" "|"
## [31] "}" "~"
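Here is another option: generate all the ASCII characters, then use a regular expression to filter out everything that is not punctuation.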
ascii <- rawToChar(as.raw(0:127), multiple=TRUE)
ascii[grepl('[[:punct:]]', ascii)]
# [1] "!" "\"" "#" "$" "%" "&" "'" "(" ")" "*" "+" "," "-" "." "/" ":" ";" "<" "=" ">" "?" "@"
# [23] "[" "\\" "]" "^" "_" "`" "{" "|" "}" "~"
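Hi @RichardScriven, sorry, I didn't really understand it.
If you need all the ASCII characters, rawToChar(as.raw(1:127), multiple=T) should work. It is still unclear how exactly you chose your list. Many of the characters are non-printable. It also depends on your particular encoding: extended code pages may contain more characters, and encodings such as UTF-8 define many more character codes. What exactly are you trying to do? If you want to store a few of these characters in a vector, you need to escape them (using "\\").
What I want is rawToChar(as.raw(c(32:47, 58:64, 91, 93:96, 123:126)), multiple=T).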
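The remark about escaping can be illustrated with a short sketch. It assumes nothing beyond base R; the vector name `x` is only illustrative. An escaped backslash or double quote is written with two characters in source code but is stored as a single character.

```r
# A backslash and a double quote must be escaped inside a double-quoted literal
x <- c("\\", "\"", "'")

nchar(x)             # each element is exactly one character long
cat(x, sep = "\n")   # cat() writes the stored characters, without escapes

# The escaped literal is the same string as the one built from ASCII code 92
identical(x[1], rawToChar(as.raw(92)))
```

print() shows the escaped form ("\\"), which is why the outputs in this thread display two backslashes for a single character.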
library(XML); library(RCurl)
doc <- htmlParse(getURL("https://wci.llnl.gov/codes/basis/manual/node161.html"))
xp <- xpathSApply(doc, "//tr/td", xmlValue, trim = TRUE)
xp[nzchar(xp) & nchar(xp) == 1]
# [1] "!" "[" "%" "," "]" "&" "-" "|" "'" "." "=" "~" "("
# [14] "/" ")" "*" "=" "{" "?" "`" "}" "@" ":" ";" "^" " "
> URL <- "http://datadebrief.blogspot.com/2011/03/ascii-code-table-in-r.html"
> r <- readLines(URL, warn = FALSE)[780:874]
> s <- sapply(strsplit(r, "\\s+"), "[", 1)
> s[!s %in% c(letters, LETTERS, 0:9)]
# [1] "" "!" "\"" "#" "$" "%" "&" "'" "("
# [10] ")" "*" "+" "," "-" "." "/" ":" ";"
# [19] "<" "=" ">" "?" "@" "[" "\\\\" "]" "^"
# [28] "_" "`" "{" "|" "}" "~"
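The same %in% filter can also be applied without downloading anything. This is a minimal sketch, assuming the printable ASCII range 33:126 is the set of interest; `printable`, `alnum`, and `non_alnum` are illustrative names, not part of the answers above.

```r
# All printable ASCII characters except the space
printable <- rawToChar(as.raw(33:126), multiple = TRUE)

# Letters and digits, with the digits converted to character explicitly
alnum <- c(letters, LETTERS, as.character(0:9))

# Keep only what is neither a letter nor a digit
non_alnum <- printable[!printable %in% alnum]
length(non_alnum)   # 94 printable characters minus 62 alphanumerics
```

This avoids the dependency on a web page whose line numbers (780:874 above) may change over time.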
> D <- gsub('[^\\pP\\pS]', '', rawToChar(as.raw(1:127), multiple=T), perl=T)
> D[D != ""]
## [1] "!" "\"" "#" "$" "%" "&" "'" "(" ")" "*" "+" "," "-" "." "/"
## [16] ":" ";" "<" "=" ">" "?" "@" "[" "\\" "]" "^" "_" "`" "{" "|"
## [31] "}" "~"