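How to split strings based on a character using R

I have a set of strings in which I need to search for words that have a period in the middle. Some of the strings are concatenated, so I need to break them into words so that I can filter the words that contain a dot.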

Below is a sample of what I have and what I have got so far:

 punctToRemove <- c("[^[:alnum:][:space:]._]")

 s <- c("get_degree('TITLE',PERS.ID)",
        "CLIENT_NEED.TYPE_CODe=21",
        "2.1.1Report Field Level Definition",
        "The user defined field. The user will validate")
Below is a sample of what I want:

[1] "get_degree ( ' TITLE ' , PERS.ID ) "          # spaces before and after the "(", "'", ",", and ")"
[2] "CLIENT_NEED.TYPE_CODe = 21"                   # spaces before and after the "=" sign; dot and underscore remain untouched
[3] "2.1.1Report Field Level Definition"           # no changes 
[4] "The user defined field. The user will validate" # no changes
For this example, what I have so far is:

    library(stringr)
    s <- str_replace_all(s, "\\)", " \\) ")
    s <- str_replace_all(s, "\\(", " \\( ")
    s <- str_replace_all(s, "=", " = ")
    s <- str_replace_all(s, "'", " ' ")
    s <- str_replace_all(s, ",", " , ")
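
The five `str_replace_all()` calls above pad each character separately, so adjacent punctuation such as `('` ends up with two spaces between the characters. A possible alternative, sketched here in base R, pads every character of a class in one call through a backreference and then collapses runs of spaces. The names `padded` and `out` are introduced only for this illustration.

```r
s <- c("get_degree('TITLE',PERS.ID)",
       "CLIENT_NEED.TYPE_CODe=21")

# Surround each listed punctuation character with spaces;
# \\1 refers to the character matched by the capture group
padded <- gsub("([()=',])", " \\1 ", s)

# Collapse the double spaces left between adjacent punctuation characters
out <- gsub(" {2,}", " ", padded)
out
# expected: "get_degree ( ' TITLE ' , PERS.ID ) " and "CLIENT_NEED.TYPE_CODe = 21"
```

This keeps the dot and the underscore untouched because neither is in the character class.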

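We can use regex lookarounds, which match the zero-width position just after or just before one of the punctuation characters, and replace that position with a space: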

s1 <- gsub("(?<=['=(),])|(?=['(),=])", " ", s, perl = TRUE)
s1
#[1] "get_degree ( ' TITLE ' , PERS.ID ) "           
#[2] "CLIENT_NEED.TYPE_CODe = 21"                    
#[3] "2.1.1Report Field Level Definition"            
#[4] "The user defined field. The user will validate"

nchar(s1)
#[1] 35 26 34 46
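
To get from the spaced-out strings to the words that contain a dot, which was the original goal, one option is to split on whitespace and keep the tokens that have a period between two word characters. This is a minimal sketch in base R; `words` and `dotted` are names introduced here for illustration, and the pattern is an assumption about what "a period in the middle" means, so it deliberately skips a sentence-ending period such as `field.`.

```r
s <- c("get_degree('TITLE',PERS.ID)",
       "CLIENT_NEED.TYPE_CODe=21",
       "2.1.1Report Field Level Definition",
       "The user defined field. The user will validate")

# Space out the punctuation with the lookaround pattern shown above
s1 <- gsub("(?<=['=(),])|(?=['(),=])", " ", s, perl = TRUE)

# Split every string on runs of whitespace and flatten to one vector
words <- unlist(strsplit(trimws(s1), "[[:space:]]+"))

# Keep tokens that have a period between two word characters
dotted <- grep("[[:alnum:]_]\\.[[:alnum:]_]", words, value = TRUE)
dotted
# expected: "PERS.ID", "CLIENT_NEED.TYPE_CODe", "2.1.1Report"
```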

I updated the code to handle cases that contain vertical bars, for example the | | Client below. It is now:
gsub("(?…
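
One way to cover vertical bars, sketched under the assumption that they should be padded like the other characters, is to add `|` to both character classes of the lookaround pattern. Inside a bracket expression `|` is a literal character, so it needs no escaping. The input string `x` is made up for this illustration and does not come from the original post.

```r
# Hypothetical input containing a vertical bar (not from the original post)
x <- "TYPE|CODE=1"

# Same lookaround pattern as before, with | added to both character classes
x1 <- gsub("(?<=['=(),|])|(?=['(),=|])", " ", x, perl = TRUE)
x1
# expected: "TYPE | CODE = 1"
```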