R: How to rearrange elements in a string using regular expressions?

r regex string

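I have a string that encodes variable-length elements separated by commas. For example:

| A!BC, =!A.B, >!A.CA.C

and

% gsub("(\\&.*?,)(\\\\.*,)(\\=.*,)(\\.*,)", "\\1,\\2,\\3,\\4,\\5,")

I am not sure how to do this with a regular expression, since this is a complicated search-and-replace scenario. My initial thought was to split and rearrange, as you did. What I have provided may be overly complex and may not really solve your problem. I approached it by considering the different orders in which the string may be "reordered". So I apologize if it does not help.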

library(magrittr)  # provides the %>% pipe used below

#' The idea is to rearrange a string by providing an ordering rule.
#' @param strs Character vector containing the original data
#' @param rgx_order The individual prefix characters that define the rule,
#' in the order of the desired output
#' @param rgx_sprint The rule that every ordered element should follow; in
#' this example, "a punctuation mark or character followed by anything,
#' stopping at either a comma or the end of the line, but not including
#' the separator"
#' 
f <- function(strs = NULL, rgx_order = NULL, rgx_sprint = "\\%s(.*?)((?=,)|$)"){
        # one pattern per prefix character, in the desired output order
        vrgx <- sprintf(rgx_sprint, rgx_order)
        fx <- function(str){
            stringi::stri_extract_all_regex(
                str, vrgx, omit_no_match = TRUE, simplify = TRUE
            ) %>% as.character() %>% .[mapply(nchar, .) > 0] %>% 
                stringi::stri_join(collapse = ",")
        }
        sapply(strs, fx, USE.NAMES = FALSE)
}

> chars <- c("|A!B!C,=!A!B,>!A!C,<A!C", "|A!B!C,%!B!C,%!BC,%A!B,&AB")
> new_order <-  c('&','|','=','<','>','%')

> f(strs = chars, rgx_order = new_order)
[1] "|A!B!C,=!A!B,<A!C,>!A!C"    "&AB,|A!B!C,%!B!C,%!BC,%A!B"
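To make the default rule more concrete, here is a small sketch of what `rgx_sprint` expands to and what one of the resulting patterns picks out. It uses base R's `regmatches()` and `gregexpr()` rather than stringi, purely to keep the example dependency-free; that substitution and the names `pats`, `x` and `pct` are my own and not part of the answer.

```r
# Expand the default rule for three of the prefixes (one pattern per prefix).
rgx_sprint <- "\\%s(.*?)((?=,)|$)"
pats <- sprintf(rgx_sprint, c("&", "|", "%"))
print(pats)

# Apply the "%" pattern to one input string.
# perl = TRUE is required because the rule uses the (?=,) lookahead.
x <- "|A!B!C,%!B!C,%!BC,%A!B,&AB"
pct <- regmatches(x, gregexpr(pats[3], x, perl = TRUE))[[1]]
print(pct)
```

Each pattern matches from its prefix character up to, but not including, the next comma or the end of the string.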

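Looks like exactly what I was hoping for. I will benchmark it against my solution. Hmm, maybe not: I am not sure why it does not seem to work for

f(strs = "|ACB,%AB,| BA", rgx_order = new_order)

or

f(strs = "%AB#ACB,%AB#>1ac", rgx_order = new_order)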
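The two inputs above are garbled in this copy of the thread, but they appear to repeat a prefix (two `|` elements) and to contain an order character (`>`) inside an element. One likely explanation for the failures: with `simplify = TRUE`, stringi returns a matrix with one row per pattern, and `as.character()` flattens it column by column, so a second match for one prefix ends up after the first match of every other prefix. The patterns are also not anchored to the start of an element, so a `>` in the middle of an element can be matched as well. A minimal sketch of the split-and-rearrange alternative mentioned earlier, in base R only, could look like this. The helper name `reorder_elements` is hypothetical and not from the thread, and it assumes the ordering key is always the first character of each element.

```r
# Hypothetical helper (not from the thread): reorder comma-separated elements
# by the position of each element's first character in `ord`.
reorder_elements <- function(strs, ord) {
  one <- function(s) {
    parts <- strsplit(s, ",", fixed = TRUE)[[1]]
    # Rank of each element's leading character; unknown prefixes sort last.
    rank <- match(substr(parts, 1, 1), ord, nomatch = length(ord) + 1L)
    # order() leaves ties in their original order, so elements that share
    # a prefix keep their input order.
    paste(parts[order(rank)], collapse = ",")
  }
  vapply(strs, one, character(1), USE.NAMES = FALSE)
}

new_order <- c("&", "|", "=", "<", ">", "%")
chars <- c("|A!B!C,=!A!B,>!A!C,<A!C", "|A!B!C,%!B!C,%!BC,%A!B,&AB")
res <- reorder_elements(chars, new_order)
print(res)
```

Unlike `f`, this sketch keeps elements whose prefix is not listed in `ord` (they are placed at the end) instead of dropping them, which may or may not be what is wanted.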
