R: How to extract bracketed content from a string into new columns

r regex

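I need to extract information from a string into separate columns. More specifically, I need the contents of the square brackets in the string.

Suppose I have the string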

a <- "2xExp [K89; K96]; 1xExp [N-Term]; 2xNum [S87(100); S93(100)]"

My current approach is:

  pattern <- "(\\[.*?\\])"
  hits <- gregexpr(pattern, a)
  matches <- regmatches(a, hits)
  unlisted_matches <- unlist(matches)

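This does give me the bracketed groups, but it does not split the items inside them. For some reason, I cannot find an effective way to separate the items on ";".

You can use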

a <- "2xExp [K89; K96]; 1xExp [N-Term]; 2xNum [S87(100); S93(100)]"
pattern <- "(?:\\G(?!^)(?:\\([^()]*\\))?\\s*;\\s*|\\[)\\K[^][;()]+"
matches <- regmatches(a, gregexpr(pattern, a, perl=TRUE))
unlisted_matches <- paste0("[", unlist(matches),"]")
unlisted_matches
## => [1] "[K89]"    "[K96]"    "[N-Term]" "[S87]"    "[S93]"
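The pattern is dense, so it may help to read it piece by piece. The sketch below is one reading of it: it rebuilds the same pattern string from commented fragments (the names `pattern_parts`, `pattern_commented` and `items` are illustrative, not from the original answer) and runs it on the same input.

```r
a <- "2xExp [K89; K96]; 1xExp [N-Term]; 2xNum [S87(100); S93(100)]"

# Each fragment is a piece of the original pattern; comments give one reading of it.
pattern_parts <- c(
  "(?:",                 # start of the "where may an item begin?" group
  "\\G(?!^)",            #   either: right where the previous match ended (but not at string start)
  "(?:\\([^()]*\\))?",   #     optionally skip a parenthesised suffix such as "(100)"
  "\\s*;\\s*",           #     then the ';' separator with optional whitespace
  "|",
  "\\[",                 #   or: an opening square bracket starting a new group
  ")",
  "\\K",                 # discard everything matched so far from the reported match
  "[^][;()]+"            # the item: one or more chars that are not ] [ ; ( )
)
pattern_commented <- paste0(pattern_parts, collapse = "")

# perl = TRUE is required because \G and \K are PCRE features
items <- unlist(regmatches(a, gregexpr(pattern_commented, a, perl = TRUE)))
items
```

Because `\G` only matches at the position where the previous match ended, the `;` branch can only continue inside a bracket group that has already been entered; the `;` separators between groups are never adjacent to a previous match and are skipped.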

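Here is an approach using the tidyverse:

a %>%
  # extract the text between the square brackets, without keeping the brackets, and unlist
  str_extract_all((?…

Doing this in several steps, rather than with one monster regex pattern, is cheaper both technically and mentally. Your current pattern is a good first step. However, I took it as a challenge, and this is the best I could do (in a reasonable amount of time), using the PCRE \G operator (strictly speaking, it is an anchor, like ^ or $).

If the format of the string is consistent, a single regex like the one in my answer is sufficient.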
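As a minimal sketch of the multi-step route, assuming base R only and assuming that parenthesised suffixes such as "(100)" should be dropped (as in the single-regex output), the steps could look like this. The variable names `groups`, `inner` and `items` are illustrative.

```r
a <- "2xExp [K89; K96]; 1xExp [N-Term]; 2xNum [S87(100); S93(100)]"

# Step 1: pull out each bracketed group (same idea as the pattern in the question)
groups <- unlist(regmatches(a, gregexpr("\\[.*?\\]", a)))

# Step 2: drop the surrounding square brackets
inner <- gsub("^\\[|\\]$", "", groups)

# Step 3: split every group on ';' and trim the surrounding whitespace
items <- trimws(unlist(strsplit(inner, ";", fixed = TRUE)))

# Step 4 (assumption): remove a trailing parenthesised suffix such as "(100)"
items <- sub("\\([^()]*\\)$", "", items)

items
```

Each step uses a short, easily checked pattern, which is the trade-off against the single PCRE pattern: more lines, but nothing that needs `perl = TRUE`. Wrapping the result as `paste0("[", items, "]")` restores the bracketed form if that is preferred.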
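The pattern from the question returns each bracketed group whole, without splitting the items inside it:

  pattern <- "(\\[.*?\\])"
  hits <- gregexpr(pattern, a)
  matches <- regmatches(a, hits)
  unlisted_matches <- unlist(matches)

  ## => "[K89; K96]" "[N-Term]" "[S87(100); S93(100)]"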
