Regex: a regular expression to match balanced parentheses


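I need a regular expression to select all the text between two outer parentheses.

Example:

some text(text here(possible text)text(possible text(more text)))end text

Result:

(text here(possible text)text(possible text(more text)))

One pattern that does this:

[^\(]*(\(.*\))[^\)]*

[^\(]* matches everything at the start of the string that is not an opening parenthesis, (\(.*\)) captures the required substring enclosed in parentheses, and [^\)]* matches everything at the end of the string that is not a closing parenthesis. Note that this expression does not attempt to balance the parentheses; a simple parser is better suited to that.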
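
As a quick check of the pattern above, here is a minimal Python sketch using the standard re module. The helper name outer_span is only illustrative and does not come from the original answer.

```python
import re

# Greedy pattern from the answer: group 1 runs from the first "(" to the last ")".
PATTERN = re.compile(r"[^\(]*(\(.*\))[^\)]*")

def outer_span(text):
    """Return the text from the first '(' to the last ')', or None if there is none."""
    m = PATTERN.match(text)
    return m.group(1) if m else None

sample = "some text(text here(possible text)text(possible text(more text)))end text"
print(outer_span(sample))
```

Because .* is greedy, the capture always ends at the last closing parenthesis, whether or not the parentheses in between are balanced.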

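Regular expressions are the wrong tool for this job because you are dealing with nested structures, i.e. recursion. A pattern such as the following only grabs everything between the first opening and the last closing parenthesis; it does not check that they balance:

(?<=\().*(?=\))

But there is a simple algorithm to do this, which I described in more detail in another answer. The gist is to write code that scans the string while keeping a counter of the opening parentheses that have not yet been matched by a closing parenthesis. When that counter returns to zero, you know you have reached the final closing parenthesis.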
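
The counter-based scan can be sketched in a few lines of Python. This is only one possible reading of the description, under the assumption that the first balanced group should be returned together with its outer delimiters; the function name first_balanced_group is hypothetical.

```python
def first_balanced_group(text, open_ch="(", close_ch=")"):
    """Return the first balanced group, including its outer delimiters, or None."""
    depth = 0      # opening delimiters not yet matched by a closing one
    start = None   # index where the current outermost group began
    for i, ch in enumerate(text):
        if ch == open_ch:
            if depth == 0:
                start = i
            depth += 1
        elif ch == close_ch and depth > 0:
            depth -= 1
            if depth == 0:
                # The counter is back at zero: this is the final closing delimiter.
                return text[start:i + 1]
    return None

sample = "some text(text here(possible text)text(possible text(more text)))end text"
print(first_balanced_group(sample))
```

A closing delimiter seen while the counter is zero is simply skipped here; a stricter version could treat it as an error instead.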
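The answer depends on whether you need to match balanced sets of parentheses, or merely the span from the first opening to the last closing parenthesis in the input text.

If you need to match nested parentheses, you need something more than regular expressions, such as the counter-based scan described above.
If it is just the first opening to the last closing parenthesis, a greedy pattern such as [^\(]*(\(.*\))[^\)]* is enough.

Decide what you want to happen with input like this:

abc ( 123 ( foobar ) def ) xyz ) ghij

You need to decide what your code should match in this case.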
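
One way to make that decision explicit is to check first whether the input is balanced at all. The sketch below is an illustration, not part of the original answer: the hypothetical helper unmatched_parens reports the positions of parentheses that have no partner, using a stack of indices.

```python
def unmatched_parens(text):
    """Return the indices of parentheses that have no partner."""
    open_stack = []   # indices of "(" still waiting for a ")"
    stray = []        # indices of ")" that closed nothing
    for i, ch in enumerate(text):
        if ch == "(":
            open_stack.append(i)
        elif ch == ")":
            if open_stack:
                open_stack.pop()
            else:
                stray.append(i)
    return sorted(open_stack + stray)

text = "abc ( 123 ( foobar ) def ) xyz ) ghij"
print(unmatched_parens(text))
```

For the sample input the only unmatched character is the final closing parenthesis. The caller can then choose to reject the input, ignore the stray character, or fall back to first-to-last matching.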
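It is actually possible to do this with .NET regular expressions, but it is not trivial, so read carefully.
There is a good article on the subject, and you may also need to read up on .NET regular expressions.
Angle brackets <> are used here because they do not require escaping.
The regular expression looks like this: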
<
[^<>]*
(
    (
        (?<Open><)
        [^<>]*
    )+
    (
        (?<Close-Open>>)
        [^<>]*
    )+
)*
(?(Open)(?!))
>

This is the definitive regex:
\(
(?<arguments> 
(  
  ([^\(\)']*) |  
  (\([^\(\)']*\)) |
  '(.*?)'

)*
)
\)

Note that '(pip' is correctly handled as a string.
(Tested in Regulator.)

Using Ruby (1.9.3 or later), the regular expression is:
/(?<match>\((?:\g<match>|[^()]++)*\))/

I wrote a small JavaScript library called balanced to help with this task. You can use it like this:
balanced.matches({
    source: source,
    open: '(',
    close: ')'
});

You can even do replacements:
balanced.replacements({
    source: source,
    open: '(',
    close: ')',
    replace: function (source, head, tail) {
        return head + source + tail;
    }
});

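There is also a more complex, interactive example.

I want to add this answer for quick reference. Feel free to update it.

.NET Regex using balancing groups.

PCRE, using a recursive pattern without alternation: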
\((?:[^)(]*(?R)?)*+\)

Or, unrolled for performance:
\([^)(]*+(?:(?R)[^)(]*)*+\)

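The pattern is pasted in at (?R), which represents (?0).

This works in Perl, PHP, Notepad++, R (with perl=TRUE) and Python (the third-party regex package, with (?V1) for Perl behaviour).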

Ruby, using subexpression calls.
With Ruby 2.0, \g<0> can be used to call the full pattern:
\((?>[^)(]+|\g<0>)*\)

JS, Java and other regex flavors without recursion, up to two levels of nesting:
\((?:[^)(]+|\((?:[^)(]+|\([^)(]*\))*\))*\)

Deeper nesting has to be added to the pattern by hand.

To fail faster on unbalanced parentheses, drop the + quantifier.
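
Python's built-in re module has no recursion either, so it falls into this category. The following sketch applies the fixed-depth pattern above with re.findall; the name find_groups is hypothetical, and the second call illustrates what happens when the input nests deeper than the pattern allows.

```python
import re

# Fixed-depth pattern: an outer pair plus at most two nested levels inside it.
TWO_LEVELS = re.compile(r"\((?:[^)(]+|\((?:[^)(]+|\([^)(]*\))*\))*\)")

def find_groups(text):
    """Return all non-overlapping matches of the fixed-depth pattern."""
    return TWO_LEVELS.findall(text)

print(find_groups("Extract (a(b)c) and ((d)f(g))"))
# Nested one level too deep: the outermost pair cannot match,
# so the search silently settles for an inner group instead.
print(find_groups("(1(2(3(4))))"))
```

The pattern does not fail on deeper input; it returns a shorter match, so it is only safe when the maximum nesting depth is known in advance.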

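Java: an interesting example.

Beyond that, there are other regex flavors that support recursive constructs.

Lua
Use %b() (%b{} / %b[] for curly braces / square brackets):

for s in string.gmatch("Extract (a(b)c) and ((d)f(g))", "%b()") do print(s) end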

Raku (formerly Perl 6):
Non-overlapping multiple balanced parentheses matches:
my regex paren_any { '(' ~ ')' [ <-[()]>+ || <&paren_any> ]* }
say "Extract (a(b)c) and ((d)f(g))" ~~ m:g/<&paren_any>/;
# => (｢(a(b)c)｣ ｢((d)f(g))｣)

Overlapping multiple balanced parentheses matches (the :ov adverb):
say "Extract (a(b)c) and ((d)f(g))" ~~ m:ov:g/<&paren_any>/;
# => (｢(a(b)c)｣ ｢(b)｣ ｢((d)f(g))｣ ｢(d)｣ ｢(g)｣)

Sample usage:
String s = "some text(text here(possible text)text(possible text(more text)))end text";
List<String> balanced = getBalancedSubstrings(s, '(', ')', true);
System.out.println("Balanced substrings:\n" + balanced);
// => [(text here(possible text)text(possible text(more text)))]

You need the first and the last parenthesis. Use something like this:
str.indexOf('(') gives you the first occurrence.
str.lastIndexOf(')') gives you the last one.
So the substring you need, including both parentheses, is:
String searchedString = str.substring(str.indexOf('('), str.lastIndexOf(')') + 1);

// Customizable non-regex approach: returns every top-level balanced substring.
// Requires java.util.ArrayList and java.util.List.
public static List<String> getBalancedSubstrings(String s, char markStart,
                                                 char markEnd, boolean includeMarkers) {
    List<String> subTreeList = new ArrayList<String>();
    int level = 0;                // current nesting depth
    int lastOpenDelimiter = -1;   // start index of the current top-level group
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        if (c == markStart) {
            level++;
            if (level == 1) {
                lastOpenDelimiter = (includeMarkers ? i : i + 1);
            }
        } else if (c == markEnd) {
            if (level == 1) {
                // Closing the outermost delimiter completes one balanced substring.
                subTreeList.add(s.substring(lastOpenDelimiter, (includeMarkers ? i + 1 : i)));
            }
            if (level > 0) level--;
        }
    }
    return subTreeList;
}

"""
Here is a simple python program showing how to use regular
expressions to write a paren-matching recursive parser.

This parser recognises items enclosed by parens, brackets,
braces and <> symbols, but is adaptable to any set of
open/close patterns.  This is where the re package greatly
assists in parsing. 
"""

import re


# The pattern below recognises a sequence consisting of:
#    1. Any characters not in the set of open/close strings.
#    2. One of the open/close strings.
#    3. The remainder of the string.
# 
# There is no reason the opening pattern can't be the
# same as the closing pattern, so quoted strings can
# be included.  However quotes are not ignored inside
# quotes.  More logic is needed for that....


pat = re.compile(r"""
    ( .*? )
    ( \( | \) | \[ | \] | \{ | \} | \< | \> |
                           \' | \" | BEGIN | END | $ )
    ( .* )
    """, re.X)

# The keys to the dictionary below are the opening strings,
# and the values are the corresponding closing strings.
# For example "(" is an opening string and ")" is its
# closing string.

matching = { "(" : ")",
             "[" : "]",
             "{" : "}",
             "<" : ">",
             '"' : '"',
             "'" : "'",
             "BEGIN" : "END" }

# The procedure below matches string s and returns a
# recursive list matching the nesting of the open/close
# patterns in s.

def matchnested(s, term=""):
    lst = []
    while True:
        m = pat.match(s)

        if m.group(1) != "":
            lst.append(m.group(1))

        if m.group(2) == term:
            return lst, m.group(3)

        if m.group(2) in matching:
            item, s = matchnested(m.group(3), matching[m.group(2)])
            lst.append(m.group(2))
            lst.append(item)
            lst.append(matching[m.group(2)])
        else:
            raise ValueError("After <<%s %s>> expected %s not %s" %
                             (lst, s, term, m.group(2)))

# Unit test.

if __name__ == "__main__":
    for s in ("simple string",
              """ "double quote" """,
              """ 'single quote' """,
              "one'two'three'four'five'six'seven",
              "one(two(three(four)five)six)seven",
              "one(two(three)four)five(six(seven)eight)nine",
              "one(two)three[four]five{six}seven<eight>nine",
              "one(two[three{four<five>six}seven]eight)nine",
              "oneBEGINtwo(threeBEGINfourENDfive)sixENDseven",
              "ERROR testing ((( mismatched ))] parens"):
        print("\ninput", s)
        try:
            lst, s = matchnested(s)
            print("output", lst)
        except ValueError as e:
            print(str(e))
    print("done")

      0     1     1     0
-> S1 -> S2 -> S2 -> S2 ->S1

re.findall(r'\(.+\)', s)

push(number) map(test(a(a()))) bass(wow, abc)
$$(groups) filter({ type: 'ORGANIZATION', isDisabled: { $ne: true } }) pickBy(_id, type) map(test()) as(groups)

const parser = str => {
  let ops = []
  let method, arg
  let isMethod = true
  let open = []

  for (const char of str) {
    // skip whitespace
    if (char === ' ') continue

    // append method or arg string
    if (char !== '(' && char !== ')') {
      if (isMethod) {
        (method ? (method += char) : (method = char))
      } else {
        (arg ? (arg += char) : (arg = char))
      }
    }

    if (char === '(') {
      // nested parenthesis should be a part of arg
      if (!isMethod) arg = (arg || '') + char
      isMethod = false
      open.push(char)
    } else if (char === ')') {
      open.pop()
      // check end of arg
      if (open.length < 1) {
        isMethod = true
        ops.push({ method, arg })
        method = arg = undefined
      } else {
        arg += char
      }
    }
  }

  return ops
}

// const test = parser(`$$(groups) filter({ type: 'ORGANIZATION', isDisabled: { $ne: true } }) pickBy(_id, type) map(test()) as(groups)`)
const test = parser(`push(number) map(test(a(a()))) bass(wow, abc)`)

console.log(test)


[ { method: 'push', arg: 'number' },
  { method: 'map', arg: 'test(a(a()))' },
  { method: 'bass', arg: 'wow,abc' } ]

[ { method: '$$', arg: 'groups' },
  { method: 'filter',
    arg: '{type:\'ORGANIZATION\',isDisabled:{$ne:true}}' },
  { method: 'pickBy', arg: '_id,type' },
  { method: 'map', arg: 'test()' },
  { method: 'as', arg: 'groups' } ]

def extract_code(data):
    """Return (start, end) index pairs for each top-level {...} block in data."""
    start_pos = None   # index of the opening brace of the current block
    depth = 0          # number of braces opened but not yet closed
    code_snippets = []
    for i, v in enumerate(data):
        if v == '{':
            if depth == 0:
                start_pos = i
            depth += 1
        elif v == '}' and depth > 0:
            depth -= 1
            if depth == 0:
                # end index is exclusive, so data[start:end] is the whole block
                code_snippets.append((start_pos, i + 1))
                start_pos = None

    return code_snippets


'/(\((?>[^()]+|(?1))*\))/'

\s*\w+[(][^+]*[)]\s*