Java: Is there a way to match this with a regular expression instead of a loop?

java regex

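I have a function here that counts curly braces outside quotes and ignores curly braces inside quotes (it is passed a string and either '{' or '}', depending on my usage):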

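public static int countCurlyBraces(String s, char c) {
    // note: 'stack' is not declared in the snippet as posted
    int count = 0;
    for (char cr : s.toCharArray()) {
        if (cr == '"')
            if (stack.isEmpty())
                stack.push(cr);
            else
                stack.pop();
        if (stack.size() == 1 && cr == c)
            count++;
    }
    return StringUtil.countMatches(s, c) - count;
}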

I tried to replace it with a regular expression, but I'm having some trouble. Is this possible?

public static int countCurlyBraces(String s, char c) {
    Matcher a = Pattern.compile("\"(.*?[" + c + "])(.*?\")").matcher(s);
    int count = 0;
    while (a.find())
        count++;
    return StringUtil.countMatches(s, c) - count;
}

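A sample string I use for testing is:

sdfg"srfg{rmjy}rmyrmy{rymundh"ecfvr{cerv}fes{dc"cf2234TC$#ct234"etw243T#$c"nhg

This should return a count of 2, ignoring the two braces contained in quotes. The regex version treats all the braces as being inside quotes and outputs 0.

The document looks like this: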

LOCALE
user="XXXXXXX" time=1561234682/* "26-Jun-2019 23:00:03" */
{
  LOCALE="XXXXXXX"
}
SITE NAME="XxxXXxxx"
 user="XXXXXX" time=1568532503/* "26-Jun-2019 23:00:03" */
{
  SYSTEM_NAME="XXX-NNNNN"
  SYSTEM_IDENTIFIER="{XXXX-XXXX-XXX_XXX-XX}"
  SYSTEM_ID=NNNNN
  SYSTEM_ZONE_NAME="XXXXXX"
  DEFAULT_COMMUNICATION_TYPE=REDUNDANT
  IP_ADDR_AUTO_GEN=T
  PP_LAD="aGx{4"
  PVQ_LIMIT=0.5
  BCK_LIMIT=0.3
  MNN_LIMIT=0.1
  COMPANY_NAME=""
  DISPLAY_VERSION_CONTROL_ENABLED=F
}

A loop is probably more CPU-efficient. But here I would use a two-stage regex:

String input="sdfg\"srfg{rmjy#\"rmyrmy{rymundh\"ecfvr{cerv#\"fes{dc\"cf2234TC@$#ct234\"etw243T@#$c\"nhg";


input=input.replaceAll("\"[^\"]*\"", ""); // becomes sdfgrmyrmy{rymundhfes{dcetw243T@#$c"nhg

input=input.replaceAll("[^{]", ""); //becomes {{

return input.length();//2

The second regex can use the actual character that was passed in; as long as you restrict it to { and }, it should work:

input=input.replaceAll("[^"+c+"]", "");

If we merge the two regexes, it becomes less readable, but it is a single line:

input=input.replaceAll("\"[^\"]*\"|[^"+c+"]", "");
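The same "skip quoted runs, keep braces" idea can also be sketched without building intermediate strings, by scanning with a precompiled pattern and counting matches. This is only an illustration under assumptions: the class name `RegexBraceCounter`, the method name `countOutsideQuotes`, and the `TOKEN` constant are hypothetical and not from the original post, and `c` is assumed to be '{' or '}'.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexBraceCounter {
    // Alternation: either a complete quoted run (consumed and ignored),
    // or a single curly brace, captured in group 1.
    private static final Pattern TOKEN = Pattern.compile("\"[^\"]*\"|([{}])");

    // Hypothetical helper: counts occurrences of c ('{' or '}') outside complete quoted runs.
    public static int countOutsideQuotes(String s, char c) {
        Matcher m = TOKEN.matcher(s);
        int count = 0;
        while (m.find()) {
            String brace = m.group(1); // null when the match was a quoted run
            if (brace != null && brace.charAt(0) == c) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "sdfg\"srfg{rmjy#\"rmyrmy{rymundh\"ecfvr{cerv#\"fes{dc\"cf2234TC@$#ct234\"etw243T@#$c\"nhg";
        System.out.println(countOutsideQuotes(input, '{')); // prints 2
    }
}
```

As with the replaceAll version, a quote that is never closed is not treated as opening a quoted run, so braces after it would still be counted. Compiling the pattern once avoids recompiling it for every line, which may matter when the method is called very often.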

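Your method is a very roundabout way to achieve what you want, and it is quite inefficient.

First, you iterate over the string and count the matching characters inside quotes, then you iterate over the whole string again, count all matching characters, and subtract the number found inside quotes... why? Instead, just count the characters outside quotes directly.

Second, by calling s.toCharArray() you are essentially keeping a duplicate copy of the data and doubling the string's memory footprint; instead, just access its data through charAt.

Third, there is no need for a stack to track whether you are inside quotes; just flip a boolean.

Here is your method with my annotations: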

public static int countCurlyBraces(String s, char c) {
    Deque<Character> stack = ...; // I'm assuming 'stack' is some kind of Deque
    int count = 0;
    // doubling memory usage of the string by copying the chars into another array with 's.toCharArray()'
    // for each character in that string...
    for (char cr : s.toCharArray()) {
        // using a stack to keep track if you are inside quotes? just flip a boolean instead
        if (cr == '"')
            if (stack.isEmpty())
                stack.push(cr);
            else
                stack.pop();

        // if inside quotes and the character matches the target, then count it..
        // I thought you wanted to count the characters outside the quotes?
        if (stack.size() == 1 && cr == c)
            count++;
    }

    // iterate through the whole string again and count ALL the characters
    // then subtract the number inside the strings from the total to get the number outside strings
    return StringUtil.countMatches(s, c) - count;
}
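The three points above could be combined into a single pass, sketched below. This is a minimal illustration rather than code from the original answer: the class name `BraceCounter` and the method name `countOutsideQuotes` are hypothetical, and it assumes quotes are never escaped.

```java
public class BraceCounter {
    // Hypothetical helper: counts occurrences of c that appear outside double quotes.
    public static int countOutsideQuotes(String s, char c) {
        boolean inQuotes = false; // flipped at every double quote, replaces the stack
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            char cr = s.charAt(i); // read in place, no copy via toCharArray()
            if (cr == '"') {
                inQuotes = !inQuotes;
            } else if (!inQuotes && cr == c) {
                count++; // count only characters outside quotes
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "sdfg\"srfg{rmjy}rmyrmy{rymundh\"ecfvr{cerv}fes{dc\"cf2234TC$#ct234\"etw243T#$c\"nhg";
        System.out.println(countOutsideQuotes(sample, '{')); // prints 2
    }
}
```

Like the original loop, this treats everything after an unclosed quote as being inside quotes, so it should return the same counts as the posted method while visiting each character only once.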

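Not sure this can be done. In your code you keep track of whether you have seen an odd or even number of " marks up to that point. As far as I know, regexes don't work that way.

Just in case: if you are trying to parse JSON, I'd recommend simply using an existing JSON parser such as Jackson.

The document isn't JSON; it looks like what I added to the post.

CPU efficiency is definitely the most important thing. I'm parsing a text file with more than 100 million lines, and nearly every line needs this function. Currently the program gets through it in only a few seconds, but that is a noticeable drag for the user. I was hoping a regex solution might be faster. The loop runs in O(n) time; I'm not sure about the time complexity of the regex.

By the way, you said the function looks for braces outside quotes, but the code seems to look inside quotes.

Internally, the regex code still uses a loop in some form, but without the benefit of highly optimized JIT-compiled code.

It finds all the braces and then subtracts the braces found inside quotes. So effectively it counts the ones outside, just in a backwards way... I guess that's what you get after rewriting it 100 times.

The approach you proposed with the regex, while it looks cleaner, is 5.4 times slower than the loop method I originally used. So maybe it is already as optimized as it can be and we just have to live with it.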