Java: How to remove part of a string containing special characters using RegEx or replaceAll?

Here are the strings:

1. "AAA BBB  CCCCC CCCCCCC"
2. "  AAA              BBB  DDDD DDDD DDDDD"
3. "    EEE         FFF  GGGGG GGGGG"
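The leading whitespace and the whitespace between the first and second words can vary. So I need a regex that removes everything before the third word, so that it always returns "CCCCC CCCCCCC", "DDDD DDDD DDDDD" or "GGGGG GGGGG".
Assume it can be done with a regex rather than by parsing all the words in the string.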

You need to use a capturing group to extract the data you want.

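import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

String result = null;

try {
    // Skip the first two words, then capture the rest of the string in group 1.
    // Note: inside [...] the '|' is a literal pipe, not alternation; [\\w ] would be enough here.
    Pattern regex = Pattern.compile("\\s*\\w+\\s*\\w+\\s*([\\w| ]+)");
    Matcher regexMatcher = regex.matcher("  AAA              BBB  DDDD DDDD DDDDD");
    if (regexMatcher.find()) {
        result = regexMatcher.group(1); // result = "DDDD DDDD DDDDD"
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}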
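One detail of the character class `[\w| ]` in the pattern above is easy to misread: inside square brackets, `|` is an ordinary character rather than alternation. The following is a minimal sketch of that behaviour; the class name `CharClassPipe` and the helper `accepts` are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

public class CharClassPipe {
    // Hypothetical helper: true when the whole input matches the regex.
    static boolean accepts(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // With '|' listed in the class, a literal pipe in the input is accepted.
        System.out.println(accepts("[\\w| ]+", "a|b c"));
        // Without it, the same input is rejected.
        System.out.println(accepts("[\\w ]+", "a|b c"));
    }
}
```

So if the remainder of the string should only contain word characters and spaces, `[\w ]+` is probably the closer fit.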
Regex explanation

"\\s" +           // Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
   "*" +            // Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
"\\w" +           // Match a single character that is a “word character” (letters, digits, and underscores)
   "+" +            // Between one and unlimited times, as many times as possible, giving back as needed (greedy)
"\\s" +           // Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
   "*" +            // Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
"\\w" +           // Match a single character that is a “word character” (letters, digits, and underscores)
   "+" +            // Between one and unlimited times, as many times as possible, giving back as needed (greedy)
"\\s" +           // Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
   "*" +            // Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
"(" +            // Match the regular expression below and capture its match into backreference number 1
   "[\\w| ]" +       // Match a single character present in the list below
                       // A word character (letters, digits, and underscores)
                       // One of the characters “| ”
      "+" +            // Between one and unlimited times, as many times as possible, giving back as needed (greedy)
")" 
This regex will work

\s*\w+\s+\w+\s+(.+$)

Java code
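A minimal sketch of how the pattern above might be applied to the three sample strings from the question. The class name `ThirdWordOnward` and the method `fromThirdWord` are hypothetical, chosen only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThirdWordOnward {
    // Optional leading whitespace, two whitespace-separated words,
    // then capture everything up to the end of the input.
    private static final Pattern PATTERN =
            Pattern.compile("\\s*\\w+\\s+\\w+\\s+(.+$)");

    // Returns the captured remainder, or null when the pattern does not match.
    static String fromThirdWord(String input) {
        Matcher m = PATTERN.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "AAA BBB  CCCCC CCCCCCC",
            "  AAA              BBB  DDDD DDDD DDDDD",
            "    EEE         FFF  GGGGG GGGGG"
        };
        for (String s : samples) {
            System.out.println("[" + fromThirdWord(s) + "]");
        }
    }
}
```

Using `\s+` rather than `\s*` between the words requires real whitespace as the separator, so a single long word cannot be split into two "words" by backtracking.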


Similar to @rock321987's answer, you can modify the regex to use a quantifier so that it skips any number of unwanted leading words

\s*(?:\w+\s+){2}(.+$)

Or in Java:

"\\s*(?:\\w+\\s+){2}(.+$)"

?: makes the pattern in () a non-capturing group. The number in {} is the number of words to skip, each followed by whitespace.
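The quantifier makes it straightforward to parameterize the number of skipped words. Below is a rough sketch under that assumption; `SkipLeadingWords` and `skipWords` are hypothetical names, and `wordsToSkip` is assumed to be non-negative (a negative value would produce an invalid pattern).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SkipLeadingWords {
    // Builds \s*(?:\w+\s+){n}(.+$) for the given n and returns group 1,
    // or null when the input does not match.
    static String skipWords(String input, int wordsToSkip) {
        Pattern p = Pattern.compile(
                "\\s*(?:\\w+\\s+){" + wordsToSkip + "}(.+$)");
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String s = "  AAA              BBB  DDDD DDDD DDDDD";
        System.out.println(skipWords(s, 2)); // skips AAA and BBB
        System.out.println(skipWords(s, 3)); // also skips the first DDDD
    }
}
```

If the same count is used repeatedly, compiling the pattern once and reusing it would avoid recompiling on every call.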

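You can't just dump your requirements here and have people do the work for you. Show your effort.
@rock321987: post that as an answer.
That is exactly the issue; there is no rationale. If there are only 2 words, should the string come back empty? Anyway, just replace
^\s*(?:\S+(?:\s+|$)){2}
with nothing.
I also need to remove the first and second words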
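The replace-with-nothing suggestion from the comments could be sketched as follows. This assumes the intended pattern is `^\s*(?:\S+(?:\s+|$)){2}`, that is, runs of non-whitespace rather than `\w+`; the class and method names are hypothetical. Because the pattern is anchored with `^`, `replaceFirst` is used here, although `replaceAll` would behave the same way for these inputs.

```java
public class DropFirstTwoWords {
    // Assumed pattern: optional leading whitespace, then two runs of
    // non-whitespace, each followed by whitespace or the end of input.
    private static final String LEADING_TWO_WORDS = "^\\s*(?:\\S+(?:\\s+|$)){2}";

    // Removes the first two words and the whitespace around them.
    static String dropFirstTwoWords(String input) {
        return input.replaceFirst(LEADING_TWO_WORDS, "");
    }

    public static void main(String[] args) {
        System.out.println("[" + dropFirstTwoWords("AAA BBB  CCCCC CCCCCCC") + "]");
        System.out.println("[" + dropFirstTwoWords("    EEE         FFF  GGGGG GGGGG") + "]");
        // With only two words, the whole string is removed.
        System.out.println("[" + dropFirstTwoWords("AAA BBB") + "]");
    }
}
```

Unlike the capturing-group approach, this returns an empty string rather than null when the input has exactly two words.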
"\\s*(?:\\w+\\s+){2}(.+$)"