Java regex capture groups: expected result?


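Consider this code:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Wrapper class added so the snippet compiles; the name is arbitrary.
    public class Main {
        public static void main(String[] args) {
            String line = "This order was placed for QT3000! OK?";
            String pattern = "(.*)(\\d+)(.*)";

            // Compile the regex into a Pattern object.
            Pattern r = Pattern.compile(pattern);

            // Create a Matcher for the input line.
            Matcher m = r.matcher(line);
            if (m.find()) {
                System.out.println("Found value: " + m.group(1));
                System.out.println("Found value: " + m.group(2));
                System.out.println("Found value: " + m.group(3));
            }
        }
    }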
The actual output is

    Found value: This order was placed for QT300
    Found value: 0
    Found value: ! OK?

whereas I expected the output to be

    Found value: This order was placed for QT3000! OK?
    Found value: 3000
    Found value: This order was placed for QT3000! OK?

The reason I expected that output is

    If pattern is  "(.*)"   output for m.group(1) is "This order was placed for QT3000! OK?"
    If pattern is  "(\\d+)" output for m.group(1) is "3000"

I don't understand why, when the pattern is `"(.*)(\\d+)(.*)"`, I don't get the expected output.

Before `\\d+` gets a chance to match, the greedy `.*` matches (and consumes) as many characters as possible. When the engine finally gets to `\\d+`, a single digit is enough for it to match.

So you need to make `.*` lazy:

    (.*?)(\\d+)(.*)
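OK, if you want the details: `.*` first matches the whole string, then backtracks one character at a time so that the rest of the regex, `(\\d+)(.*)`, can match as well. Once it has backtracked to the last character here:

    This order was placed for QT300

the rest of the regex (`(\\d+)(.*)`) is satisfied, so the match ends.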
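To see the difference side by side, here is a minimal sketch that runs both the greedy and the lazy pattern on the same line. The class name `GreedyVsLazy` and the helper `groups` are made up for this illustration; only `Pattern` and `Matcher` come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {
    // Hypothetical helper: returns groups 1..3 of the first match, or null if none.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";

        // Greedy first group: it leaves only the last digit for \d+.
        System.out.println(String.join(" | ", groups("(.*)(\\d+)(.*)", line)));
        // prints: This order was placed for QT300 | 0 | ! OK?

        // Lazy first group: it stops as soon as \d+ can start matching.
        System.out.println(String.join(" | ", groups("(.*?)(\\d+)(.*)", line)));
        // prints: This order was placed for QT | 3000 | ! OK?
    }
}
```

Note that in both cases `\\d+` itself stays greedy; only the first group changes behavior.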

This happens because the first `(.*)` is too greedy: it consumes as much as possible while still allowing `(\d+)(.*)` to match the rest of the string.

Basically, the match goes like this. At the start, the first `.*` gobbles up the whole string:

    This order was placed for QT3000! OK?
                                         ^

However, since `\d+` cannot match anything at that position, we backtrack:

    This order was placed for QT3000! OK?
                                        ^
    This order was placed for QT3000! OK?
                                       ^
    ...

    This order was placed for QT3000! OK?
                                   ^

At this position, `\d+` can match the final `0`, so we proceed, and the last `.*` matches the rest of the string.

That explains the output you are seeing.
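The backtracking walk can be imitated by hand. The sketch below is only an approximation of what the regex engine does, assuming the match attempt starts at index 0 and the input contains no line terminators; `GreedyBacktrackSketch` and `greedySplit` are hypothetical names, not part of any library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyBacktrackSketch {
    // Hypothetical helper: emulates the greedy (.*) giving characters back.
    // Returns the length group 1 ends up with, or -1 if no split works.
    static int greedySplit(String input) {
        Pattern rest = Pattern.compile("\\d+.*");
        // Start with the whole string consumed, then give back one character at a time.
        for (int i = input.length(); i >= 0; i--) {
            // lookingAt() anchors at the start of the remainder, like the engine would.
            if (rest.matcher(input.substring(i)).lookingAt()) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        int split = greedySplit(line);
        System.out.println("simulated group 1: " + line.substring(0, split));

        // Compare with what the real pattern reports.
        Matcher m = Pattern.compile("(.*)(\\d+)(.*)").matcher(line);
        if (m.find()) {
            System.out.println("actual group 1:    " + m.group(1));
        }
    }
}
```

Both lines should show `This order was placed for QT300`, because the first position (counting from the end) where a digit can start is the last `0`.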


You can fix this by making the first `(.*)` lazy:

    (.*?)(\d+)(.*)

The search for a match for `(.*?)` starts with the empty string and, each time the engine backtracks, consumes one more character, until the group holds `This order was placed for QT`. At this point `\d+` can match (`3000`), and the final `.*` can match too, which completes the match attempt, so group 2 is `3000` as you expected.
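The lazy case can be imitated the same way, just scanning in the opposite direction. Again this is a rough sketch under the assumptions that the attempt starts at index 0 and the input has no line terminators; `LazyExpansionSketch` and `lazySplit` are invented names for the illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyExpansionSketch {
    // Hypothetical helper: emulates the lazy (.*?) growing one character at a time.
    // Returns the length group 1 ends up with, or -1 if no split works.
    static int lazySplit(String input) {
        Pattern rest = Pattern.compile("\\d+.*");
        // Start with the empty string and take one more character on each failure.
        for (int i = 0; i <= input.length(); i++) {
            if (rest.matcher(input.substring(i)).lookingAt()) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        int split = lazySplit(line);
        System.out.println("simulated group 1: " + line.substring(0, split));

        // Compare with what the real lazy pattern reports.
        Matcher m = Pattern.compile("(.*?)(\\d+)(.*)").matcher(line);
        if (m.find()) {
            System.out.println("actual group 1:    " + m.group(1));
            System.out.println("actual group 2:    " + m.group(2));
        }
    }
}
```

Here the scan stops at the first digit, so group 1 is `This order was placed for QT` and the still-greedy `\d+` takes all of `3000`.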