Java regex capture groups: why is the output not what I expect?
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String line = "This order was placed for QT3000! OK?";
String pattern = "(.*)(\\d+)(.*)";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create matcher object.
Matcher m = r.matcher(line);
if (m.find()) {
    // Print the three capture groups in order
    System.out.println("Found value: " + m.group(1));
    System.out.println("Found value: " + m.group(2));
    System.out.println("Found value: " + m.group(3));
}
The output is:
Found value: This order was placed for QT300
Found value: 0
Found value: ! OK?
But I expected the output to be:
Found value: This order was placed for QT3000! OK?
Found value: 3000
Found value: This order was placed for QT3000! OK?
The reason for my expected output is:
If pattern is "(.*)" output for m.group(1) is "This order was placed for QT3000! OK?"
If pattern is "(\\d+)" output for m.group(1) is "3000"
I don't understand why, when I specify the pattern as "(.*)(\\d+)(.*)", I don't get the expected output.

Before the regex gets to \\d+, the .* matches (and consumes) as many characters as possible. By the time \\d+ gets its turn, a single digit is enough for it to match.
So you need to make the .* lazy:
(.*?)(\\d+)(.*)
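For illustration, here is a minimal sketch that runs the lazy variant (.*?)(\\d+)(.*) against the string from the question. The class name LazyGroupsDemo and the helper lazyGroups are made up for this example and are not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class LazyGroupsDemo {
    // Returns groups 1..3 for the lazy pattern, or null if there is no match.
    static String[] lazyGroups(String line) {
        Matcher m = Pattern.compile("(.*?)(\\d+)(.*)").matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] groups = lazyGroups("This order was placed for QT3000! OK?");
        for (String g : groups) {
            System.out.println("Found value: " + g);
        }
    }
}
```

With the lazy first group this prints "This order was placed for QT", then "3000", then "! OK?". Note that groups 1 and 3 hold the text before and after the number, not the whole line.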
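OK, if you want the details: .* first matches the whole string and then backtracks one character at a time so that the regex can also match the following (\\d+)(.*). After it has backed up to this point:
This order was placed for QT300
the rest of the regex, (\\d+)(.*), is satisfied, so the match ends.

This is because the first (.*) is too greedy: it consumes as much as possible while still allowing (\d+)(.*) to match the rest of the string.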
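To see exactly where the greedy first group stops, one option is to print the start and end offsets of each group with Matcher.start(int) and Matcher.end(int). This is only a sketch; the class name GreedySpansDemo and the helper greedySpans are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class GreedySpansDemo {
    // Returns {start, end} offsets for groups 1..3 of the greedy pattern,
    // or an empty array if there is no match.
    static int[][] greedySpans(String line) {
        Matcher m = Pattern.compile("(.*)(\\d+)(.*)").matcher(line);
        if (!m.find()) {
            return new int[0][];
        }
        int[][] spans = new int[3][];
        for (int g = 1; g <= 3; g++) {
            spans[g - 1] = new int[] { m.start(g), m.end(g) };
        }
        return spans;
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        int[][] spans = greedySpans(line);
        for (int g = 0; g < spans.length; g++) {
            int start = spans[g][0];
            int end = spans[g][1];
            System.out.println("group " + (g + 1) + " = [" + start + ", " + end + ") \""
                    + line.substring(start, end) + "\"");
        }
    }
}
```

For the string in the question the spans are [0, 31), [31, 32) and [32, 37): the first group keeps everything up to and including "QT300", leaving only the final 0 for \d+.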
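Basically, the match proceeds like this (the ^ marks the current position in the string). At the start, the first .* gobbles up the whole string:
This order was placed for QT3000! OK?
                                     ^
However, since there is nothing here for \d+ to match, we backtrack:
This order was placed for QT3000! OK?
                                    ^
This order was placed for QT3000! OK?
                                   ^
...
This order was placed for QT3000! OK?
                               ^
At this position, \d+ can match, so we proceed:
This order was placed for QT3000! OK?
                                ^
and .* matches the rest of the string.
That is the explanation for the output you see.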
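The backtracking described above can be imitated by hand. The sketch below is not how java.util.regex is implemented internally; it simply tries the longest candidate for the first group and shrinks it until the remainder matches (\\d+)(.*). The names GreedyBacktrackSketch and greedySplit are hypothetical, and the input is assumed to contain no line terminators.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; it imitates the greedy search, it is not the real engine.
class GreedyBacktrackSketch {
    // The part of the pattern that must match after the first group.
    private static final Pattern REST = Pattern.compile("(\\d+)(.*)");

    // Try the longest prefix first, then give back one character at a time
    // until the rest of the line matches REST. Returns the prefix length, or -1.
    static int greedySplit(String line) {
        for (int split = line.length(); split >= 0; split--) {
            if (REST.matcher(line).region(split, line.length()).lookingAt()) {
                return split;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        int split = greedySplit(line);
        System.out.println("first group ends at index " + split);
        System.out.println("first group = \"" + line.substring(0, split) + "\"");
    }
}
```

For the question's string the loop stops at index 31, so the imitated first group is "This order was placed for QT300", the same as the real output.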
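You can fix this by making the first (.*) lazy:
(.*?)(\d+)(.*)
The search for a match for (.*?) starts with the empty string, and each time it backtracks it increases the number of characters it consumes by one:
This order was placed for QT3000! OK?
^
This order was placed for QT3000! OK?
 ^
...
This order was placed for QT3000! OK?
                            ^
At this point, \d+ can match, and .* can match too, which completes the match attempt, and the second group captures 3000 as you expected.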
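The lazy search can be imitated the same way, growing the first group from the empty string instead of shrinking it from the full line. Again this is only a sketch under the assumption that the input has no line terminators; LazyExpandSketch and lazySplit are made-up names, and the result is cross-checked against the real Matcher.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; it imitates the lazy search, it is not the real engine.
class LazyExpandSketch {
    // The part of the pattern that must match after the first group.
    private static final Pattern REST = Pattern.compile("(\\d+)(.*)");

    // Start with an empty prefix and grow it one character at a time
    // until the rest of the line matches REST. Returns the prefix length, or -1.
    static int lazySplit(String line) {
        for (int split = 0; split <= line.length(); split++) {
            if (REST.matcher(line).region(split, line.length()).lookingAt()) {
                return split;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        int split = lazySplit(line);
        System.out.println("imitated end of group 1: " + split);

        // Compare with what the real lazy pattern reports for group 1.
        Matcher m = Pattern.compile("(.*?)(\\d+)(.*)").matcher(line);
        if (m.find()) {
            System.out.println("real end of group 1:     " + m.end(1));
        }
    }
}
```

Both numbers come out as 28 for the question's string, the index of the 3 in 3000, so \d+ is free to take all four digits.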