Java regex: trying to extract some text step by step, but it doesn't work

java regex

I have this code, which is supposed to retrieve the id value from a URL:

        String xmlTag = "http://www.facebook.com/profile.asp?id=123456789";
        xmlTag = xmlTag.replaceAll("/", "//");

        //regex variables
        final String regexUrl = "(?:(?:http|https):\\//\\//)?(?:www.)?facebook.com\\//(?:(?:\\w)*#!\\//)?(?:[?\\w\\-]*\\//)?(?:profile.asp\\?id=(?=\\d.*))?([\\w\\-]*)?";
        final Pattern patternUrl = Pattern.compile(regexUrl);
        final Matcher matcherUrl = patternUrl.matcher(xmlTag);  

        String urlResult = matcherUrl.group(0);         
        System.out.println("group(0) = " + urlResult);
        String regexId = "(?<=http:////www.facebook.com//profile.asp?id=).*";
        System.out.println("regexId =   " + regexId);

        final Pattern patternId = Pattern.compile(regexId);
        final Matcher matcherId = patternId.matcher(urlResult);         
        System.out.println("id = " + matcherId.matches());

Am I missing something?

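I tried:

(?:(?:http|https):////)?(?:www\\.)?facebook.com//(?:(?:[\w\-]*))?(?:profile.asp\?id=(?=\d.*))?([\\w\\-]*)?

with the sample

http:////www.facebook.com//profile.asp?id=123456789

and it worked. Try using this, and escape only the sequences that actually need escaping.

I know this is for JS, but there shouldn't be much of a difference.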
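For Java specifically, the advice to escape only what is necessary could look roughly like the sketch below. In `java.util.regex` the forward slash has no special meaning, so it needs neither a backslash nor doubling; only the literal dots and the question mark are escaped. The class name `ProfileUrlMatcher`, the method `idFromProfileUrl`, and the simplified pattern are assumptions made for illustration, not the pattern quoted above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ProfileUrlMatcher {
    // Simplified, hypothetical pattern: optional scheme, optional "www.",
    // then the literal profile path and a numeric id captured in group 1.
    // Only '.' and '?' are escaped; '/' is an ordinary character in Java regex.
    private static final Pattern PROFILE_URL = Pattern.compile(
            "(?:https?://)?(?:www\\.)?facebook\\.com/profile\\.asp\\?id=(\\d+)");

    // Returns the id if the whole input is a profile URL, otherwise null.
    public static String idFromProfileUrl(String url) {
        Matcher m = PROFILE_URL.matcher(url);
        // matches() has to succeed before group(1) may be read
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // The URL is used as-is; no replaceAll("/", "//") step is needed.
        System.out.println(idFromProfileUrl("http://www.facebook.com/profile.asp?id=123456789"));
    }
}
```

With this shape the input string stays untouched, which also removes the need for a second lookbehind pattern to pull out the id.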

If your goal is just to find the id in the URL, I would suggest a simpler regex instead of such a long one.

Example:

    String xmlTag = "http://www.facebook.com/profile.asp?id=123456789";
    String regexId = "\\?id=(\\d+)";
    final Pattern patternId = Pattern.compile(regexId);
    final Matcher matcherId = patternId.matcher(xmlTag);
    // find() must succeed before group(1) may be read
    if (matcherId.find()) {
        System.out.println("id = " + matcherId.group(1));
    } else {
        System.out.println("no id found");
    }
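If the lookup is needed in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the class name `IdExtractor` and the method `extractId` are hypothetical, and the pattern is the one from the example above, so it only recognizes `id` when it is the first query parameter.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IdExtractor {
    // Same pattern as in the example: a literal "?id=" followed by digits.
    private static final Pattern ID_PATTERN = Pattern.compile("\\?id=(\\d+)");

    // Returns the digits after "?id=", or an empty Optional if there are none.
    public static Optional<String> extractId(String url) {
        Matcher matcher = ID_PATTERN.matcher(url);
        // group(1) is only read after find() has reported a match
        if (matcher.find()) {
            return Optional.of(matcher.group(1));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extractId("http://www.facebook.com/profile.asp?id=123456789"));
        System.out.println(extractId("http://www.facebook.com/somepage"));
    }
}
```

Returning an `Optional` makes the "no id present" case explicit for the caller instead of surfacing as an `IllegalStateException` from `group`.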

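To be able to use `group`, you first need to make the pattern traverse the text. You can do that by calling `matches`, `find`, or `lookingAt` on the matcher. This is necessary because many substrings may match the regex, so `group` cannot know which of them we want to receive.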
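The difference between the three calls, and what happens when `group` is used too early, can be seen in a short sketch. The class name `GroupNeedsMatch`, the helper `groupFailsBeforeMatch`, and the sample input `id=123` are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNeedsMatch {
    // Returns true if group() throws on a matcher that has not attempted a match yet.
    public static boolean groupFailsBeforeMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        try {
            m.group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("\\d+");
        String input = "id=123";

        System.out.println(groupFailsBeforeMatch("\\d+", input)); // true

        // matches(): the entire input must match the pattern.
        System.out.println(p.matcher(input).matches());   // false
        // lookingAt(): the match must start at the beginning of the input.
        System.out.println(p.matcher(input).lookingAt()); // false
        // find(): the match may start anywhere in the input.
        Matcher m = p.matcher(input);
        if (m.find()) {
            System.out.println(m.group()); // 123
        }
    }
}
```

This is the situation in the question: `group(0)` is called on `matcherUrl` before any of the three methods has run, so the matcher has no match to report.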

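Say we have the regex `a(\w)`, which finds two characters where the first one is `a`, and we only want the second one. For data like `abacad`, what should calling `group(1)` on the matcher return: `b`, `c`, or `d`? The regex engine cannot know which of them we are interested in, and `group` can return only one value at a time. So it is our job to let the engine traverse the text and find a match before we can use it (or some part of it).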
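The `abacad` case can be tried out directly. In the sketch below each call to `find` moves the matcher to the next match, and `group(1)` then refers to that match only. The class name `SecondLetters` and the method `lettersAfterA` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SecondLetters {
    // Collects group(1) from every match of a(\w) in the input.
    public static List<String> lettersAfterA(String input) {
        Matcher m = Pattern.compile("a(\\w)").matcher(input);
        List<String> result = new ArrayList<>();
        // Each successful find() selects the next match; group(1) reads from it.
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(lettersAfterA("abacad")); // [b, c, d]
    }
}
```

Calling `find` once would give only `b`; the loop is what makes all three candidates reachable.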

@Pshemo, you are right: since find is used rather than matches, the .* is not needed. Thanks for catching that, I will fix it in the answer.
