Java regex: matching URLs with spaces and parentheses

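Using a Java regex, I cannot match URLs that contain spaces or the parentheses ( and ). A code sample is below; any help is appreciated. Only the last URL, the one ending in E.jpeg, is matched correctly.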

Code:

public static void main(String[] args) {
    String content = "Lorem ipsum https://example.com/A B 123 4.pdf   https://example.com/(C.jpeg   https://example.com/D).jpeg   https://example.com/E.jpeg";
    extractUrls(content);
}

public static void extractUrls(String text) {
    Pattern pat = Pattern.compile("(https?)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]", Pattern.CASE_INSENSITIVE);
    Matcher matcher = pat.matcher(text);
    while (matcher.find()) {
        System.out.println(matcher.group());
    }
}

Output:

https://example.com/A
https://example.com/
https://example.com/D
https://example.com/E.jpeg

Expected output:

https://example.com/A B 123 4.pdf
https://example.com/(C.jpeg
https://example.com/D).jpeg
https://example.com/E.jpeg

See the code below:

import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class MyClass {
    public static void main(String[] args) {
        String content = "Lorem ipsum https://example.com/A B 123 4.pdf   https://example.com/(C.jpeg   https://example.com/D).jpeg   https://example.com/E.jpeg";
        extractUrls(content);
    }

    public static void extractUrls(String text) {
        Pattern pat = Pattern.compile("(https?)://(([\\S]+)(\\s)?)*", Pattern.CASE_INSENSITIVE);
        Matcher matcher = pat.matcher(text);
        while (matcher.find()) {
            System.out.println(matcher.group());
        }
    }
}

Output:

https://example.com/A B 123 4.pdf
https://example.com/(C.jpeg
https://example.com/D).jpeg
https://example.com/E.jpeg

Explanation:

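I assume that file names never contain two consecutive spaces, as in the example.

(https?):// matches the prefix http:// or https://.

Two groups follow: ([\\S]+)(\\s)?. Together they match one or more non-whitespace characters followed by at most one whitespace character.

The * quantifier lets this pair repeat any number of times.

As a result, the expression treats a run of two or more whitespace characters as the separator between two URLs. Note that each match may keep a single trailing whitespace character, because the optional (\\s)? can still consume the first whitespace character of the separator.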
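
The same idea can be sketched without the trailing whitespace. The pattern below is a variation, not the one from the answer: it only accepts a whitespace character when more non-whitespace text follows it. The class name SingleSpaceUrlSketch and the list-returning extractUrls are assumptions made for illustration. It keeps the same assumption that URLs are separated by at least two whitespace characters, so a URL followed by a single space and ordinary words would still swallow those words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SingleSpaceUrlSketch {
    // Hypothetical variation: runs of non-whitespace joined by exactly one
    // whitespace character, so a match can never end in whitespace.
    private static final Pattern URL =
            Pattern.compile("https?://\\S+(?:\\s\\S+)*", Pattern.CASE_INSENSITIVE);

    // Collects every match instead of printing it, so callers can inspect the result.
    static List<String> extractUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = URL.matcher(text);
        while (matcher.find()) {
            urls.add(matcher.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String content = "Lorem ipsum https://example.com/A B 123 4.pdf   https://example.com/(C.jpeg   https://example.com/D).jpeg   https://example.com/E.jpeg";
        // Brackets make any leading or trailing whitespace visible.
        for (String url : extractUrls(content)) {
            System.out.println("[" + url + "]");
        }
    }
}
```

Moving the whitespace inside the repeated group, in front of the text it joins, is what keeps the separator out of the match.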

I hope this helps.

The answer from user "The fourth bird" solved the problem. The regex should be:

http.*?\.(?:pdf|jpe?g)
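
A minimal sketch of how that pattern might be wired into the question's code follows. The class name ExtensionAnchoredUrlSketch and the list-returning method are assumptions for illustration. The approach relies on every URL ending in one of the listed extensions: a URL with any other ending would keep expanding until the next listed extension appears later in the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtensionAnchoredUrlSketch {
    // The reluctant quantifier .*? expands one character at a time and stops
    // at the first dot that is followed by pdf, jpg or jpeg.
    private static final Pattern URL = Pattern.compile("http.*?\\.(?:pdf|jpe?g)");

    static List<String> extractUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = URL.matcher(text);
        while (matcher.find()) {
            urls.add(matcher.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String content = "Lorem ipsum https://example.com/A B 123 4.pdf   https://example.com/(C.jpeg   https://example.com/D).jpeg   https://example.com/E.jpeg";
        for (String url : extractUrls(content)) {
            System.out.println(url);
        }
    }
}
```

Because the match is anchored on the file extension rather than on whitespace, spaces and parentheses inside the path need no special handling.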

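Try using a non-greedy quantifier:
http.*?\.(?:pdf|jpe?g)
or make the . more specific by using a character class. I think URLs use "+" instead of spaces.

Hello "The fourth bird": for my text I used https.*?\.(?:jpg|jpeg|png|pdf|doc|docx), but it drops the "x" from "docx".

Use docx|doc instead. Alternatives are tried from left to right, so doc matches first and the match ends before the x is reached.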
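
The effect of alternation order can be checked with a short sketch. The helper firstMatch, the class name, and the sample URL are hypothetical and exist only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationOrderSketch {
    // Returns the first match of regex in text, or null when nothing matches.
    static String firstMatch(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String text = "see https://example.com/report.docx for details";

        // doc is listed before docx, so the match stops after "doc".
        System.out.println(firstMatch("https.*?\\.(?:jpg|jpeg|png|pdf|doc|docx)", text));
        // prints https://example.com/report.doc

        // Listing the longer alternative first keeps the trailing "x".
        System.out.println(firstMatch("https.*?\\.(?:jpg|jpeg|png|pdf|docx|doc)", text));
        // prints https://example.com/report.docx
    }
}
```

Writing the pair as docx? would behave the same way, since the greedy ? takes the x whenever it is present.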