
JSoup.clean() does not preserve relative URLs


I tried:

Whitelist.relaxed();
Whitelist.relaxed().preserveRelativeLinks(true);
Whitelist.relaxed().addProtocols("a","href","#","/","http","https","mailto","ftp");
Whitelist.relaxed().addProtocols("a","href","#","/","http","https","mailto","ftp").preserveRelativeLinks(true);
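None of them works: when I try to clean a relative URL, such as <a href="/test.xhtml">test</a>, the href attribute is removed and I get <a>test</a>.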

I am using jsoup 1.8.2.


Any ideas?

The problem most likely stems from how you call the clean method. If you pass a base URI, relative links should be kept as expected:

String html = ""
        + "<a href=\"/test.xhtml\">test</a>"
        + "<invalid>stuff</invalid>"
        + "<h2>header1</h2>";
String cleaned = Jsoup.clean(html, "http://base.uri", Whitelist.relaxed().preserveRelativeLinks(true));
System.out.println(cleaned);
The code above works and keeps the relative link. With
String cleaned = Jsoup.clean(html, Whitelist.relaxed().preserveRelativeLinks(true));
however, the href is removed, because no base URI is supplied.

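Note the Javadoc for Whitelist.preserveRelativeLinks:

Note that when handling relative links, the input document must have an appropriate base URI set when parsing, so that the link's protocol can be confirmed. Regardless of the setting of the preserve relative links option, the link must be resolvable against the base URI to an allowed protocol; otherwise the attribute will be removed.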
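
To illustrate why the base URI matters, here is a minimal sketch of the kind of protocol check the note describes, written with java.net.URI only. It is not jsoup's actual implementation; the class name RelativeLinkCheck, the method hasAllowedProtocol, and the ALLOWED set are hypothetical names chosen for this example.

```java
import java.net.URI;
import java.util.Set;

public class RelativeLinkCheck {
    // Hypothetical allow-list, similar in spirit to the protocols used in the question.
    static final Set<String> ALLOWED = Set.of("http", "https", "ftp", "mailto");

    // Resolves href against baseUri (if one is given) and reports whether
    // the resulting URI has a scheme from the allow-list.
    static boolean hasAllowedProtocol(String baseUri, String href) {
        URI resolved = baseUri.isEmpty()
                ? URI.create(href)                    // no base: a relative href stays relative
                : URI.create(baseUri).resolve(href);  // base given: href becomes absolute
        String scheme = resolved.getScheme();         // null for a relative URI
        return scheme != null && ALLOWED.contains(scheme.toLowerCase());
    }

    public static void main(String[] args) {
        // With a base URI, "/test.xhtml" resolves to http://base.uri/test.xhtml.
        System.out.println(hasAllowedProtocol("http://base.uri", "/test.xhtml")); // true
        // Without a base URI there is no scheme to verify, so the check fails.
        System.out.println(hasAllowedProtocol("", "/test.xhtml"));                // false
    }
}
```

Under these assumptions, a relative href only passes the check once it can be resolved to an absolute URI, which matches the behavior seen with the two-argument and three-argument forms of Jsoup.clean above.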