Java: problem reading XHTML tags with XPath

java xpath xhtml

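I am using XPath to read an XHTML document, and I want to read all the elements inside the <p> tags of the XHTML file. To do that, I am doing something like this: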

XPath xpath = XPathFactory.newInstance().newXPath();                
XPathExpression expr = xpath.compile("//p[2]/*");                 
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
    System.out.println("Nodes>>>>>>>>"+nodes.item(i).getNodeValue());
}

The sample XHTML looks like this:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<html xmlns="http://www.w3.org/1999/xhtml">
    <head><title>test</title></head>
    <body>
        <p class="default"> <span style="color: #000000; font-size: 12pt; font-family: sans-serif"> Test Doc</span> </p> 
        <p class="default"> <span style="color: #000000; font-size: 12pt; font-family: sans-serif"> Test Doc1</span> </p>
        <p class="default"> <span style="color: #000000; font-size: 12pt; font-family: sans-serif"> Test Doc2</span> </p>
    </body>
</html>


But I cannot get the nodes inside the <p> tag; execution never enters the for loop.

Can anyone help me solve this?

Thanks in advance.

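Your code is trying to print the nodeValue of element nodes, which is unlikely to be what you want. I expect you want the nodeValue of the text nodes.

Another problem may be namespaces. The XPath appears to be trying to match p elements in no namespace, whereas it probably should be matching p elements in the http://www.w3.org/1999/xhtml namespace.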
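Both points can be sketched together using only the JDK's own XPath API. This is a minimal illustration, not the asker's actual setup: the prefix `x`, the class name `XhtmlNamespaceExample`, the method `textOfSecondParagraph`, and the cut-down `SAMPLE` document are assumptions made for the example. It assumes the document is parsed with a namespace-aware `DocumentBuilderFactory`, binds a prefix to the XHTML namespace, and selects text nodes rather than elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XhtmlNamespaceExample {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Cut-down version of the sample document from the question (styles omitted).
    static final String SAMPLE =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\"><head><title>test</title></head><body>"
        + "<p class=\"default\"> <span> Test Doc</span> </p>"
        + "<p class=\"default\"> <span> Test Doc1</span> </p>"
        + "<p class=\"default\"> <span> Test Doc2</span> </p>"
        + "</body></html>";

    /** Returns the trimmed, non-blank text nodes found under the second p element. */
    public static List<String> textOfSecondParagraph(String xhtml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // without this, namespaces are not tracked in the DOM
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind the (arbitrary) prefix "x" to the XHTML namespace.
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return "x".equals(prefix) ? XHTML_NS : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String uri) {
                return XHTML_NS.equals(uri) ? "x" : null;
            }
            @Override public Iterator<String> getPrefixes(String uri) {
                String prefix = getPrefix(uri);
                return prefix == null
                        ? Collections.<String>emptyIterator()
                        : Collections.singletonList(prefix).iterator();
            }
        });

        // Select text nodes, whose nodeValue is the text itself.
        NodeList nodes = (NodeList) xpath.evaluate(
                "//x:p[2]//text()", doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            String value = nodes.item(i).getNodeValue().trim();
            if (!value.isEmpty()) {
                values.add(value); // skip whitespace-only text nodes
            }
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        for (String value : textOfSecondParagraph(SAMPLE)) {
            System.out.println("Nodes>>>>>>>> " + value);
        }
    }
}
```

The prefix used in the expression does not have to match anything in the document; only the namespace URI it is bound to matters.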

       XPathExpression expr = xpath.compile(".//*[local-name()='p'][@id='ur_id']");

Can you check this? I think it will get you your node. It would also be worth reading up on the basics of XPath for parsing.
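The `local-name()` approach can be tried out with a small self-contained sketch like the one below. The class name `LocalNameLookup`, the method `paragraphTextById`, and the `id` attributes in `SAMPLE` are hypothetical (the sample document in the question has no `id` attributes). The id is passed in through an XPath variable rather than string concatenation, which is a design choice of this sketch and not part of the answer above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class LocalNameLookup {
    // Hypothetical sample: the ids are added for illustration only.
    static final String SAMPLE =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\"><head><title>test</title></head><body>"
        + "<p id=\"first\"> <span> Test Doc</span> </p>"
        + "<p id=\"second\"> <span> Test Doc1</span> </p>"
        + "</body></html>";

    /** Returns the trimmed text of the p element with the given id, or null if none matches. */
    public static String paragraphTextById(String xhtml, final String id) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Resolve the $id variable used in the expression below.
        xpath.setXPathVariableResolver(
                name -> "id".equals(name.getLocalPart()) ? id : null);

        // local-name() compares the element name while ignoring its namespace.
        Node p = (Node) xpath.evaluate(
                "//*[local-name()='p'][@id=$id]", doc, XPathConstants.NODE);
        return p == null ? null : p.getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(paragraphTextById(SAMPLE, "second"));
    }
}
```

Matching on `local-name()` avoids setting up a `NamespaceContext`, at the cost of also matching `p` elements from any other namespace.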

You can use the XPathAPI library to extract the nodes as generic Java lists:

// Needs java.util.HashMap, java.util.List and java.util.Map.
String expr = "//p[2]/*";

// Map the "html" prefix to the XHTML namespace URI.
Map<String, String> ns = new HashMap<String, String>();
ns.put("html", "http://www.w3.org/1999/xhtml");

List<String> nodeValues = XPathAPI.html.selectNodeListAsStrings(doc, expr, ns);
for (String nodeValue : nodeValues) {
    System.out.println("Nodes>>>>>>>> " + nodeValue);
}

or

List<Node> nodes = XPathAPI.html.selectListOfNodes(doc, expr, ns);
for (Node node : nodes) {
    System.out.println("Nodes>>>>>>>> " + node.getTextContent());
}

Disclaimer: I am the author of the XPathAPI library.

I am a newbie; can you give a detailed answer?

Please add a sample XHTML to your question (a complete file, including the html tag) that you expect to work but doesn't.

If you are using namespaces, that may be why you cannot access the tag. In that case you can refine your XPath expression to "//*[local-name()='p']". This returns the nodes regardless of namespace.

@Alohci, I have edited my question and added a sample XHTML file; please have a look.

@Krishnanunni, I can now get the node values using local-name, thanks for your time. If I have multiple paragraphs and want to access a specific one based on, say, some id, how do I handle that?

//XXX[@attrib='abc'] will select the nodes whose attribute attrib equals 'abc'.