Java: Getting the XPath of a substring in an XML document


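I need to find the exact XPath of a piece of text in an XML document. One approach I thought of is to convert the document to a string, wrap the substring in a temporary tag, convert the string back to a document, and then look up the XPath of that tag.

This is what I have so far: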

public String findXPathInXMLString(int startIndex, int endIndex, String string) throws IOException, ParserConfigurationException, SAXException {
    Conversion conversion = new Conversion();
    String xpath;

    //Step 1. Replace start to end index with temporary tag in string document
    StringBuilder stringBuilder = new StringBuilder(string);
    stringBuilder.replace(startIndex, endIndex, "<findXPathInXMLStringTemporaryTag>" + string.substring(startIndex, endIndex) + "</findXPathInXMLStringTemporaryTag>");

    //Step 2. Convert string document to DOM document & Find XPath of temporary tag in DOM document
    xpath = "/" + getXPath(conversion.stringToDocument(stringBuilder.toString()), "findXPathInXMLStringTemporaryTag");

    //Step 3. Cut off last part of the XPath
    //xpath = xpath.substring(0, 2).replace("/documentXPathTemporaryTag", "");

    //Step 4. Return the XPath
    return xpath;
}

public String getXPath(Document root, String elementName) {
    try {
        XPathExpression expr = XPathFactory.newInstance().newXPath().compile("//" + elementName);
        Node node = (Node) expr.evaluate(root, XPathConstants.NODE);

        if (node != null) {
            return getXPath(node);
        }
    } catch (XPathExpressionException e) {
    }

    return null;
}

public String getXPath(Node node) {
    if (node == null || node.getNodeType() != Node.ELEMENT_NODE) {
        return "";
    }
    return getXPath(node.getParentNode()) + "/" + node.getNodeName();
}
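
The problem I have so far is that the getXPath method does not add a positional index ([x]), so the returned XPath is wrong: the substring may sit in, say, the third ([3]) instance of a particular tag, in which case the XPath matches every node that shares the same path. I want an exact path that can refer to only one specific element.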
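
To make the missing [x] concrete, here is a minimal sketch of one way to compute a positional index while building the path. It walks the preceding siblings of each element and counts those with the same name. The class name `IndexedPathSketch`, the method name `indexedPath`, and the sample XML are assumptions made for illustration; they are not part of the code above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class IndexedPathSketch {

    /** Builds a path such as /root[1]/child[1]/gc[2] for the given element. */
    static String indexedPath(Node node) {
        if (node == null || node.getNodeType() != Node.ELEMENT_NODE) {
            return ""; // recursion ends at the Document node
        }
        int position = 1; // XPath positions are 1-based
        for (Node sib = node.getPreviousSibling(); sib != null; sib = sib.getPreviousSibling()) {
            // Count only earlier element siblings that share this element's name
            if (sib.getNodeType() == Node.ELEMENT_NODE
                    && sib.getNodeName().equals(node.getNodeName())) {
                position++;
            }
        }
        return indexedPath(node.getParentNode())
                + "/" + node.getNodeName() + "[" + position + "]";
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document
        String xml = "<root><child><gc>a</gc><gc>b</gc></child></root>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Node secondGc = doc.getElementsByTagName("gc").item(1);
        System.out.println(indexedPath(secondGc)); // prints /root[1]/child[1]/gc[2]
    }
}
```

Emitting the index on every step, including [1], keeps each step unambiguous even when an element is the first of several same-named siblings.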

OK, how about this?

I changed startIndex and endIndex to a single index, so that a temporary node can be inserted at one point in the text.

public static String findXPathInXMLString(int index, String string) throws XPathExpressionException, SAXException, ParserConfigurationException, IOException {
    String xpath;

    //Step 1. Insert temporary tag in insert location
    StringBuilder stringBuilder = new StringBuilder(string);
    stringBuilder.insert(index, "<findXPathInXMLStringTemporaryTag />");

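    // Parse from a Reader so the result does not depend on the platform default charset
    // (needs java.io.StringReader and org.xml.sax.InputSource)
    Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(
        new InputSource(new StringReader(stringBuilder.toString()))
      );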

    //Step 2. Convert string document to DOM document & Find XPath of temporary tag in DOM document
    xpath = getXPath(document, "findXPathInXMLStringTemporaryTag");

    //Step 3. Cut off last part of the XPath
    xpath = xpath.replace("/findXPathInXMLStringTemporaryTag", "");

    //Step 4. Return the XPath
    return xpath;
}

private static String getXPath(Document root, String elementName) throws XPathExpressionException 
{
  XPathExpression expr = XPathFactory.newInstance().newXPath().compile("//"+elementName);
  Node node = (Node)expr.evaluate(root, XPathConstants.NODE);


  if(node != null) {
      return getXPath(node);
  }

  return null;
}

private static String getXPath(Node node) throws XPathExpressionException {
    if(node == null || node.getNodeType() != Node.ELEMENT_NODE) {
        return "";
    }

    return getXPath(node.getParentNode()) + "/" + node.getNodeName() + getIndex(node);
}

private static String getIndex(Node node) throws XPathExpressionException {
    // name() is compared with getNodeName() so that prefixed element names (e.g. a:b) still match
    XPathExpression expr = XPathFactory.newInstance().newXPath().compile("count(preceding-sibling::*[name() = '" + node.getNodeName() + "'])");
    int result = (int)(double)(Double)expr.evaluate(node, XPathConstants.NUMBER);

    if(result == 0){
        return "";
    }
    else {
        return "[" + (result + 1) + "]";
    }
}
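
The marker approach can also be exercised end to end. The following is a sketch under assumptions, not a replacement for the code above: it inserts the marker at a character offset, takes the marker's parent node directly instead of stripping the tag name from the path string, and then evaluates the resulting path against the original document to check that it selects exactly one node. The names `MarkerXPathSketch`, `xpathAtOffset`, `indexedPath`, and `countMatches`, and the sample XML, are hypothetical. The offset is assumed to fall inside text content; an offset inside a tag makes the parser throw a SAXException.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class MarkerXPathSketch {
    private static final String MARKER = "findXPathInXMLStringTemporaryTag";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Path with an explicit 1-based index on every step. */
    static String indexedPath(Node node) {
        if (node == null || node.getNodeType() != Node.ELEMENT_NODE) {
            return "";
        }
        int position = 1;
        for (Node sib = node.getPreviousSibling(); sib != null; sib = sib.getPreviousSibling()) {
            if (sib.getNodeType() == Node.ELEMENT_NODE
                    && sib.getNodeName().equals(node.getNodeName())) {
                position++;
            }
        }
        return indexedPath(node.getParentNode())
                + "/" + node.getNodeName() + "[" + position + "]";
    }

    /** XPath of the element whose text contains the given character offset. */
    static String xpathAtOffset(String xml, int index) throws Exception {
        String marked = new StringBuilder(xml).insert(index, "<" + MARKER + "/>").toString();
        Node marker = parse(marked).getElementsByTagName(MARKER).item(0);
        return indexedPath(marker.getParentNode());
    }

    /** Number of nodes the path selects in the original, unmarked document. */
    static int countMatches(String xml, String xpath) throws Exception {
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, parse(xml), XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document
        String xml = "<root><child><gc>first</gc><gc>second</gc></child></root>";
        String path = xpathAtOffset(xml, xml.indexOf("second"));
        System.out.println(path);                    // prints /root[1]/child[1]/gc[2]
        System.out.println(countMatches(xml, path)); // prints 1
    }
}
```

Because the empty marker is a child of the target element rather than a wrapper around it, inserting it does not change the position of any existing element among its siblings.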

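This doesn't work. If you temporarily wrap part of the XML in a new node and then get the XPath, the index ([x]) may differ from the one in the original document. Perhaps you should explain what your end goal is and why you want to do this, so that someone can tell you how to do it.

OK, suppose you have some XML and you want the path of the second gc, so you put a temporary node around it. Now you get the path of temp, which is /root/child/temp. Remove /temp from that path and you get /root/child. That still is not the path of the second gc.

Is there a reason you refuse to explain why you want to do this?

You surely mean the perfect balance between defined and undefined, right?

Until you clearly define and explain what you are trying to do, I can't help. Nobody can help you until you do.

Great, very good. Thank you very much. Every test I have run on it so far has shown that it works.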
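
The objection about wrapping can be reproduced with a short sketch. The XML sample from the comments did not survive in this copy of the thread, so the document below is a hypothetical stand-in, and `WrapPitfallSketch`, `plainPath`, and `strippedWrapperPath` are made-up names. The code wraps the second gc in a temp element, builds the path of temp without indexes, and strips the wrapper's name from it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class WrapPitfallSketch {

    /** Name-only path, like the getXPath(Node) method in the question. */
    static String plainPath(Node node) {
        if (node == null || node.getNodeType() != Node.ELEMENT_NODE) {
            return "";
        }
        return plainPath(node.getParentNode()) + "/" + node.getNodeName();
    }

    /** Path of the wrapper element with the wrapper's own step removed. */
    static String strippedWrapperPath(String wrappedXml, String wrapperName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(wrappedXml)));
        Node wrapper = doc.getElementsByTagName(wrapperName).item(0);
        return plainPath(wrapper).replace("/" + wrapperName, "");
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document in which the second gc has been wrapped in temp
        String wrapped = "<root><child><gc/><temp><gc/></temp></child></root>";
        System.out.println(strippedWrapperPath(wrapped, "temp")); // prints /root/child
    }
}
```

The result names the parent of the wrapper, not the wrapped gc, which is the point the comment makes.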