Java: Extracting a node and all of its content from an XML document with namespaces


Given an XML file with the following namespaces:

<ptk:PrintTalk xmlns:ptk="http://linkToNameSpace" xmlns:xjdf="http://linkToNamespace">
 <ptk:Request>
  <ptk:PurchaseOrder Currency="EUR">
   <xjdf:XJDF name="someName" version="2.0">
     <xjdf:ProductList>
      <xjdf:Product>
       ...
      </xjdf:Product>
      <xjdf:OtherProduct>
       ...
      </xjdf:OtherProduct> 
      and many other products
     </xjdf:ProductList>
     <xjdf:ParameterSet>
      <xjdf:Parameter>
       ...
      </xjdf:Parameter> and so on until
   </xjdf:XJDF>
  </ptk:PurchaseOrder>
 </ptk:Request>
</ptk:PrintTalk>

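But these expressions did not give me the result I wanted. I am using IntelliJ IDEA's built-in XPath expression evaluator, and the programming language is Java. No XPath library, only javax.xml.*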

Update

With the expression I tried, I get each node as a separate node without any of its child nodes in it, e.g.

<xjdf:ProductList>
 <xjdf:Product>
  ...
 </xjdf:Product>
</xjdf:ProductList> (here the product tag is a child of product list tag)

results in

<xjdf:ProductList>
<xjdf:Product>

The Java code I use to do this:

@Override
public XJDF readFrom(
    final Class<XJDF> type, final Type genericType, final Annotation[] annotations, final MediaType mediaType,
    final MultivaluedMap<String, String> multivaluedMap, final InputStream inputStream
) throws IOException {
    try {
        DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
        Document documentPtk = documentBuilder.parse(new InputSource(inputStream));
        XPathFactory xPathFactory = XPathFactory.newInstance();
        XPath xPath = xPathFactory.newXPath();
        XPathExpression xPathExpression = xPath.compile("//ptk:PurchaseOrder//*");
        Document documentXjdf = (Document) xPathExpression.evaluate(documentPtk, XPathConstants.NODE);
    } catch (Exception e) {
        throw new WebApplicationException("PrintTalk document could not be deserialized.", e);
    }
}

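There are three key points here:

  • By default, DocumentBuilderFactory is not namespace-aware; you must call setNamespaceAware(true) before creating the DocumentBuilder
  • XPath does not use the namespace prefix mappings from the XML document; it uses its own NamespaceContext
  • The Node returned by this query will not be a Document but an Element
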
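To make the first point concrete, here is a minimal sketch that parses the same input twice and asks the DOM for the root element's namespace URI. The class name NamespaceAwareDemo and the shortened sample XML are assumptions made for illustration; only JDK classes are used.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NamespaceAwareDemo {

    // Hypothetical sample input, shortened from the document in the question.
    static final String XML =
        "<ptk:PrintTalk xmlns:ptk=\"http://linkToNameSpace\"><ptk:Request/></ptk:PrintTalk>";

    // Parses XML and returns the namespace URI the DOM reports for the root element.
    static String rootNamespace(boolean namespaceAware) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(namespaceAware); // the factory default is false
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        return doc.getDocumentElement().getNamespaceURI();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("namespace-aware off: " + rootNamespace(false));
        System.out.println("namespace-aware on:  " + rootNamespace(true));
    }
}
```

Without namespace awareness the element carries no namespace URI, so a prefixed XPath step such as ptk:PurchaseOrder has nothing to match against.
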
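Annoyingly, there is no default NamespaceContext implementation in the Java core class library, so you must either use a third-party implementation (as I usually do) or write your own implementation of the interface.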
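If you would rather not pull in a third-party library, writing the interface yourself is short. The following is only a sketch under assumptions: the class name MapNamespaceContext, its bind method, and the helper purchaseOrderCurrency are invented for this example and are not part of any library.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical map-backed NamespaceContext; not a library class.
class MapNamespaceContext implements NamespaceContext {

    private final Map<String, String> prefixToUri = new HashMap<>();

    // Registers a prefix for use in XPath expressions; returns this for chaining.
    MapNamespaceContext bind(String prefix, String uri) {
        prefixToUri.put(prefix, uri);
        return this;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        // Unbound prefixes map to the empty "no namespace" URI.
        return prefixToUri.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
    }

    @Override
    public String getPrefix(String namespaceURI) {
        Iterator<String> prefixes = getPrefixes(namespaceURI);
        return prefixes.hasNext() ? prefixes.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) {
                result.add(entry.getKey());
            }
        }
        return result.iterator();
    }

    // Illustrative helper: reads the Currency attribute of the PurchaseOrder.
    static String purchaseOrderCurrency(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        // The prefix "p" is chosen here; it need not match the document's "ptk".
        xPath.setNamespaceContext(
                new MapNamespaceContext().bind("p", "http://linkToNameSpace"));
        return xPath.evaluate("/p:PrintTalk/p:Request/p:PurchaseOrder/@Currency", doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<ptk:PrintTalk xmlns:ptk=\"http://linkToNameSpace\"><ptk:Request>"
                + "<ptk:PurchaseOrder Currency=\"EUR\"/></ptk:Request></ptk:PrintTalk>";
        System.out.println(purchaseOrderCurrency(xml));
    }
}
```

Note that the prefix in the XPath expression is resolved through this context only, which is why "p" works even though the document itself uses "ptk".
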

Here is an example using SimpleNamespaceContext:

DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
documentBuilderFactory.setNamespaceAware(true);
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document documentPtk = documentBuilder.parse(new InputSource(inputStream));
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();

SimpleNamespaceContext nsCtx = new SimpleNamespaceContext();
nsCtx.bindNamespaceUri("p", "http://linkToNameSpace");
xPath.setNamespaceContext(nsCtx);

XPathExpression xPathExpression = xPath.compile("/p:PrintTalk/p:Request/p:PurchaseOrder/*");
Element documentXjdf = (Element) xPathExpression.evaluate(documentPtk, XPathConstants.NODE);
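Since the question asks for a new w3c Document built from the extracted content, one possible follow-up step is to deep-copy the selected Element into an empty Document with importNode. This is a sketch, not the answer's own code: the class name ExtractToNewDocument and the sample XML are assumptions, and to stay self-contained it locates the element with getElementsByTagNameNS instead of XPath.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ExtractToNewDocument {

    // Namespace URI bound to the xjdf prefix in the question's sample.
    static final String XJDF_NS = "http://linkToNamespace";

    // Hypothetical sample input, shortened from the document in the question.
    static final String SAMPLE =
        "<ptk:PrintTalk xmlns:ptk=\"http://linkToNameSpace\" xmlns:xjdf=\"http://linkToNamespace\">"
        + "<ptk:Request><ptk:PurchaseOrder Currency=\"EUR\">"
        + "<xjdf:XJDF name=\"someName\" version=\"2.0\">"
        + "<xjdf:ProductList><xjdf:Product/></xjdf:ProductList>"
        + "</xjdf:XJDF></ptk:PurchaseOrder></ptk:Request></ptk:PrintTalk>";

    // Parses the input and returns a new Document whose root is a deep copy of XJDF.
    static Document extractXjdf(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document source = builder.parse(new InputSource(new StringReader(xml)));

        // Stand-in for the XPath evaluation; any way of obtaining the Element works.
        Element xjdf = (Element) source.getElementsByTagNameNS(XJDF_NS, "XJDF").item(0);
        if (xjdf == null) {
            throw new IllegalArgumentException("no XJDF element found");
        }

        Document target = builder.newDocument();
        Node copy = target.importNode(xjdf, true); // true = copy the whole subtree
        target.appendChild(copy);
        return target;
    }

    public static void main(String[] args) throws Exception {
        Document result = extractXjdf(SAMPLE);
        System.out.println(result.getDocumentElement().getNodeName());
    }
}
```

importNode leaves the source document untouched; adoptNode would move the subtree instead of copying it.
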

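What tool/language/library are you using to execute these XPath expressions? What output did you get? Nothing at all, plain text without XML tags, or something else? Keep in mind that XPath is a language for selecting nodes from an XML file; what you do with those nodes once they are selected (printing their string value, serializing them as XML, etc.) is a separate matter.

I updated my question to provide more information. What I need to do is extract the required content from a w3c Document and create a new w3c Document from the extracted content in Java.

Please post your Java code.

It solves many problems out of the box (related to namespaces, type conversion, object mapping, IO). Using it saves a lot of code.

Thank you very much for your answer, Ian! Thanks for pointing out that a namespace context has to be implemented, I didn't know that before! Since the returned node type will be an Element, how do I convert it to a byte array? IMO, casting it will not work, because the constructor of DOMSource takes a Node as a parameter.

@ArthurEirich You would have to use a no-op Transformer to serialize it from a DOMSource to a StreamResult, or use the standard DOM LSSerializer mechanism. But are you sure that is necessary? What do you need to do with the XML next? If you are using something like JAXB or XStream to convert the XML to objects, then you should be able to unmarshal directly from the DOM.

Then I need to create another Java class from this byte array.

@ArthurEirich In that case yes, either the Transformer or the LSSerializer approach will work, using a ByteArrayOutputStream as the target.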
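The two serialization routes mentioned in the comments could look roughly like the sketch below. The class name ElementToBytes, its method names, and the sample XML are assumptions for illustration; the element is again located with getElementsByTagNameNS to keep the example self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

class ElementToBytes {

    // Hypothetical sample input, shortened from the document in the question.
    static final String SAMPLE =
        "<ptk:PrintTalk xmlns:ptk=\"http://linkToNameSpace\" xmlns:xjdf=\"http://linkToNamespace\">"
        + "<ptk:Request><ptk:PurchaseOrder Currency=\"EUR\">"
        + "<xjdf:XJDF name=\"someName\" version=\"2.0\">"
        + "<xjdf:ProductList><xjdf:Product/></xjdf:ProductList>"
        + "</xjdf:XJDF></ptk:PurchaseOrder></ptk:Request></ptk:PrintTalk>";

    // Route 1: a Transformer created without a stylesheet copies input to output.
    static byte[] withTransformer(Element element) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Element is a Node, so it can be passed to DOMSource directly.
        identity.transform(new DOMSource(element), new StreamResult(out));
        return out.toByteArray();
    }

    // Route 2: the DOM Level 3 Load and Save serializer.
    static byte[] withLsSerializer(Element element) {
        DOMImplementationLS ls = (DOMImplementationLS)
                element.getOwnerDocument().getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        LSOutput output = ls.createLSOutput();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        output.setByteStream(out);
        output.setEncoding("UTF-8");
        serializer.write(element, output);
        return out.toByteArray();
    }

    // Parses the sample and returns its XJDF element (stand-in for the XPath step).
    static Element sampleXjdf() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
        return (Element) doc.getElementsByTagNameNS("http://linkToNamespace", "XJDF").item(0);
    }

    public static void main(String[] args) throws Exception {
        Element xjdf = sampleXjdf();
        System.out.println(new String(withTransformer(xjdf), StandardCharsets.UTF_8));
        System.out.println(new String(withLsSerializer(xjdf), StandardCharsets.UTF_8));
    }
}
```

Both routes write into a ByteArrayOutputStream, so the result can be wrapped in a ByteArrayInputStream for whatever consumes it next.
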