Java: extract a node and all of its content from namespaced XML
Given the following XML file with namespaces:
<ptk:PrintTalk xmlns:ptk="http://linkToNameSpace" xmlns:xjdf="http://linkToNamespace">
  <ptk:Request>
    <ptk:PurchaseOrder Currency="EUR">
      <xjdf:XJDF name="someName" version="2.0">
        <xjdf:ProductList>
          <xjdf:Product>
            ...
          </xjdf:Product>
          <xjdf:OtherProduct>
            ...
          </xjdf:OtherProduct>
          and many other products
        </xjdf:ProductList>
        <xjdf:ParameterSet>
          <xjdf:Parameter>
            ...
          </xjdf:Parameter> and so on until
      </xjdf:XJDF>
    </ptk:PurchaseOrder>
  </ptk:Request>
</ptk:PrintTalk>
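The XPath expressions I tried did not give me the result I wanted. I am using IntelliJ IDEA's built-in XPath expression evaluator, and the programming language is Java. There is no separate XPath library, only the JDK's javax.xml.* packages.
Update
With the expression used in the Java code below, I get each node back as a separate node without any of its children. For example,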
<xjdf:ProductList>
  <xjdf:Product>
    ...
  </xjdf:Product>
</xjdf:ProductList> (here the Product tag is a child of the ProductList tag)
...
results in
<xjdf:ProductList>
<xjdf:Product>
The Java code I use to perform this operation:
@Override
public XJDF readFrom(
        final Class<XJDF> type, final Type genericType, final Annotation[] annotations, final MediaType mediaType,
        final MultivaluedMap<String, String> multivaluedMap, final InputStream inputStream
) throws IOException {
    try {
        DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
        Document documentPtk = documentBuilder.parse(new InputSource(inputStream));
        XPathFactory xPathFactory = XPathFactory.newInstance();
        XPath xPath = xPathFactory.newXPath();
        XPathExpression xPathExpression = xPath.compile("//ptk:PurchaseOrder//*");
        Document documentXjdf = (Document) xPathExpression.evaluate(documentPtk, XPathConstants.NODE);
    } catch (Exception e) {
        throw new WebApplicationException("PrintTalk document could not be deserialized.", e);
    }
}
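There are three main points to note here:
- By default, DocumentBuilderFactory is not namespace-aware; you must call setNamespaceAware(true) on the DocumentBuilderFactory before creating the DocumentBuilder.
- XPath does not use the namespace prefix mappings from the XML document; it uses its own NamespaceContext.
- What this query returns will not be a Document node but an Element.
The standard Java library does not provide a default NamespaceContext implementation, so you have to use a third-party implementation (I usually use SimpleNamespaceContext) or write your own implementation of the interface.
Here is an example using SimpleNamespaceContext: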
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
documentBuilderFactory.setNamespaceAware(true);
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document documentPtk = documentBuilder.parse(new InputSource(inputStream));
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
SimpleNamespaceContext nsCtx = new SimpleNamespaceContext();
nsCtx.bindNamespaceUri("p", "http://linkToNameSpace");
xPath.setNamespaceContext(nsCtx);
XPathExpression xPathExpression = xPath.compile("/p:PrintTalk/p:Request/p:PurchaseOrder/*");
Element documentXjdf = (Element) xPathExpression.evaluate(documentPtk, XPathConstants.NODE);
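For the "write your own implementation of the interface" route, a minimal sketch could look like the following. It relies only on the JDK. The class names SinglePrefixContext and XjdfExtractor and the method extractXjdf are hypothetical names chosen for illustration, and the context deliberately handles a single prefix binding only (it does not treat the reserved xml and xmlns prefixes specially, which a complete implementation should).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical NamespaceContext that knows exactly one prefix-to-URI binding. */
class SinglePrefixContext implements NamespaceContext {
    private final String prefix;
    private final String uri;

    SinglePrefixContext(String prefix, String uri) {
        this.prefix = prefix;
        this.uri = uri;
    }

    @Override
    public String getNamespaceURI(String p) {
        if (p == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        // Unbound prefixes map to the empty namespace URI, as the interface requires.
        return prefix.equals(p) ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceUri) {
        if (namespaceUri == null) {
            throw new IllegalArgumentException("namespace URI must not be null");
        }
        return uri.equals(namespaceUri) ? prefix : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceUri) {
        String p = getPrefix(namespaceUri);
        return p == null
                ? Collections.<String>emptyIterator()
                : Collections.singletonList(p).iterator();
    }
}

/** Hypothetical helper: selects the first element child of PurchaseOrder. */
class XjdfExtractor {
    static Element extractXjdf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // without this the XPath matches nothing
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            // The prefix "p" is local to the XPath; only the URI has to match the document.
            xPath.setNamespaceContext(new SinglePrefixContext("p", "http://linkToNameSpace"));
            return (Element) xPath.evaluate(
                    "/p:PrintTalk/p:Request/p:PurchaseOrder/*", doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("Could not extract the element", e);
        }
    }
}
```

The returned Element still carries its whole subtree, so child nodes such as the product list remain reachable from it.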
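What tool/language/library are you using to execute these XPath expressions? What output do you get? Nothing at all, plain text without XML tags, or something else? Keep in mind that XPath is a language for selecting nodes from an XML file; what you do with those nodes once they are selected (print their string value, serialize them as XML, and so on) is a separate question.

I updated my question to provide more information. What I need to do is extract the required content from a W3C Document and create a new W3C Document from the extracted content in Java.

Please post your Java code.

It solves many problems out of the box (related to namespaces, type conversion, object mapping, IO). Using it can save a lot of code.

Thank you very much for your answer, Ian! Thanks for pointing out that a namespace context has to be implemented, I did not know that before! Since the returned node will be an Element, how do I convert it to a byte array? IMO casting it will not work, because the DOMSource constructor takes a Node as a parameter.

@ArthurEirich You would have to use a no-op Transformer to serialize it from a DOMSource to a StreamResult, or use the standard DOM LSSerializer mechanism. But are you sure this is necessary? What do you need to do with the XML next? If you are using a tool such as JAXB or XStream to turn the XML into objects, you should be able to unmarshal directly from the DOM.

Then I need to create another Java class from this byte array.

@ArthurEirich In that case yes, either the Transformer or the LSSerializer approach will work, using a ByteArrayOutputStream as the target.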
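The Transformer approach from the last comments can be sketched as follows. This is an illustration under assumptions, not the answerer's code: the class name NodeSerializer and the method toBytes are hypothetical, and the choices of UTF-8 output and omitting the XML declaration are mine.

```java
import java.io.ByteArrayOutputStream;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Node;

/** Hypothetical helper that serializes a DOM node (and its subtree) to bytes. */
class NodeSerializer {
    static byte[] toBytes(Node node) {
        try {
            // newTransformer() without a stylesheet yields an identity ("no-op") transform.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            // An Element is a Node, so it can be passed to DOMSource directly.
            identity.transform(new DOMSource(node), new StreamResult(out));
            return out.toByteArray();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not serialize node", e);
        }
    }
}
```

Because Element extends Node, the Element returned by the XPath query can be handed to toBytes without any cast.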
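The goal stated in the comments, building a new W3C Document from the extracted content, does not strictly need a byte array in between. One possible sketch uses the standard Document.importNode method; the class name SubtreeCopier and the method toNewDocument are hypothetical, and this is only one way to do it.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

/** Hypothetical helper that copies an element subtree into a fresh Document. */
class SubtreeCopier {
    static Document toNewDocument(Element element) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document target = factory.newDocumentBuilder().newDocument();

            // importNode with deep=true copies the element and all descendants;
            // the source document is left unchanged.
            Node copy = target.importNode(element, true);
            target.appendChild(copy); // the copy becomes the new document element
            return target;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create an empty document", e);
        }
    }
}
```

The deep copy keeps the original tree intact, which matters if the source document is still needed afterwards.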