Java: Split XML into smaller chunks by a grandchild's ID
I have an XML file that should be split into smaller chunks by the unique BookId node. Basically, I need to filter each book out into a separate XML file that has the same structure as the initial XML.

The purpose is a requirement to validate each smaller XML against an XSD, to determine which Book/PendingBook is invalid.

Note that the Books node can contain both Book and PendingBook nodes.

Initial XML:
<Main xmlns="http://some/url/name">
<Books>
<Book>
<IdentifyingInformation>
<ID>
<Year>2021</Year>
<BookId>001</BookId>
<BookDateTime>2021-05-10T12:35:00</BookDateTime>
</ID>
</IdentifyingInformation>
</Book>
<Book>
<IdentifyingInformation>
<ID>
<Year>2020</Year>
<BookId>002</BookId>
<BookDateTime>2021-05-10T12:35:00</BookDateTime>
</ID>
</IdentifyingInformation>
</Book>
<PendingBook>
<IdentifyingInformation>
<ID>
<Year>2020</Year>
<BookId>003</BookId>
<BookDateTime>2021-05-10T12:35:00</BookDateTime>
</ID>
</IdentifyingInformation>
</PendingBook>
<OtherInfo>...</OtherInfo>
</Books>
</Main>
The result should look like the following XML files:
Book_001.xml (BookId=001):
<Main xmlns="http://some/url/name">
<Books>
<Book>
<IdentifyingInformation>
<ID>
<Year>2021</Year>
<BookId>001</BookId>
<BookDateTime>2021-05-10T12:35:00</BookDateTime>
</ID>
</IdentifyingInformation>
</Book>
<OtherInfo>...</OtherInfo>
</Books>
</Main>
Book_002.xml(BookId=002):
<Main xmlns="http://some/url/name">
<Books>
<Book>
<IdentifyingInformation>
<ID>
<Year>2020</Year>
<BookId>002</BookId>
<BookDateTime>2021-05-10T12:35:00</BookDateTime>
</ID>
</IdentifyingInformation>
</Book>
<OtherInfo>...</OtherInfo>
</Books>
</Main>
PendingBook_003.xml(BookId=003):
<Main xmlns="http://some/url/name">
<Books>
<PendingBook>
<IdentifyingInformation>
<ID>
<Year>2021</Year>
<BookId>003</BookId>
<BookDateTime>2021-05-10T12:35:00</BookDateTime>
</ID>
</IdentifyingInformation>
</PendingBook>
<OtherInfo>...</OtherInfo>
</Books>
</Main>
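So far I have only managed to extract each ID node into a smaller XML, and I create the root element manually.
Ideally, I would like to copy all the elements of the initial XML and keep a single Book/PendingBook node inside the Books node.
My Java example: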
package com.main;

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ExtractXmls {
    /**
     * @param args
     */
    public static void main(String[] args) throws Exception
    {
        String inputFile = "C:/pathToXML/Main.xml";
        File xmlFile = new File(inputFile);
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        Document doc = dBuilder.parse(xmlFile);
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // never forget this!
        XPathFactory xfactory = XPathFactory.newInstance();
        XPath xpath = xfactory.newXPath();
        XPathExpression allBookIdsExpression = xpath.compile("//Books/*/IdentifyingInformation/ID/BookId/text()");
        NodeList bookIdNodes = (NodeList) allBookIdsExpression.evaluate(doc, XPathConstants.NODESET);

        //Save all the products
        List<String> bookIds = new ArrayList<>();
        for (int i = 0; i < bookIdNodes.getLength(); ++i) {
            Node bookId = bookIdNodes.item(i);
            System.out.println(bookId.getTextContent());
            bookIds.add(bookId.getTextContent());
        }

        //Now we create and save split XMLs
        for (String bookId : bookIds)
        {
            //With such query I can find node based on bookId
            String xpathQuery = "//ID[BookId='" + bookId + "']";
            xpath = xfactory.newXPath();
            XPathExpression query = xpath.compile(xpathQuery);
            NodeList bookIdNodesFiltered = (NodeList) query.evaluate(doc, XPathConstants.NODESET);
            System.out.println("Found " + bookIdNodesFiltered.getLength() + " bookId(s) for bookId " + bookId);

            //We store the new XML file in bookId.xml e.g. 001.xml
            Document aamcIdXml = dBuilder.newDocument();
            Element root = aamcIdXml.createElement("Main"); //Here I'm recreating root element (don't know if I can avoid it and copy somehow structure of initial xml)
            aamcIdXml.appendChild(root);
            for (int i = 0; i < bookIdNodesFiltered.getLength(); i++) {
                Node node = bookIdNodesFiltered.item(i);
                Node copyNode = aamcIdXml.importNode(node, true);
                root.appendChild(copyNode);
            }

            //At the end, we save the file XML on disk
            TransformerFactory transformerFactory = TransformerFactory.newInstance();
            Transformer transformer = transformerFactory.newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            DOMSource source = new DOMSource(aamcIdXml);
            StreamResult result = new StreamResult(new File("C:/pathToXML/" + bookId.trim() + ".xml"));
            transformer.transform(source, result);
            System.out.println("Done for " + bookId);
        }
    }
}
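You almost have it working. In the loop that iterates over the book IDs, you can change the XPath so that it selects the Book or PendingBook element, and then work with that element. Also, in addition to Main, you need to create a Books element and append the Book or PendingBook to the newly created Books element.

The XPath is: //ancestor::*[IdentifyingInformation/ID/BookId=bookId]

It selects the element whose IdentifyingInformation/ID/BookId matches the ID of the current iteration, i.e. the Book or PendingBook element.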
//Now we create and save split XMLs
for (String bookId : bookIds)
{
    //With such query I can find the Book/PendingBook element based on bookId.
    //The value is quoted so that it is compared as a string, not as a number.
    String xpathQuery = "//ancestor::*[IdentifyingInformation/ID/BookId='" + bookId + "']";
    xpath = xfactory.newXPath();
    XPathExpression query = xpath.compile(xpathQuery);
    NodeList bookIdNodesFiltered = (NodeList) query.evaluate(doc, XPathConstants.NODESET);
    System.out.println("Found " + bookIdNodesFiltered.getLength() + " bookId(s) for bookId " + bookId);

    //We store the new XML file in <elementName>_<bookId>.xml e.g. Book_001.xml
    Document aamcIdXml = dBuilder.newDocument();
    Element root = aamcIdXml.createElement("Main");
    Element booksNode = aamcIdXml.createElement("Books");
    root.appendChild(booksNode);
    //Here I'm recreating root element (don't know if I can avoid it and copy somehow structure of initial xml)
    aamcIdXml.appendChild(root);

    String bookName = "";
    for (int i = 0; i < bookIdNodesFiltered.getLength(); i++) {
        Node node = bookIdNodesFiltered.item(i);
        Node copyNode = aamcIdXml.importNode(node, true);
        bookName = copyNode.getNodeName();
        booksNode.appendChild(copyNode);
    }

    //At the end, we save the file XML on disk
    TransformerFactory transformerFactory = TransformerFactory.newInstance();
    Transformer transformer = transformerFactory.newTransformer();
    transformer.setOutputProperty(OutputKeys.INDENT, "yes");
    DOMSource source = new DOMSource(aamcIdXml);
    StreamResult result = new StreamResult(new File(bookName + "_" + bookId.trim() + ".xml"));
    transformer.transform(source, result);
    System.out.println("Done for " + bookId);
}
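
If the goal is to keep the structure of the initial XML (the Main namespace, the Books wrapper and siblings such as OtherInfo) without recreating elements by hand, one alternative is to copy the whole document for each book and remove the other Book/PendingBook elements from the copy. The following is only a minimal sketch of that idea, not part of the code above: the class name SplitByBookId and the method split are made up for illustration, it works on strings instead of files, and it assumes that every child of Books that contains a BookId is a book entry.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
public class SplitByBookId {

    /** Returns a map from "<elementName>_<bookId>" to the serialized per-book XML. */
    public static Map<String, String> split(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // must be set before the builder is created
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document source = builder.parse(new InputSource(new StringReader(xml)));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        Map<String, String> result = new LinkedHashMap<>();
        int count = bookElements(source).size();
        for (int keep = 0; keep < count; keep++) {
            // Deep-copy the root so Main, Books, OtherInfo and the namespace are preserved.
            Document copy = builder.newDocument();
            copy.appendChild(copy.importNode(source.getDocumentElement(), true));

            // Remove every book entry except the one at index "keep".
            List<Element> books = bookElements(copy);
            Element kept = books.get(keep);
            for (Element book : books) {
                if (book != kept) {
                    book.getParentNode().removeChild(book);
                }
            }

            String bookId = kept.getElementsByTagNameNS("*", "BookId")
                    .item(0).getTextContent().trim();
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(copy), new StreamResult(out));
            result.put(kept.getLocalName() + "_" + bookId, out.toString());
        }
        return result;
    }

    /** Collects the children of Books that contain a BookId (Book and PendingBook here). */
    private static List<Element> bookElements(Document doc) {
        List<Element> found = new ArrayList<>();
        Node books = doc.getElementsByTagNameNS("*", "Books").item(0);
        NodeList children = books.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element
                    && ((Element) child).getElementsByTagNameNS("*", "BookId").getLength() > 0) {
                found.add((Element) child);
            }
        }
        return found;
    }
}
```

Each map key can be used as the file name (for example Book_001 or PendingBook_003), and the value written to disk or passed on to validation. Because the parser is namespace-aware and the root is imported rather than recreated, the xmlns declaration on Main carries over to every chunk.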
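
Since the stated purpose of the split is to find out which Book/PendingBook fails XSD validation, each chunk can then be checked with the standard javax.xml.validation API. This is a rough sketch under assumptions: the class name ChunkValidator and the method firstError are invented for illustration, and the schema is passed in as a string because the question does not show the real XSD or its location.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper class, named for illustration only.
public class ChunkValidator {
    private final Schema schema;

    /** Compiles the schema once so it can be reused for every chunk. */
    public ChunkValidator(String xsd) throws SAXException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        this.schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
    }

    /** Returns null when the chunk is valid, otherwise the first validation error message. */
    public String firstError(String chunkXml) throws IOException {
        Validator validator = schema.newValidator(); // validators are not thread-safe, so create one per call
        try {
            validator.validate(new StreamSource(new StringReader(chunkXml)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        }
    }
}
```

Calling firstError on each per-book XML and logging the non-null results together with the book ID would show which entries are invalid. Note that the chunks have to keep the namespace of the initial XML, otherwise validation against a schema with that target namespace is likely to fail at the root element.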