Java StAX parsing: Transformer.transform moves the cursor automatically, which is not always desirable
I am using XMLStreamReader to split an XML file. It mostly works, but the result is still not what I expect. My goal is to split out every "nextTag" node from the input file:
<?xml version="1.0" encoding="UTF-8"?>
<firstTag>
<nextTag>1</nextTag>
<nextTag>2</nextTag>
</firstTag>
In principle this is simple. However, my input file is formatted as a single line:
<?xml version="1.0" encoding="UTF-8"?><firstTag><nextTag>1</nextTag><nextTag>2</nextTag></firstTag>
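With this input the second nextTag is never written. This is because, after the transform method has run, the cursor has automatically moved forward to the next event. In my code I have this fragment: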
while (streamReader.hasNext()) {
    streamReader.next();
    ...
    t.transform(new StAXSource(streamReader), new StreamResult(writer));
    ...
}
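After the first transformation, streamReader has effectively advanced twice: once inside the transform method and once through the next() call in the while loop.
So for this single-line XML the cursor never lands on the second opening tag.
If the input XML is pretty-printed, on the other hand, the cursor does reach the second element, because there is a whitespace event after the first end tag.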
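The difference between the two layouts can be made visible by listing the events the reader reports. The following is only a minimal sketch: the class name EventTrace and the labels it prints are made up for illustration, and it assumes the small sample documents from this question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of the original code.
class EventTrace {

    /** Returns a readable label for each element and text event in the document. */
    static List<String> events(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<String> labels = new ArrayList<>();
        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    labels.add("START " + reader.getLocalName());
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    labels.add("END " + reader.getLocalName());
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.SPACE:
                    // Whitespace between tags is reported as its own event.
                    labels.add(reader.isWhiteSpace() ? "WHITESPACE" : "TEXT");
                    break;
                default:
                    break; // END_DOCUMENT and others are not interesting here
            }
        }
        return labels;
    }

    public static void main(String[] args) throws XMLStreamException {
        String compact = "<firstTag><nextTag>1</nextTag><nextTag>2</nextTag></firstTag>";
        String pretty = "<firstTag>\n<nextTag>1</nextTag>\n<nextTag>2</nextTag>\n</firstTag>";
        System.out.println(events(compact));
        System.out.println(events(pretty));
    }
}
```

In the single-line document the event right after the first closing nextTag is the next opening nextTag, so an extra next() skips it. In the pretty-printed document a whitespace event sits in between and absorbs the extra step.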
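Unfortunately, I could not find any setting that stops the transformer from jumping to the next event after the transform method has run. This is very frustrating.
Does anyone know how I can deal with this? Any suggestions are very welcome too. Thank you very much.
Regards,
Ratna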
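P.S. I could of course write a workaround (pretty-printing the XML document before transforming it), but that would mean the input XML is modified beforehand, which is not allowed.

As you explained, the transform step continues on to the next start element when the element nodes follow each other directly. To get this output for both input layouts:
<?xml version="1.0" encoding="UTF-8"?><nextTag>1</nextTag>
<?xml version="1.0" encoding="UTF-8"?><nextTag>2</nextTag>
you can rewrite your code with nested while loops, like this: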
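while (reader.next() != XMLStreamConstants.END_DOCUMENT) {
    // Stay in the inner loop while the cursor is already on a nextTag start tag,
    // so the outer next() cannot skip an element that directly follows another.
    while (reader.getEventType() == XMLStreamConstants.START_ELEMENT
            && reader.getLocalName().equals("nextTag")) {
        StringWriter writer = new StringWriter();
        // Transforms the current element to a String; afterwards the cursor is on the
        // event following its end tag (the next START_ELEMENT if the elements are adjacent).
        t.transform(new StAXSource(reader), new StreamResult(writer));
        System.out.println(writer.toString());
    }
}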
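For trying the nested loop on both input layouts, here is a self-contained sketch. The class name NextTagSplitter and the split method are made up for illustration. It assumes the JDK's built-in transformer, which leaves the cursor on the event after the end tag as described in the question, and it omits the XML declaration so that only the element itself is collected.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;

// Hypothetical wrapper around the nested-loop approach.
class NextTagSplitter {

    /** Collects every nextTag element of the document as its own XML string. */
    static List<String> split(String xml) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        Transformer t = TransformerFactory.newInstance().newTransformer();
        // Assumption for this sketch: drop the declaration to keep the fragments short.
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        List<String> parts = new ArrayList<>();
        while (reader.next() != XMLStreamConstants.END_DOCUMENT) {
            // Re-check the current event instead of calling next() again:
            // transform has already advanced the cursor past the end tag.
            while (reader.getEventType() == XMLStreamConstants.START_ELEMENT
                    && "nextTag".equals(reader.getLocalName())) {
                StringWriter writer = new StringWriter();
                t.transform(new StAXSource(reader), new StreamResult(writer));
                parts.add(writer.toString());
            }
        }
        return parts;
    }

    public static void main(String[] args) throws Exception {
        String compact = "<firstTag><nextTag>1</nextTag><nextTag>2</nextTag></firstTag>";
        String pretty = "<firstTag>\n<nextTag>1</nextTag>\n<nextTag>2</nextTag>\n</firstTag>";
        split(compact).forEach(System.out::println);
        split(pretty).forEach(System.out::println);
    }
}
```

Both calls should yield two fragments. Note that the loop assumes nextTag is never the root element, because in that case the outer next() would be called after the end of the document.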
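If your XML file fits in memory, you can try it with the help of the JOOX library, which is imported in Gradle like this:
compile 'org.jooq:joox:1.3.0'
And a main class like: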
import java.io.File;
import java.io.IOException;
import org.joox.JOOX;
import org.joox.Match;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import static org.joox.JOOX.$;

public class Main {
    public static void main(String[] args)
            throws IOException, SAXException, TransformerException {
        DocumentBuilder builder = JOOX.builder();
        Document document = builder.parse(new File(args[0]));
        Transformer transformer =
                TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty("omit-xml-declaration", "no");
        final Match $m = $(document);
        $m.find("nextTag").forEach(tag -> {
            try {
                transformer.transform(
                        new DOMSource(tag),
                        new StreamResult(System.out));
                System.out.println();
            }
            catch (TransformerException e) {
                System.exit(1);
            }
        });
    }
}
It produces:
<?xml version="1.0" encoding="UTF-8"?><nextTag>1</nextTag>
<?xml version="1.0" encoding="UTF-8"?><nextTag>2</nextTag>
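If adding a dependency is not an option, the same in-memory idea can be sketched with the JDK's own DOM API. This variant is not from the answer above: the class name DomSplitter is made up, getElementsByTagName stands in for the JOOX find call, and the XML declaration is omitted here to keep the fragments short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical JDK-only variant of the in-memory approach.
class DomSplitter {

    /** Parses the whole document into memory and serializes each nextTag element. */
    static List<String> split(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Assumption for this sketch: leave out the declaration.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        NodeList tags = document.getElementsByTagName("nextTag");
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < tags.getLength(); i++) {
            StringWriter writer = new StringWriter();
            // A DOM node has no cursor, so whitespace between tags does not matter.
            transformer.transform(new DOMSource(tags.item(i)), new StreamResult(writer));
            parts.add(writer.toString());
        }
        return parts;
    }

    public static void main(String[] args) throws Exception {
        String compact = "<firstTag><nextTag>1</nextTag><nextTag>2</nextTag></firstTag>";
        split(compact).forEach(System.out::println);
    }
}
```

As with JOOX, this only makes sense when the file fits in memory; for large inputs the streaming loop remains the better fit.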
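Can you try getting rid of the BufferedReader and the InputStreamReaders? In some cases they corrupt the encoding and may mangle the newlines.
Hello, I just tested that, but it still did not change anything :-(
This is not enough! After hours of trying other solutions, this is the first one that handles XML both with and without whitespace between the tags. Thank you very much!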