What approaches could be used to return valid and invalid XML data from a file in Java?

java xml xslt jaxb xquery

I have the following data, which is supposed to be XML:

<?xml version="1.0" encoding="UTF-8"?>
<Product>
    <id>1</id>
    <description>A new product</description>
    <price>123.45</price>
</Product>

<Product>
    <id>1</id>
    <description>A new product</description>
    <price>123.45</price>
</Product>

<ProductTTTTT>
    <id>1</id>
    <description>A new product</description>
    <price>123.45</price>
</Product>

<Product>
    <id>1</id>
    <description>A new product</description>
    <price>123.45</price>
</ProductAAAAAA>

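Invalid nodes:

<ProductTTTTT>..</Product>

and

<Product>..</ProductAAAAAA>

So I have been thinking about how to achieve this using Java (not the web).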

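If I am not mistaken, validating it against an XSD would invalidate the whole file, so that is not an option.
Using the default JAXB parser (unmarshaller) leads to the same result as the item above, because internally it creates an XSD of my entity.
Using XPath just (as far as I know) returns the whole file; I did not find a way to get something like "get !valid" (just to explain...).
Using XQuery (maybe?)... By the way, how can XQuery be used together with JAXB?
XSL(T) would give the same result as XPath, since it uses XPath to select content.

So... which approach could I use to achieve this? (If possible, please provide links or code.)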

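First, you are confusing valid with well-formed. You say you want to find the invalid elements, but your examples are not merely invalid, they are ill-formed. That means an XML parser will do nothing with them except throw an error message at you. You cannot use JAXB, XPath, XQuery, XSLT, or anything else to process content that is not XML.

You say "unfortunately I don't have access to the system that sends this xml format". I don't know why you call it an XML format: it isn't. I also don't understand why you (and many others on StackOverflow) are prepared to spend time digging through garbage like this rather than telling the sender to get their act together. If you were served a salad with maggots in it, would you try to pick them out, or would you send it back for a replacement? You should take a zero-tolerance approach to bad data; it is the only way the sender will learn to improve quality.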
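
To make the well-formedness point concrete, here is a minimal sketch using the JDK's built-in DOM parser. The class name `WellFormedCheck` and the helper `isWellFormed` are made up for this illustration; the sample strings are shortened versions of the fragments in the question. It only shows that the parser rejects mismatched tags and multiple root elements outright, before any schema validation could take place.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original thread.
class WellFormedCheck {

    // Returns true if the text parses as a well-formed XML document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors but does not print them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser gave up: not well-formed
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A single, properly closed root element.
        System.out.println(isWellFormed("<Product><id>1</id></Product>"));
        // Start and end tag names differ.
        System.out.println(isWellFormed("<ProductTTTTT><id>1</id></Product>"));
        // Two root elements in one document.
        System.out.println(isWellFormed("<Product/><Product/>"));
    }
}
```

Only the first call should report `true`. The other two inputs fail at the parsing stage, which is why tools layered on top of a parser (JAXB, XPath, XQuery, XSLT) never get to see them.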

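If the file contains lines with start and end tags that begin with "Product", you could:

Use a file Scanner to split this document into individual pieces whenever a line starts with </Product

By definition, an XML file can have only one root element.

As @MickMnemonic said, cleaning up the XML so that it has only one root element would solve some (all?) of your problems.

I understand and agree with both of you, but unfortunately I don't have access to the system that sends this xml format. Are there any other options?

Why do you call it an XML format when you have been told that it is not one?

It is just a way of referring to the "thing"... I fully understand and agree with you (I tried that before coming here): they should send me good data, but unfortunately I don't have that option. So let's face reality: either I do what my boss asks or I get fired, it is as simple as that. It is a bit frustrating... Anyway, sorry, man, but I can't accept this as the answer to the question. Your answer makes a very interesting and important point that every developer should follow, so I marked it as useful. Thanks.

Engineers who do what their boss (or client) asks without question end up responsible for disasters like Grenfell.

Hi @Mads Hansen, thanks for helping me here... With your answer (and the others') I can understand that there is no right way to do this, because it is not xml. Anyway, I was thinking about what you wrote. The only thing to note is that you are only validating [...]

You could also adjust that strategy to normalize those closing elements, so if it starts with [...] make sure [...]

Not only you, but @Michael Kay helped me too; even with that aggressiveness, I understood his intent.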
<Product>
   ...
</Product>

package com.stackoverflow.questions.q52012383; // a package name segment cannot start with a digit

import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.StringReader;

import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class FileSplitter {

    public static void parseFile(File file, String elementName) 
      throws ParserConfigurationException, IOException {

        List<Document> good = new ArrayList<>();
        List<String> bad = new ArrayList<>();

        String startTag = "<" + elementName;
        String endTag = "</" + elementName;
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder;
        StringBuilder buffer = new StringBuilder();
        String line;
        boolean append = false;

        try (Scanner scanner = new Scanner(file)) {
            while (scanner.hasNextLine()) {
                line = scanner.nextLine();

                if (line.startsWith(startTag)) {
                    append = true; //start accumulating content
                } else if (line.startsWith(endTag)) {
                    append = false;
                    buffer.append(line); 
                    //instead of the line above, you could hard-code the ending tag to compensate for bad data:
                    // buffer.append(endTag + ">");

                    try { // to parse as XML
                        builder = factory.newDocumentBuilder();
                        Document document = builder.parse(new InputSource(new StringReader(buffer.toString())));
                        good.add(document); // parsed successfully, add it to the good list
                    } catch (SAXException ex) {
                        bad.add(buffer.toString()); // something is wrong, not well-formed XML
                    } finally {
                        buffer.setLength(0); // always reset the buffer, so a bad item does not leak into the next one
                    }
                }

                if (append) { // accumulate content
                    buffer.append(line);
                }
            }
            System.out.println("Good items: " + good.size() + " Bad items: " + bad.size());
            //do stuff with the good/bad results...
        }
    }

    public static void main(String[] args) 
      throws ParserConfigurationException, IOException {
        File file = new File("/tmp/test.xml");
        parseFile(file, "Product");
    }

}
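
The code comment above mentions hard-coding the end tag to compensate for bad data. A minimal sketch of that variant follows, working on an in-memory string rather than a file. The class `LenientSplitter` and its `split` and `parses` methods are assumptions made for this illustration, not part of the original answer. It also assumes, as the original does, that every start and end tag sits on its own line.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical variant of the splitter that rewrites every end tag.
class LenientSplitter {

    final List<String> good = new ArrayList<>();
    final List<String> bad = new ArrayList<>();

    static LenientSplitter split(String text, String elementName) {
        LenientSplitter result = new LenientSplitter();
        String startTag = "<" + elementName;
        String endTag = "</" + elementName;
        StringBuilder buffer = new StringBuilder();
        boolean inside = false;

        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.startsWith(endTag)) {
                // Replace whatever end tag was sent with the expected one.
                buffer.append(endTag).append('>');
                String fragment = buffer.toString();
                (parses(fragment) ? result.good : result.bad).add(fragment);
                buffer.setLength(0); // start the next fragment from scratch
                inside = false;
            } else if (trimmed.startsWith(startTag) || inside) {
                inside = true;
                buffer.append(line).append('\n');
            }
            // Anything outside a fragment (XML declaration, blank lines) is skipped.
        }
        return result;
    }

    // True if the fragment parses as a well-formed XML document.
    static boolean parses(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler()); // keep stderr quiet
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String text = String.join("\n",
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",
                "<Product>", "    <id>1</id>", "</Product>",
                "<ProductTTTTT>", "    <id>1</id>", "</Product>",
                "<Product>", "    <id>1</id>", "</ProductAAAAAA>");
        LenientSplitter result = split(text, "Product");
        System.out.println("good=" + result.good.size() + " bad=" + result.bad.size());
    }
}
```

With this sample the fragment ending in `</ProductAAAAAA>` is repaired and counted as good, while the one opened with `<ProductTTTTT>` still fails. Note the trade-off: rewriting end tags means guessing what the sender intended, and the `startsWith` test also matches any other element whose name merely begins with "Product".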