Java: How to locate the error after the XML parser throws MalformedByteSequenceException

Tags: java, xml, utf-8, xml-parsing


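I get a MalformedByteSequenceException when parsing an XML file.

My application lets external customers submit XML files. They may use any supported encoding, but most of them, following the sample they were given, declare …encoding="UTF-8"… at the top of the file. Some of them then actually encode the data as windows-1252, which causes a MalformedByteSequenceException on non-ASCII characters.

I want the XML parser to identify the file encoding and decode the file, so I don't want a preliminary step that tests the encoding or converts the InputStream to a Reader. I feel the XML parser should handle that step.

Although I have registered a ValidationEventHandler, it is not called when the MalformedByteSequenceException occurs.

Is there a way to make the Unmarshaller report where in the file the error occurred?

Here is my Java code: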

InputStream input = ...
JAXBContext jc = JAXBContext.newInstance(MyClass.class.getPackage().getName());
Unmarshaller unmarshaller = jc.createUnmarshaller();
// Validate against my.xsd while unmarshalling
SchemaFactory sf = SchemaFactory.newInstance(javax.xml.XMLConstants.W3C_XML_SCHEMA_NS_URI);
Source source = new StreamSource(getClass().getResource("my.xsd").toExternalForm());
Schema schema = sf.newSchema(source);
unmarshaller.setSchema(schema);
// Register the handler that should receive validation events
ValidationEventHandler handler = new MyValidationEventHandler();
unmarshaller.setEventHandler(handler);
MyClass myClass = (MyClass) unmarshaller.unmarshal(input);

And the resulting stack trace:

javax.xml.bind.UnmarshalException
 - with linked exception:
[com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 2 of 4-byte UTF-8 sequence.]
        at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:202)
        at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:173)
        at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:137)
        at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(AbstractUnmarshallerImpl.java:184)
        at (my code)
Caused by: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 2 of 4-byte UTF-8 sequence.
        at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684)
        at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:470)
        at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742)
        at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanContent(XMLEntityScanner.java:916)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2788)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
        at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
        at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
        at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
        at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
        at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:200)
        ... 51 more

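I have not tested this, but I would:

  • Use a SAXSource (javax.xml.transform.sax.SAXSource) instead of a StreamSource
  • Attach my own org.xml.sax.ErrorHandler implementation to the SAXSource (SAXSource.getXMLReader().setErrorHandler)

That way I would be notified with a SAXParseException, which carries the location of the parse error.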
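
A minimal sketch of the ErrorHandler idea, using a plain SAX parse rather than JAXB. The class names LocatingErrorHandler and SaxLocationDemo are hypothetical, and the sample document in main is made up for illustration. Two things are assumptions here and not verified: whether a given JDK's parser routes a malformed byte sequence through fatalError (the stack trace above suggests JDK 1.6 lets it escape as an IOException), and whether a JAXB runtime keeps a caller-supplied ErrorHandler when it is handed the same XMLReader through a SAXSource.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler that records where the parser reported each problem. */
class LocatingErrorHandler implements ErrorHandler {
    final List<String> problems = new ArrayList<>();

    private void record(String level, SAXParseException e) {
        problems.add(level + " at line " + e.getLineNumber()
                + ", column " + e.getColumnNumber() + ": " + e.getMessage());
    }

    @Override public void warning(SAXParseException e) { record("warning", e); }

    @Override public void error(SAXParseException e) { record("error", e); }

    @Override public void fatalError(SAXParseException e) throws SAXException {
        record("fatal", e);
        throw e; // parsing cannot usefully continue after a fatal error
    }
}

class SaxLocationDemo {
    /** Parses the stream and returns the problems the parser reported, with positions. */
    static List<String> parse(InputStream input)
            throws ParserConfigurationException, SAXException, IOException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        LocatingErrorHandler handler = new LocatingErrorHandler();
        reader.setErrorHandler(handler);
        reader.setContentHandler(new DefaultHandler()); // content is ignored in this sketch

        try {
            // Passing the raw stream lets the parser detect the encoding itself.
            reader.parse(new InputSource(input));
        } catch (SAXParseException e) {
            // Already recorded by the handler; nothing more to do here.
        }
        return handler.problems;
    }

    public static void main(String[] args) throws Exception {
        // Made-up, deliberately ill-formed document: <b> is never closed.
        byte[] xml = "<a>\n<b>\n</a>".getBytes(StandardCharsets.UTF_8);
        for (String problem : parse(new ByteArrayInputStream(xml))) {
            System.out.println(problem);
        }
    }
}
```

The handler rethrows on fatalError so the caller still sees the failure; the recorded strings are only a convenience for reporting the position back to whoever submitted the file.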

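+1: Good idea. But when I followed your suggestion, I got
SAXParseException: src-resolve: Cannot resolve the name 'blah:blah' to a(n) 'type definition' component.
Googling this suggests that I would need to use the Xerces library rather than the default Xerces implementation bundled with JVM 1.6. That would cause even more trouble, so I think I will have to live with this problem for now.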
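
If changing the parser setup is not an option, one possible fallback is to look for the bad bytes only after the unmarshal has already failed. The sketch below is not from the thread: it assumes the submitted file has been buffered into a byte array and that it declared UTF-8, and the class name Utf8Locator and its methods are hypothetical. It runs a strict java.nio UTF-8 decoder over the bytes and reports the offset and line of the first malformed sequence.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: finds the first byte sequence that is not valid UTF-8. */
class Utf8Locator {

    /** Returns the byte offset of the first malformed UTF-8 sequence, or -1 if there is none. */
    static int firstInvalidOffset(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(data);
        CharBuffer out = CharBuffer.allocate(1024); // decoded text is discarded
        while (true) {
            CoderResult result = decoder.decode(in, out, true);
            if (result.isError()) {
                return in.position(); // the erroneous bytes start at the buffer's position
            }
            if (result.isUnderflow()) {
                return -1; // all input consumed without an error
            }
            out.clear(); // overflow: make room and keep decoding
        }
    }

    /** Returns the 1-based line number containing the given byte offset. */
    static int lineOf(byte[] data, int offset) {
        int line = 1;
        for (int i = 0; i < offset && i < data.length; i++) {
            if (data[i] == '\n') {
                line++;
            }
        }
        return line;
    }

    public static void main(String[] args) {
        // Made-up sample: 0xE9 is 'é' in windows-1252 but an incomplete sequence in UTF-8.
        byte[] data = {'<', 'a', '>', '\n', 'c', 'a', 'f', (byte) 0xE9, '<', '/', 'a', '>'};
        int offset = firstInvalidOffset(data);
        System.out.println("offset " + offset + ", line " + lineOf(data, offset));
    }
}
```

This is a diagnostic for error reporting only; the parser still does the actual encoding detection and decoding, so it does not replace the step the question wants the parser to own.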