Groovy's XmlParser ignores CR/LF in CDATA


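I want to parse the XML logs generated by log4j. Each event in the XML has a throwable node (if there is one). Its (multi-line, tabbed) text is wrapped in a CDATA section.

Here is an excerpt from the whole file: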

<log4j:event logger="org.codehaus.groovy.grails.web.errors.GrailsExceptionResolver" timestamp="1330083921521" level="ERROR" thread="http-8080-1">
<log4j:message><![CDATA[Exception occurred when processing request: [GET] /test/log/show
Stacktrace follows:]]></log4j:message>
<log4j:throwable><![CDATA[org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
    at test.LogController$_closure2.doCall(LogController.groovy:21)
    at test.LogController$_closure2.doCall(LogController.groovy)
    at java.lang.Thread.run(Thread.java:662)
]]></log4j:throwable>
</log4j:event>
The "throwable" field contains all of the text, but without the CR/LF.

Does anyone know how to deal with this?

Thanks in advance...

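The XML standard requires whitespace to be normalized during parsing.

I'm not sure, but the parser may have a setting that overrides this behavior. Otherwise, you could preprocess the file, replacing the line endings in the CDATA sections with their XML entity equivalents, and then parse it.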
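
As a quick way to check what a parser actually does with line breaks inside CDATA, here is a minimal sketch in Java using the JDK's default DOM parser (not Groovy's XmlParser itself). The class name `CdataLineBreaks`, the helper `rootText`, and the sample strings are made up for illustration. As far as I understand the XML specification's end-of-line handling, a conforming parser should hand a CRLF over as a single LF rather than drop it, and references such as `&#10;` are not expanded inside a CDATA section, so the preprocessing idea above would likely leave the literal characters `&#10;` in the text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CdataLineBreaks {

    // Parse an XML string with the JDK's default DOM parser and return
    // the text content of the root element.
    static String rootText(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample: CRLF line endings and a tab inside CDATA.
        String crlf = "<throwable><![CDATA[Some exception\r\n\tat Foo.bar\r\n]]></throwable>";
        String text = rootText(crlf);
        // Report whether line feeds and carriage returns survived parsing.
        System.out.println("contains LF: " + text.contains("\n"));
        System.out.println("contains CR: " + text.contains("\r"));

        // Hypothetical sample: a character reference written inside CDATA.
        String ref = "<throwable><![CDATA[a&#10;b]]></throwable>";
        System.out.println(rootText(ref));
    }
}
```

If the line feeds are present in the parsed text, the place to look for the missing line breaks is whatever happens to the string after parsing.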


I hate to just throw code at you, but this seems to work as expected and returns the CRLFs:

def xml = '''<log>
            |  <log4j:event logger="org.codehaus.groovy.grails.web.errors.GrailsExceptionResolver" timestamp="1330083921521" level="ERROR" thread="http-8080-1">
            |    <log4j:message><![CDATA[Exception occurred when processing request: [GET] /test/log/show
            |Stacktrace follows:]]></log4j:message>
            |    <log4j:throwable><![CDATA[org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
            |    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
            |    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
            |    at test.LogController$_closure2.doCall(LogController.groovy:21)
            |    at test.LogController$_closure2.doCall(LogController.groovy)
            |    at java.lang.Thread.run(Thread.java:662)
            |]]></log4j:throwable>
            |  </log4j:event>
            |</log>'''.stripMargin()


class LogEntry {
  def level
  def message
  def thread
  def logger
  def timestamp
  def throwable

  String toString() {
    """EVENT:
      |  level   : $level
      |  message : $message
      |  thread  : $thread
      |  logger  : $logger
      |  ts      : $timestamp
      |  thrown  : $throwable""".stripMargin()
  }
}

def parser = new XmlParser(false, false).parseText( xml )
def entries = parser.'log4j:event'.collect { event ->
  new LogEntry().with {
    level     = event.@level
    message   = event.'log4j:message'.text()
    thread    = event.@thread
    logger    = event.@logger
    timestamp = new Date( event.@timestamp as long )
    throwable = event.'log4j:throwable'?.text() ?: ''
    it
  }
}

entries.each {
  println it
}
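The output has the CRLF characters in there, where they should be.

This is with Groovy 1.8.6, by the way... What version are you using? Can you upgrade and try again?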


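Comments:

Do you have a small XML example?

I edited the post to show a small example...

-1. The XML standard does not require normalizing whitespace, except in attributes.

Hm. Yes, at work I use 1.7.10 (on Grails). I tested it with 1.8.6 and it works as expected.

OK. It seems that somewhere between my controller and my view, I missed converting the tabs etc...

Output of the script in the answer above: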
EVENT:
  level   : ERROR
  message : Exception occurred when processing request: [GET] /test/log/show
Stacktrace follows:
  thread  : http-8080-1
  logger  : org.codehaus.groovy.grails.web.errors.GrailsExceptionResolver
  ts      : Fri Feb 24 11:45:21 GMT 2012
  thrown  : org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
    at test.LogController$_closure2.doCall(LogController.groovy:21)
    at test.LogController$_closure2.doCall(LogController.groovy)
    at java.lang.Thread.run(Thread.java:662)
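
The last comment in the thread suggests the line breaks were lost between the controller and the view rather than in the parser. HTML collapses runs of whitespace when rendering, so a stack trace written into a page as plain text shows up on one line. A minimal sketch of one way around that, assuming the text ends up in an HTML page: escape it and wrap it in a `<pre>` element. The class `StackTraceHtml` and its methods are hypothetical helpers, not part of Grails or Groovy.

```java
public class StackTraceHtml {

    // Escape the characters that are special in HTML text.
    static String escapeHtml(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Wrap the escaped text in <pre> so the browser keeps
    // line breaks, tabs and leading spaces as they are.
    static String toPre(String text) {
        return "<pre>" + escapeHtml(text) + "</pre>";
    }

    public static void main(String[] args) {
        // Hypothetical stack trace text with a line break and a tab.
        String throwable = "java.lang.IllegalStateException: boom\n\tat Foo.bar(Foo.java:1)\n";
        System.out.println(toPre(throwable));
    }
}
```

Keeping the string untouched and letting `<pre>` preserve the whitespace avoids having to convert tabs and newlines into markup by hand.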