Strange characters() behavior with SAX + Java
In my XML I have a multi-line element:
<tag id="sometag" ...>
| first line
| second line
| third line
| fourth line
<tag ...>
....
<tag id="someothertag" ...>
| ANOTHER FIRST LINE
| ANOTHER SECOND LINE
| ANOTHER THIRD LINE
| ANOTHER FORTH LINE
<tag ...>
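Other than that, I do nothing with the characters. I basically create two parser instances. With the first instance I search for sometag. Once I find what I'm looking for, I throw an exception to stop parsing and return that element: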
D/MyProgram( 1565): STARTING document parsing...
D/MyProgram( 1565): characters( "n ", 0, 1 )
D/MyProgram( 1565): characters( " | first line", 0, 20 )
D/MyProgram( 1565): characters( "n | first line", 0, 1 )
D/MyProgram( 1565): characters( " | second line", 0, 23 )
D/MyProgram( 1565): characters( "n | second line", 0, 1 )
D/MyProgram( 1565): characters( " | third line", 0, 26 )
D/MyProgram( 1565): characters( "n | third line", 0, 1 )
D/MyProgram( 1565): characters( " | fourth lineline", 0, 22 )
D/MyProgram( 1565): characters( "n | fourth lineline", 0, 1 )
D/MyProgram( 1565): characters( " | fourth lineline", 0, 4 )
D/MyProgram( 1565): Successfully found "sometag"!
...and with another, brand-new instance I search for someothertag. I do exactly the same thing as before:
D/MyProgram( 1565): STARTING document parsing...
D/MyProgram( 1565): characters( "n", 0, 1 )
D/MyProgram( 1565): characters( " ", 0, 4 )
D/MyProgram( 1565): characters( "n ", 0, 1 )
D/MyProgram( 1565): characters( " | first line", 0, 20 )
D/MyProgram( 1565): characters( "n | first line", 0, 1 )
D/MyProgram( 1565): characters( " | second line", 0, 23 )
D/MyProgram( 1565): characters( "n | second line", 0, 1 )
D/MyProgram( 1565): characters( " | third line", 0, 26 )
D/MyProgram( 1565): characters( "n | third line", 0, 1 )
D/MyProgram( 1565): characters( " | fourth lineline", 0, 22 )
D/MyProgram( 1565): characters( "n | fourth lineline", 0, 1 )
D/MyProgram( 1565): characters( " | fourth lineline", 0, 4 )
D/MyProgram( 1565): Successfully found "someothertag"!
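I know XML parsing is stream-based (it parses chunks rather than the whole string), but this is very strange behavior. Here are some of the confusing things I've noticed:
- On each call to characters(), the parser does not start where it left off, and the characters do not end where the parsed chunk ends: I even get the characters that come before the first character array ("n", which is the replacement for the newline).
- ch has extra characters that were not there originally: "line" is appended to "fourth line".
- When I create a brand-new parser instance, the characters are "re-read". The second run should look like this: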
D/MyProgram( 1565): characters( "n", 0, 1 )
D/MyProgram( 1565): characters( " ", 0, 4 )
D/MyProgram( 1565): characters( "n ", 0, 1 )
D/MyProgram( 1565): characters( " | ANOTHER FIRST LINE", 0, 20 )
D/MyProgram( 1565): characters( "n | ANOTHER SECOND LINE", 0, 1 )
... and so on.
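Any idea what I'm doing wrong? Thanks in advance.

As Margulies said, you are not using start and length on the character array you are passed. SAX hands characters() its internal buffer, and only the range from start to start + length belongs to the current call; anything outside that range may be stale data left over from earlier reads.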
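@Override
public void characters(char[] ch, int start, int length) {
    // Use only the indicated segment; the rest of ch is not part of this call.
    String str = new String(ch, start, length);
    // "\\\\n" yields a literal backslash followed by n; "\\n" would log a bare "n".
    Log.d(TAG, "characters( \"" + str.replaceAll("[\r\n]", "\\\\n") + "\", " + start + ", " + length + " )");
}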
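To make the difference concrete, here is a minimal sketch that needs no parser at all. The class name SegmentDemo and its helper methods are made up for illustration, and the buffer contents only imitate what a SAX parser might leave in its array; the point is only that converting the whole array and converting the indicated segment give different strings.

```java
public class SegmentDemo {

    // Mistake: converts the entire buffer and ignores start and length.
    static String wholeBuffer(char[] ch, int start, int length) {
        return new String(ch);
    }

    // Fix: converts only the range [start, start + length).
    static String segmentOnly(char[] ch, int start, int length) {
        return new String(ch, start, length);
    }

    // Makes newlines visible in log output as a backslash followed by n.
    static String escapeNewlines(String s) {
        return s.replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Pretend the parser's buffer still holds text beyond the current segment.
        char[] buffer = "\n    | first line".toCharArray();

        // The call reports only the newline: start = 0, length = 1.
        System.out.println("[" + escapeNewlines(wholeBuffer(buffer, 0, 1)) + "]");
        System.out.println("[" + escapeNewlines(segmentOnly(buffer, 0, 1)) + "]");
    }
}
```

The first line printed contains the whole buffer, which is the kind of output shown in the question's log; the second contains only the escaped newline.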
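It looks like you are not taking start and length into account. Another problem I ran into was that my parser's StringBuilder was static, so text from an earlier parse leaked into the next one. I had to reset it with builder.setLength(0).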
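Putting both points together, the following is a rough sketch of a handler that collects the text of one element, under a few assumptions: the class name TagTextCollector and the method find are hypothetical, the elements are matched by an id attribute as in the question's XML, and the wanted element contains no nested child elements. The builder is an instance field rather than a static one, it is reset with setLength(0) when the wanted element starts, and characters() appends only the indicated segment, since a single text node may arrive in several calls.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class TagTextCollector extends DefaultHandler {
    private final String wantedId;
    // Instance field, not static: every handler gets its own buffer.
    private final StringBuilder builder = new StringBuilder();
    private boolean inside = false;
    private String result = null;

    public TagTextCollector(String wantedId) {
        this.wantedId = wantedId;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if (result == null && wantedId.equals(attrs.getValue("id"))) {
            builder.setLength(0); // discard anything collected earlier
            inside = true;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inside) {
            builder.append(ch, start, length); // only the valid segment
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (inside) { // assumes the wanted element has no child elements
            result = builder.toString();
            inside = false;
        }
    }

    // Hypothetical helper: returns the text of the element with the given id, or null.
    public static String find(String xml, String id) {
        try {
            TagTextCollector handler = new TagTextCollector(id);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.result;
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><tag id=\"sometag\">a\nb</tag><tag id=\"someothertag\">C\nD</tag></root>";
        System.out.println(find(xml, "someothertag"));
    }
}
```

This sketch parses the whole document instead of throwing an exception to stop early, which keeps it short; stopping early as described in the question would work the same way as far as the buffer handling is concerned.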