C# XML Unicode identifiers / .NET support


I have this XML document:

<test

You may be out of luck using the current .NET version.

According to the documentation for XmlReader in .NET 4.5:

XmlReader provides forward-only, read-only access to XML data in a document or stream. This class conforms to the W3C Extensible Markup Language (XML) 1.0 (fourth edition) and the Namespaces in XML 1.0 (third edition) recommendations.

And it appears that in this edition, characters outside the Basic Multilingual Plane, like yours, are not valid in element names. Your character is 0xD863 0xDCD2 in UTF-16, and according to the Fourth Edition rules for valid element name characters, no valid name character has a code point larger than #xD7A3. That is below #xD800, where the surrogate range begins, and far below your character, #x28CD2.
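The arithmetic above can be checked with a few lines of C#. This is only a sketch: the class and method names (SurrogateDemo, ToSurrogates) are made up for illustration, and the constant 0xD7A3 is simply the upper bound quoted above, not something read from the framework.

```csharp
using System;

public static class SurrogateDemo
{
    // Splits a supplementary code point (above U+FFFF) into its UTF-16 surrogate pair.
    public static (char High, char Low) ToSurrogates(int codePoint)
    {
        string s = char.ConvertFromUtf32(codePoint);
        if (s.Length != 2)
            throw new ArgumentException("Code point is inside the BMP.", nameof(codePoint));
        return (s[0], s[1]);
    }

    public static void Main()
    {
        const int codePoint = 0x28CD2;             // the character from the question
        const int highestFourthEdName = 0xD7A3;    // upper bound quoted in the answer

        var (high, low) = ToSurrogates(codePoint);

        // Prints: U+28CD2 -> 0xD863 0xDCD2
        Console.WriteLine($"U+{codePoint:X} -> 0x{(int)high:X4} 0x{(int)low:X4}");

        // The pair round-trips to the original code point.
        Console.WriteLine(char.ConvertToUtf32(high, low) == codePoint);  // True

        // The code point lies above every Fourth Edition name character.
        Console.WriteLine(codePoint > highestFourthEdName);              // True
    }
}
```

Both surrogate halves fall in the #xD800 to #xDFFF range, which is why neither half can match any Fourth Edition name character on its own.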

To confirm, from the Wikipedia article on XML:

XML 1.0 (Fifth Edition) and XML 1.1 support the direct use of almost any Unicode character in element names, attributes, comments, character data, and processing instructions (other than the ones that have special symbolic meaning in XML itself, such as the less-than sign, "<"). The following is a well-formed XML document including Chinese, Armenian and Cyrillic characters:

<?xml version="1.0" encoding="UTF-8"?>
<俄语 լեզու="ռուսերեն">данные</俄语>

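unsafe {
    // Fast path: the current UTF-16 code unit can start an NCName by itself.
#if SILVERLIGHT
    if ( xmlCharType.IsStartNCNameSingleChar( chars[pos] ) ) {
#else // Optimization due to the lack of inlining when a method uses byte*
    if ( ( xmlCharType.charProperties[chars[pos]] & XmlCharType.fNCStartNameSC ) != 0 ) {
#endif
        pos++;
    }
    // Surrogate pairs (characters outside the BMP) are accepted in names
    // only when the code is compiled with XML10_FIFTH_EDITION defined.
#if XML10_FIFTH_EDITION
    else if ( pos + 1 < ps.charsUsed && xmlCharType.IsNCNameSurrogateChar(chars[pos + 1], chars[pos])) {
        pos += 2;
    }
#endif
    // Anything else is handed to the slower general-purpose path.
    else {
        goto ParseQNameSlow;
    }
}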
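To see how the framework you actually have installed behaves, a small probe like the one below may help. It is a sketch under assumptions: NameProbe and TryParse are hypothetical names, the first test document is a shortened variant of the Wikipedia example, and the second uses the character from the question as an element name. No expected output is shown for the second case because it depends on how your System.Xml build was compiled.

```csharp
using System;
using System.IO;
using System.Xml;

public static class NameProbe
{
    // Reads the whole document with XmlReader.
    // Returns true on success; otherwise false, with the parser's message in 'error'.
    public static bool TryParse(string xml, out string error)
    {
        try
        {
            using (var reader = XmlReader.Create(new StringReader(xml)))
            {
                while (reader.Read()) { }
            }
            error = "";
            return true;
        }
        catch (XmlException ex)
        {
            error = ex.Message;
            return false;
        }
    }

    public static void Main()
    {
        // Element name made of BMP characters that the Fourth Edition already allows.
        string bmpDoc = "<俄语>данные</俄语>";

        // Element name consisting of U+28CD2, which lies outside the BMP.
        string astral = char.ConvertFromUtf32(0x28CD2);
        string astralDoc = "<" + astral + "/>";

        foreach (var doc in new[] { bmpDoc, astralDoc })
        {
            bool ok = TryParse(doc, out string error);
            Console.WriteLine(ok ? "parsed: " + doc : "rejected: " + error);
        }
    }
}
```

If the second document is rejected, the runtime is applying the Fourth Edition name rules described above, and the usual workaround is to keep such characters in attribute values or element content rather than in names.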