XML (de)serialization of invalid strings is inconsistent in C#?
In C# (.NET 4.0 and 4.5 / VS2010 and VS12), when I use XmlSerializer to serialize an object that contains a string with an illegal character, no error is raised. However, when I deserialize the result, an "invalid character" error is thrown.
// add to XML
Items items = new Items();
items.Item = "\v hello world"; // contains "illegal" character \v
// variables
System.Xml.Serialization.XmlSerializer serializer = new System.Xml.Serialization.XmlSerializer(typeof(Items));
string tmpFile = Path.GetTempFileName();
// serialize
using (FileStream tmpFileStream = new FileStream(tmpFile, FileMode.Open, FileAccess.ReadWrite))
{
    serializer.Serialize(tmpFileStream, items);
}
Console.WriteLine("Success! XML serialized in file " + tmpFile);
// deserialize
Items result = null;
using (FileStream plainTextFile = new FileStream(tmpFile, FileMode.Open, FileAccess.Read))
{
    result = (Items)serializer.Deserialize(plainTextFile); // FAILS here
}
Console.WriteLine(result.Item);
"Items" is just a small class auto-generated with xsd /c Items.xsd. Items.xsd is nothing more than a root element (Items) containing one child element (Item).
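The error thrown during deserialization is:
Unhandled Exception: System.InvalidOperationException: There is an error in XML document (3, 12). ---> System.Xml.XmlException: '♂', hexadecimal value 0x0B, is an invalid character. Line 3, position 12.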
Line 3 of the serialized XML file contains the following:
<Item>&#xB; hello world</Item>
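I know that &#xB; is an illegal character, but why does XmlSerializer allow it to be serialized (without an error)? I find it inconsistent that .NET lets me serialize something without any problem, only for me to discover that I cannot deserialize it.
Is there a way to make XmlSerializer remove illegal characters automatically before serializing, or can I instruct the deserializer to ignore illegal characters?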
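Currently I work around it by reading the file contents as a string, replacing the illegal characters "manually", and then deserializing it... but I find that an ugly hack/workaround.
1.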
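You can set the CheckCharacters property of XmlWriterSettings to true and serialize through a writer created with those settings, so that illegal characters are never written. (The Serialize method will throw an exception instead.)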
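A minimal sketch of option 1, under some assumptions: the `Items` class below is a hand-written stand-in for the xsd-generated class from the question, and `StrictXml.TrySerialize` is a hypothetical helper name, not something from the original answer. It relies on `XmlWriter.Create` producing a writer that rejects invalid characters when `CheckCharacters` is true, and on XmlSerializer surfacing that failure as an InvalidOperationException.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hand-written stand-in for the xsd.exe-generated class from the question.
public class Items
{
    public string Item { get; set; }
}

public static class StrictXml
{
    // Hypothetical helper: returns false instead of producing XML
    // that could not be read back later.
    public static bool TrySerialize(Items items, Stream output)
    {
        var serializer = new XmlSerializer(typeof(Items));

        // CheckCharacters is already true by default for XmlWriter.Create;
        // it is set explicitly here only to make the intent visible.
        var settings = new XmlWriterSettings { CheckCharacters = true, Indent = true };

        try
        {
            using (XmlWriter writer = XmlWriter.Create(output, settings))
            {
                serializer.Serialize(writer, items);
            }
            return true;
        }
        catch (InvalidOperationException)
        {
            // XmlSerializer wraps the writer's complaint about the invalid character.
            return false;
        }
    }
}
```

With this approach the failure shows up at write time, which is usually easier to diagnose than a file that fails to load later.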
2.
You can create your own XmlTextWriter that filters out unwanted characters while serializing.
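// Requires: using System.IO; using System.Linq; using System.Text; using System.Xml;
using (FileStream tmpFileStream = new FileStream(tmpFile, FileMode.OpenOrCreate, FileAccess.ReadWrite))
{
    var writer = new MyXmlWriter(tmpFileStream);
    serializer.Serialize(writer, items);
}

public class MyXmlWriter : XmlTextWriter
{
    public MyXmlWriter(Stream s) : base(s, Encoding.UTF8)
    {
    }

    // Drop control characters before the text reaches the underlying writer.
    // Note: char.IsControl also matches tab, CR and LF, which are legal in XML.
    public override void WriteString(string text)
    {
        string newText = String.Join("", text.Where(c => !char.IsControl(c)));
        base.WriteString(newText);
    }
}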
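The char.IsControl filter used in these overrides also removes tab, line feed and carriage return, which XML 1.0 permits. As a rough alternative, here is a sketch of a narrower filter that keeps only the characters in the XML 1.0 `Char` production. The names `XmlText` and `StripInvalidXmlChars` are hypothetical and not part of the original answer; the helper could be called from WriteString or ReadString in place of the LINQ expression.

```csharp
using System.Text;

public static class XmlText
{
    // Hypothetical helper: keeps only characters allowed by XML 1.0
    // (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, and valid surrogate pairs).
    public static string StripInvalidXmlChars(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return text;
        }

        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];

            // A well-formed surrogate pair encodes a code point above U+FFFF; keep both halves.
            if (i + 1 < text.Length && char.IsSurrogatePair(c, text[i + 1]))
            {
                sb.Append(c).Append(text[i + 1]);
                i++;
            }
            else if (c == '\t' || c == '\n' || c == '\r' ||
                     (c >= 0x20 && c <= 0xD7FF) ||
                     (c >= 0xE000 && c <= 0xFFFD))
            {
                sb.Append(c);
            }
            // Everything else (e.g. \v, \0, lone surrogates) is dropped.
        }
        return sb.ToString();
    }
}
```

Whether silently dropping characters is acceptable depends on the data; if the control characters carry meaning, rejecting the input may be the safer choice.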
3.
By creating your own XmlTextReader, you can filter out unwanted characters while deserializing.
// Requires: using System.IO; using System.Linq; using System.Xml;
using (FileStream plainTextFile = new FileStream(tmpFile, FileMode.Open, FileAccess.Read))
{
    var reader = new MyXmlReader(plainTextFile);
    result = (Items)serializer.Deserialize(reader);
}

public class MyXmlReader : XmlTextReader
{
    public MyXmlReader(Stream s) : base(s)
    {
    }

    // Drop control characters from the text handed to the deserializer.
    // Note: char.IsControl also matches tab, CR and LF, which are legal in XML.
    public override string ReadString()
    {
        string text = base.ReadString();
        string newText = String.Join("", text.Where(c => !char.IsControl(c)));
        return newText;
    }
}
4.
You can set the CheckCharacters property of XmlReaderSettings to false. Deserialization will then go through without an error. (You will get the \v back.)
using (FileStream plainTextFile = new FileStream(tmpFile, FileMode.Open, FileAccess.Read))
{
    // CheckCharacters = false lets character references such as &#xB; through.
    var reader = XmlReader.Create(plainTextFile, new XmlReaderSettings() { CheckCharacters = false });
    result = (Items)serializer.Deserialize(reader);
}
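To illustrate option 4 end to end, here is a small in-memory round trip. It is a sketch under assumptions: `Items` is a hand-written stand-in for the xsd-generated class, `LenientXml.RoundTrip` is a hypothetical helper, and the writer is created explicitly with `CheckCharacters = false` so that both sides use matching settings, rather than relying on whatever writer `Serialize(Stream, ...)` picks internally.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hand-written stand-in for the xsd.exe-generated class from the question.
public class Items
{
    public string Item { get; set; }
}

public static class LenientXml
{
    // Hypothetical helper: serializes and deserializes in memory with
    // character checking disabled on both the writer and the reader.
    public static Items RoundTrip(Items items)
    {
        var serializer = new XmlSerializer(typeof(Items));

        using (var buffer = new MemoryStream())
        {
            // With checking off, the writer emits invalid characters as
            // numeric character references instead of throwing.
            var writerSettings = new XmlWriterSettings { CheckCharacters = false };
            using (XmlWriter writer = XmlWriter.Create(buffer, writerSettings))
            {
                serializer.Serialize(writer, items);
            }

            buffer.Position = 0; // rewind before reading

            // With checking off, the reader accepts those references.
            var readerSettings = new XmlReaderSettings { CheckCharacters = false };
            using (XmlReader reader = XmlReader.Create(buffer, readerSettings))
            {
                return (Items)serializer.Deserialize(reader);
            }
        }
    }
}
```

Note that the resulting file is still not well-formed XML 1.0, so other parsers may reject it; this only makes the .NET read and write sides agree with each other.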