C#: Converting a non-escaped Unicode string to Unicode


c# unicode


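I have a text string that comes from a MySQL database:

var str = "u0393u03a5u039du0391u0399u039au0391";

I want to replace the Unicode sequences so that the string displays the characters they stand for, "ΓΥΝΑΙΚΑ". If I manually escape each u as \u in a string literal in .NET source code, the conversion happens automatically.

I found the following function: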

byte[] unicodeBytes = Encoding.Unicode.GetBytes(str);

// Perform the conversion from one encoding to the other.
byte[] ascibytes = Encoding.Convert(Encoding.Unicode, Encoding.ASCII, unicodeBytes);

// Convert the new byte[] into a char[] and then into a string.
char[] asciiChars = new char[Encoding.ASCII.GetCharCount(ascibytes, 0, ascibytes.Length)];

Encoding.ASCII.GetChars(ascibytes, 0, ascibytes.Length, asciiChars, 0);
return new string(asciiChars);

But since the sequences have to be escaped, I tried escaping them myself:

str = str.Replace("u", @"\u");

But no luck. How can I convert the string?

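These are basically UTF-16 code units, so each u followed by four hex digits can be converted directly to the corresponding char. This approach is not very efficient, but I assume optimization is not the main goal.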
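A minimal sketch of that idea, assuming a regular expression is used to find each group. The class name `UnescapedUnicode` and the method name `Decode` are hypothetical, chosen only for illustration:

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class UnescapedUnicode
{
    // Replaces every 'u' followed by exactly four hex digits with the
    // UTF-16 code unit those digits encode. All other text is left as-is.
    public static string Decode(string input)
    {
        return Regex.Replace(
            input,
            "u[0-9a-fA-F]{4}",
            m => ((char)int.Parse(
                    m.Value.Substring(1),            // skip the leading 'u'
                    NumberStyles.AllowHexSpecifier,
                    CultureInfo.InvariantCulture)).ToString());
    }
}
```

With the string from the question, `UnescapedUnicode.Decode("u0393u03a5u039du0391u0399u039au0391")` should yield "ΓΥΝΑΙΚΑ".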

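This approach cannot resolve the ambiguity with unescaped "regular" characters in the string:

dufface

would have its uffac part converted to

\uffac

, which is probably not what you want. It does handle surrogate pairs correctly, though:

ud83dudc96

decodes to the single code point U+1F496.

Another way: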

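using System;
using System.Linq;

var str = "u0393u03a5u039du0391u0399u039au0391";

// Drop the leading 'u' so that Split does not yield an empty first element.
if (str.Length > 0 && str[0] == 'u')
    str = str.Substring(1);

// Each remaining piece is a hex code unit; parse it in base 16 and convert it to a char.
string chars = string.Concat(str.Split('u').Select(s =>
    Convert.ToChar(Convert.ToInt32(s, 16))));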
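Both answers note that efficiency is not a concern here. If it were, or if malformed input should be rejected rather than silently mangled, a single pass with a `StringBuilder` is one option. The following is only a sketch: it assumes the input consists solely of groups of u plus four hex digits, and the names `StrictUnescapedUnicode` and `Decode` are hypothetical.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class StrictUnescapedUnicode
{
    // Decodes input made up solely of "uXXXX" groups (X = hex digit).
    // Throws FormatException for anything else.
    public static string Decode(string input)
    {
        if (input.Length % 5 != 0)
            throw new FormatException("Input length must be a multiple of 5.");

        var result = new StringBuilder(input.Length / 5);
        for (int i = 0; i < input.Length; i += 5)
        {
            if (input[i] != 'u')
                throw new FormatException($"Expected 'u' at index {i}.");

            string hex = input.Substring(i + 1, 4);
            if (!ushort.TryParse(hex, NumberStyles.AllowHexSpecifier,
                                 CultureInfo.InvariantCulture, out ushort codeUnit))
                throw new FormatException($"Invalid hex digits at index {i + 1}.");

            // Append the raw UTF-16 code unit; surrogate halves pair up naturally.
            result.Append((char)codeUnit);
        }
        return result.ToString();
    }
}
```

Because every group must be exactly five characters long, an input such as dufface is rejected instead of being partially converted.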

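Is it possible for these unescaped sequences to be mixed with regular characters? That would make this considerably harder.

No.

Why is ASCII involved in the code at all? Uppercase Greek letters are not part of ASCII.

The second attempt does not work because backslash escape syntax is only interpreted in C# string literals in source code, not in a string that is already stored in memory. There is a function that can do this.

Possible duplicate.

Thanks, it worked! And yes, no optimization is needed here.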
