Why is MySQL's hex() result different from the shell and Java?

mysql unicode


I want to get the Unicode representation of a Chinese character, for example:

'京' --> 4eac

Shell

➜  ~ printf "%x\n" \'京
4eac

Java

jshell> Integer.toHexString('京');
$14 ==> "4eac"

Why do I get a different result in MySQL?

select hex('京');
+------------+
| hex('京')  |
+------------+
| E4BAAC     |
+------------+

show variables like 'char%';
+--------------------------+------------------------------------------------------+
| Variable_name            | Value                                                |
+--------------------------+------------------------------------------------------+
| character_set_client     | utf8                                                 |
| character_set_connection | utf8                                                 |
| character_set_database   | utf8                                                 |
| character_set_filesystem | binary                                               |
| character_set_results    | utf8                                                 |
| character_set_server     | utf8                                                 |
| character_set_system     | utf8                                                 |

In MySQL, I have to use the following to get the same result as above:

select hex(convert('京' using ucs2));
+--------------------------------+
| hex(convert('京' using ucs2))  |
+--------------------------------+
| 4EAC                           |
+--------------------------------+

So why does hex() in MySQL behave differently from the other two?

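In addition, going from the Unicode code point back to the character:

Shell

➜  ~ echo '\u4eac'
京

Java

jshell> String s = "\u4eac";
s ==> "京"

MySQL

select char(0x4eac using ucs2);
+-------------------------+
| char(0x4eac using ucs2) |
+-------------------------+
| 京                      |
+-------------------------+

UTF-8 (MySQL's utf8 or utf8mb4) is a different encoding from UCS-2 (MySQL's ucs2). hex() returns the bytes of the string in its current character set, so a utf8 string yields E4BAAC, while the same character converted to ucs2 yields 4EAC.

Thanks! So the shell and jshell environments actually use the ucs2 encoding rather than utf8, right?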
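One way to see the difference between a code point and an encoding is a small Java sketch (the class and method names are hypothetical, chosen only for illustration). Integer.toHexString on a char reports the character's numeric value, whereas MySQL's hex() reports the bytes of the string in its character set. Encoding the same string explicitly in Java reproduces both MySQL results:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CodePointVsBytes {
    // Hex of the Unicode code point of the first character (a number, not an encoding).
    static String codePointHex(String s) {
        return Integer.toHexString(s.codePointAt(0));
    }

    // Hex of the bytes obtained by encoding s in the given charset.
    // This is meant as an analogy to MySQL's hex() on a string, not its implementation.
    static String bytesHex(String s, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(cs)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\u4eac"; // U+4EAC, the character from the question
        System.out.println(codePointHex(s));                        // 4eac
        System.out.println(bytesHex(s, StandardCharsets.UTF_16BE)); // 4EAC
        System.out.println(bytesHex(s, StandardCharsets.UTF_8));    // E4BAAC
    }
}
```

For characters in the Basic Multilingual Plane, such as this one, the UTF-16BE bytes happen to equal the code point, which is why ucs2 gives 4EAC. Outside that range they differ: U+1F600 has code point 1f600 but UTF-16BE bytes D83DDE00.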

'京' =
Unicode "codepoint" (in hex) '4eac' =
UCS2 encoding (2 bytes, in hex) '4EAC' =
UTF-8 encoding (3 bytes, in hex) 'E4BAAC' =
html entity '&#x4EAC;' (hex) or '&#20140;' (decimal)
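
The equivalences listed above can also be checked from the other direction, starting from the number 0x4EAC. This is a minimal Java sketch; the class and helper names are made up for illustration and are not part of any library:

```java
import java.nio.charset.StandardCharsets;

public class FromCodePoint {
    // Build the one-character string for a Unicode code point.
    static String fromCodePoint(int cp) {
        return new String(Character.toChars(cp));
    }

    // HTML numeric character references, hexadecimal and decimal.
    static String htmlEntityHex(int cp) {
        return String.format("&#x%X;", cp);
    }

    static String htmlEntityDec(int cp) {
        return "&#" + cp + ";";
    }

    // Decode a hex string such as "E4BAAC" as UTF-8 bytes.
    // Assumes an even number of valid hex digits; no error handling.
    static String decodeUtf8Hex(String hex) {
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String fromNumber = fromCodePoint(0x4EAC);
        String fromUtf8 = decodeUtf8Hex("E4BAAC");
        System.out.println(fromNumber.equals(fromUtf8)); // true
        System.out.println(htmlEntityHex(0x4EAC));       // &#x4EAC;
        System.out.println(htmlEntityDec(0x4EAC));       // &#20140;
    }
}
```

The code point and the UTF-8 bytes lead to the same character. Only the byte representation changes with the encoding.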