JavaScript: Why does Buffer.from('\x80', 'utf8') return <Buffer c2 80>?
Why does this happen?
> Buffer.from('\x79', 'utf8')
<Buffer 79>
> Buffer.from('\x80', 'utf8')
<Buffer c2 80>
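How can I make Buffer behave the way I expect and return <Buffer 80>?

This happens because '\x80' in a JavaScript string is the code point U+0080 (0x80, 1000 0000 in binary, 128 in decimal), which lies outside ASCII. ASCII is 7-bit, so every ASCII code point has its high bit set to 0, and those are the only code points UTF-8 encodes as a single byte. Code points from U+0080 to U+07FF take two bytes in UTF-8, so U+0080 becomes c2 80; a lone 0x80 byte is not valid UTF-8 on its own.

To convert a string to a Buffer without encoding it as UTF-8, you can use the 'ascii' encoding: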
> Buffer.from('\x79', 'ascii')
<Buffer 79>
> Buffer.from('\x80', 'ascii')
<Buffer 80>
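To see where c2 80 comes from, here is a minimal sketch of the one-byte and two-byte UTF-8 layouts worked out by hand. The helper name encodeUtf8Small is hypothetical (it is not part of Node.js) and it deliberately handles only code points below U+0800.

```javascript
// Hypothetical helper for illustration: encode one code point below U+0800
// as UTF-8 bytes. Real code should just use Buffer.from(str, 'utf8').
function encodeUtf8Small(codePoint) {
  if (codePoint < 0x80) {
    // 0xxxxxxx: ASCII range, one byte
    return [codePoint];
  }
  if (codePoint < 0x800) {
    // 110xxxxx 10xxxxxx: top 5 bits go in the lead byte,
    // low 6 bits go in the continuation byte
    return [0xc0 | (codePoint >> 6), 0x80 | (codePoint & 0x3f)];
  }
  throw new RangeError('this sketch only handles code points below U+0800');
}

const manual = Buffer.from(encodeUtf8Small(0x80));
const builtin = Buffer.from('\x80', 'utf8');

console.log(manual);                 // <Buffer c2 80>
console.log(manual.equals(builtin)); // true
console.log(encodeUtf8Small(0x79));  // [ 121 ], i.e. the single byte 0x79
```

For 0x80 the lead byte is 0xc0 | 2 = 0xc2 and the continuation byte is 0x80 | 0 = 0x80, which matches what Buffer.from prints.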
Even though @Boris's answer does explain the behaviour, I just wanted to point out the following. From the documentation:
- The 'ascii' encoding is considered a legacy character encoding.
As another solution, using the 'latin1' encoding should give the desired result:
- 'latin1': Latin-1 stands for ISO-8859-1. This character encoding only supports the Unicode characters from U+0000 to U+00FF. Each character is encoded using a single byte. Characters that do not fit into that range are truncated and will be mapped to characters in that range.
const a = Buffer.from('\x80', 'utf8');
const b = Buffer.from('\x80', 'latin1')
console.log(a, b);
// <Buffer c2 80> <Buffer 80>
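As a rough check of the "single byte per character" claim, the sketch below pushes every code point from U+0000 to U+00FF through both encodings. The variable names are just for illustration. It also shows that if you already have the byte values, you can skip string encodings entirely and pass an array to Buffer.from.

```javascript
// Build a string containing the code points U+0000 through U+00FF.
const allBytes = Array.from({ length: 256 }, (_, i) => i);
const text = String.fromCharCode(...allBytes);

const asLatin1 = Buffer.from(text, 'latin1');
const asUtf8 = Buffer.from(text, 'utf8');

console.log(asLatin1.length); // 256: one byte per character
console.log(asUtf8.length);   // 384: 128 one-byte + 128 two-byte characters

// latin1 round-trips this range without loss.
console.log(asLatin1.toString('latin1') === text); // true

// When the raw byte values are already known, no string encoding is needed.
const raw = Buffer.from([0x80]);
console.log(raw); // <Buffer 80>
```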
In my book, every non-UTF-8 encoding is a legacy encoding.