How can I convert chr(0xdfff) to UTF-8 bytes in Python 3 the way Python 2 does?


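The code below illustrates my problem. It works fine in Python 2.7, but every encode call I tried in Python 3.5 fails (see the exceptions below). Has anyone found a way around this error, so that it works in Python 3.5 the way it does in Python 2.7?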

import sys

if sys.version_info[0] <= 2:
    chr = unichr

out = chr(0xdfff)
print(repr(out)) # outputs '\udfff' both in Python 2 and 3
assert out.encode('utf-8').decode('utf-8') == out
assert out.encode('utf-8', errors='surrogateescape').decode('utf-8') == out
assert out.encode('utf-8', errors='strict').decode('utf-8') == out
The problem is that this character is a surrogate code point, which belongs to UTF-16:

import sys

if sys.version_info[0] <= 2:
    chr = unichr

out = chr(0xdfff)

print(out.encode('utf-16-le', 'ignore').decode('utf-16-le', 'ignore') == out)
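After searching a bit more, I noticed a reference to the 'surrogatepass' error handler, which is specific to the utf-X codecs. Using 'surrogatepass' instead of 'surrogateescape' seems to do the trick and works correctly on Python 3: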

assert out.encode('utf-8', errors='surrogatepass'
    ).decode('utf-8', errors='surrogatepass') == out
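
To make the difference visible, here is a minimal sketch, assuming Python 3 only (so no unichr shim). It compares the default 'strict' handler with 'surrogatepass' on the same lone surrogate; the variable names strict_ok, raw and roundtrip are just illustrative.

```python
out = chr(0xdfff)  # a lone low surrogate (Python 3 only in this sketch)

# The default 'strict' handler refuses to encode a lone surrogate as UTF-8.
try:
    out.encode('utf-8')
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# 'surrogatepass' lets the surrogate through as an ordinary three-byte sequence.
raw = out.encode('utf-8', errors='surrogatepass')

# The same handler is needed when decoding, otherwise the bytes are rejected.
roundtrip = raw.decode('utf-8', errors='surrogatepass')

print(strict_ok)         # False
print(raw)               # b'\xed\xbf\xbf'
print(roundtrip == out)  # True
```

Note that the resulting bytes are not valid UTF-8 in the strict sense, so other tools reading them may still reject them.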

Why do you expect it to work?
@JoshLee Because it works on Python 2 ;) Actually, I found a better way: use 'surrogatepass' when encoding and decoding:
assert out.encode('utf-8', errors='surrogatepass'
    ).decode('utf-8', errors='surrogatepass') == out