What is the 'charmap' codec that Python shows in a UnicodeDecodeError?

Tags: python, python-3.x, character-encoding, decode, encode

I am running some simple code to see which encodings can decode particular files:

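import os
import tempfile

# csvs: paths of the CSV files to test; its definition is not shown in the post

encodings = ('cp737', 'cp869', 'cp875', 'cp1253', 'iso2022_jp_2', 'iso8859_7',
             'mac_greek', 'utf-8')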

def test_encoding():
    with tempfile.TemporaryDirectory() as tmp_dir:
        for c in csvs:
            for encoding in encodings:
                try:
                    with open(c, 'r', encoding=encoding) as f:
                        content = f.read()
                except UnicodeDecodeError as e:
                    print(encoding, e) # <---- print from here
                    continue
                csv_out = os.path.join(tmp_dir, os.path.basename(
                    c[:-4]) + '_%s.csv' % encoding)
                with open(csv_out, 'w', encoding=encoding,
                          newline='\n') as f:
                    f.write(content)
        input('Files created in %s' % tmp_dir)
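
A minimal sketch of the same probing idea, working on bytes in memory rather than on files. `probe_encodings` is a hypothetical helper, not part of the original script. It records the encoding name that was requested next to the codec name the exception itself reports, which makes any mismatch between the two visible.

```python
def probe_encodings(data, encodings):
    """Try to decode `data` with each encoding and collect the failures.

    Returns a dict mapping the *requested* encoding name to a tuple of
    (codec name reported by the exception, offending byte, position, reason).
    Encodings that decode the data successfully are left out.
    """
    failures = {}
    for name in encodings:
        try:
            data.decode(name)
        except UnicodeDecodeError as e:
            # e.encoding is the name that ends up in the error message;
            # e.object is the input bytes and e.start the failing offset.
            failures[name] = (e.encoding, e.object[e.start], e.start, e.reason)
    return failures


# Sample input (an assumption for illustration): ASCII text plus byte 0xae.
failures = probe_encodings(b'abc\xae', ('iso8859_7', 'cp1253', 'utf-8'))
for requested, (reported, byte, pos, reason) in failures.items():
    print('%s -> %r codec, byte 0x%02x at %d: %s'
          % (requested, reported, byte, pos, reason))
```

Printing the requested name alongside the exception, as the original script already does with `print(encoding, e)`, is the simplest way to keep track of which encoding actually failed.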

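Comments:

Could it be because some encodings are 1-to-1 mappings (ordinary single-byte encodings) while others are multi-byte? (Or possibly not. A counter-example: iso8859_7.)

Yes, but the counter-example stands. So what is this charmap codec?

Try searching for it and you will find millions of hits for "'charmap' codec can't decode byte 0xXX in position YY".

So iso8859_7 is single-byte as well, @usr2564301. Does charmap then stand for the single-byte codecs? In any case, wouldn't it be more helpful if Python printed the name of the codec in these errors?

Check what the Python source does for that specific line (in exceptions.c).

I could not find charmap in there; they use some kind of reflection. Anyway, I wonder why the message does not print the name of the encoding, that is, "'cp869' codec can't decode byte…" rather than "'charmap' codec can't decode byte…". Is there any documentation on the 'charmap' codec?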

The script prints:

cp869 'charmap' codec can't decode byte 0x83 in position 28: character maps to <undefined>
cp1253 'charmap' codec can't decode byte 0x8c in position 26: character maps to <undefined>
iso2022_jp_2 'iso2022_jp_2' codec can't decode byte 0xce in position 18: illegal multibyte sequence
iso8859_7 'charmap' codec can't decode byte 0xae in position 84: character maps to <undefined>

Python 3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 07:18:10) [MSC v.1900 32 bit (Intel)]
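
As far as the CPython standard library goes, single-byte code pages such as cp869 are implemented as 256-entry lookup tables that are handed to one shared decoder, `codecs.charmap_decode`. The error message names that shared decoder rather than the code page that supplied the table. The sketch below reproduces the message with a made-up three-entry table (`toy_table` and `decode_with_table` are hypothetical names used only for illustration) and then looks at the real cp869 table for the byte from the output above.

```python
import codecs
from encodings import cp869

# Toy decoding table: byte 0 -> 'A', byte 1 -> 'B', byte 2 -> undefined.
# '\ufffe' is the marker the stdlib tables use for "no character here".
toy_table = 'AB\ufffe'


def decode_with_table(data, table):
    """Decode bytes through a character map, as the single-byte codecs do."""
    text, _consumed = codecs.charmap_decode(data, 'strict', table)
    return text


decoded = decode_with_table(b'\x00\x01', toy_table)

try:
    decode_with_table(b'\x02', toy_table)
except UnicodeDecodeError as e:
    # The message says 'charmap' even though no named encoding was involved.
    toy_error = str(e)

# The real cp869 table marks byte 0x83 as undefined in the same way.
cp869_entry = cp869.decoding_table[0x83]

print(decoded)
print(toy_error)
print(repr(cp869_entry))
```

Under this reading, 'charmap' is not a separate encoding. It is the table-driven decoding machinery shared by cp737, cp869, cp1253, iso8859_7 and similar codecs, while iso2022_jp_2 has its own multibyte decoder and therefore reports its own name.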