Python: I can't read a file because I get "UnicodeDecodeError: 'utf-8' codec can't decode" error

python utf-8

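I have a file that I want to convert to UTF-8 encoding.

When I try to read it, I get the following error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 947: invalid continuation byte

My goal is to read the file and then convert it to UTF-8, but Python will not let me read it.

This is my code: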

#convert all files into utf_8 format
import os
import io
path_directory="some path string"
directory = os.fsencode(path_directory)
for file in os.listdir(directory):
    file_name=os.fsdecode(file)
    file_path_source=path_directory+file_name
    file_path_dest="some address to destination file"
    with open(file_path_source,"r") as f1:
        text=f1.read()
    with io.open(file_path_dest,"w+",encoding='utf8') as f2:
        f2.write(text)
    file_path=""
    file_name=""
    text=None

The error is:

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-47-59e5e52ddd40> in <module>()
     10     with open(file_path,"r") as f1:
     11         print(type(f1))
---> 12         text=f1.read()
     13     with io.open(file_path.replace("ref_sum","ref_sum_utf_8"),"w+",encoding='utf8') as f2:
     14         f2.write(text)

/home/afsharizadeh/anaconda3/lib/python3.6/codecs.py in decode(self, input, final)
    319         # decode input (taking the buffer into account)
    320         data = self.buffer + input
--> 321         (result, consumed) = self._buffer_decode(data, self.errors, final)
    322         # keep undecoded input until the next call
    323         self.buffer = data[consumed:]

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 947: invalid continuation byte

How can I convert a file to UTF-8 without reading it?

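The cause is straightforward. If you want to open a file that is not UTF-8 in Python 3 (UTF-8 is the default encoding in Python 3, whereas ASCII is the default in Python 2), you must state the file's actual encoding when you open it: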

io.open(file_path_dest,"r",encoding='ISO-8859-1')

In this case the encoding is ISO-8859-1, so you have to specify it.
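To make the fix concrete, here is a minimal sketch of the whole conversion, assuming every file in the directory really is ISO-8859-1 (the answer's guess, not something verified here). The function names `convert_to_utf8` and `convert_directory` and their parameters are hypothetical, chosen only for illustration.

```python
import os


def convert_to_utf8(src_path, dest_path, src_encoding="ISO-8859-1"):
    """Re-encode one text file to UTF-8.

    src_encoding is an assumption about the input; a wrong guess can
    silently produce wrong characters instead of raising an error.
    """
    # newline="" disables newline translation, so line endings are preserved.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with open(dest_path, "w", encoding="utf-8", newline="") as dest:
        dest.write(text)


def convert_directory(src_dir, dest_dir, src_encoding="ISO-8859-1"):
    """Convert every regular file in src_dir, writing the results to dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    for name in os.listdir(src_dir):
        src_path = os.path.join(src_dir, name)
        if os.path.isfile(src_path):
            # os.path.join inserts the path separator that plain string
            # concatenation of directory and file name would omit.
            convert_to_utf8(src_path, os.path.join(dest_dir, name), src_encoding)
```

Writing to a separate destination directory keeps the originals intact, which is useful if the assumed source encoding turns out to be wrong.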

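This comes up often and is hard to search for because there are so many hits. You told Python the file is already UTF-8, but it isn't, so decoding fails.

Does the file contain a UTF header, # -*- coding: utf-8 -*-, at the beginning of the file?

@0decimal0 No, it doesn't.

Please put it at the top and then try to read the file.

Did you try opening the file the way I suggested? You should specify the encoding.

UTF-8 is the default encoding in Python 3 only when decoding byte strings. The open function, however, uses your locale settings, so on a Windows box it will be your 8-bit code page. On a Mac and on modern Linux, it is probably UTF-8.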
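The two points made in these comments can be checked with a short sketch. It assumes the standard `locale` module reports the encoding that `open` falls back to when no `encoding` argument is given; the sample string and variable names are made up for illustration.

```python
import locale

# The encoding open() falls back to when no encoding= argument is passed.
# This depends on the machine's locale settings.
default_encoding = locale.getpreferredencoding(False)
print("default text encoding here:", default_encoding)

# 0xe9 is "é" in ISO-8859-1 (Latin-1). In UTF-8 it would start a
# multi-byte sequence, so the space that follows is not a valid continuation.
raw = "café au lait".encode("latin-1")

failure_reason = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    failure_reason = exc.reason
print("utf-8 decode failed:", failure_reason)

# Decoding with the encoding the bytes were actually written in succeeds.
recovered = raw.decode("latin-1")
print(recovered)
```

Passing `encoding=` explicitly to `open` avoids depending on whatever the first line prints on a given machine.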