Can a Python string be changed by a string lookup?


python string unicode encoding utf-8


This example works fine. I get the correct unicode string doc:

doc = open("1.html").read().strip()
doc = doc.decode("utf-8","ignore")

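But when I test doc with a string lookup first, I get the error "UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 289: ordinal not in range(128)". Can anyone explain this? Can the string doc be changed by a string lookup?

I forgot to mention that 1.html contains Chinese words.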

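The problem is that you are comparing the byte string read from the file with the unicode literals u"charset=utf" and u"charset=\"utf". To compare them, Python has to convert the byte string to unicode before you call decode yourself, and it does that implicitly with the default ASCII codec. The file contains non-ASCII bytes such as 0xe7, so that implicit conversion fails.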
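
The implicit conversion described above can be reproduced by hand. This is only a sketch: the sample character U+7B80 is an assumption chosen because its UTF-8 encoding starts with the byte 0xe7, and the explicit `decode("ascii")` call stands in for what Python 2 does behind the scenes when a byte string meets a unicode string.

```python
# UTF-8 bytes of a Chinese character; the first byte is 0xe7.
raw = u"\u7b80".encode("utf-8")

try:
    # Python 2 performs the equivalent of this call implicitly
    # when a byte string is compared with a unicode string.
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
```

Python 3 does not attempt this conversion at all; mixing `bytes` and `str` in an `in` test raises a TypeError there instead.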

The solution is to always compare byte strings with byte strings:

doc = open("1.html").read().strip()
# Plain (non-u) literals are byte strings in Python 2, so no implicit decoding happens.
if "charset=utf" in doc or "charset=\"utf" in doc:
    doc = doc.decode("utf-8", "ignore")
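
For code that has to behave the same on Python 2.6+ and Python 3, the same idea can be sketched with explicit `b""` literals. The helper name `decode_if_utf8` is hypothetical and not part of the original answer; it takes bytes that have already been read from the file in binary mode.

```python
def decode_if_utf8(raw):
    """Decode raw bytes as UTF-8 only if they declare a UTF charset.

    Both sides of each `in` test are byte strings, so Python never
    needs to convert anything implicitly.
    """
    if b"charset=utf" in raw or b'charset="utf' in raw:
        return raw.decode("utf-8", "ignore")
    # No UTF charset declaration found: leave the bytes untouched.
    return raw


sample = u'<meta charset="utf-8">\u4e2d\u6587'.encode("utf-8")
doc = decode_if_utf8(sample)
```

As in the original snippet, the result stays a byte string when no charset declaration is found, so callers still need to handle both types.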
