Confused by Python's unicode regex error


Can someone explain why the middle excerpt of this code throws an error in Python 2.7.x?

import re
walden = "Waldenström"
walden
print(walden)

s1 = "ö"
s2 = "Wal"
s3 = "OOOOO"

out = re.sub(s1, s3, walden)
print(out)

out = re.sub("W", "w", walden)
print(out)

# I need this one to work
out = re.sub('W', u'w', walden)
# ERROR

out = re.sub(u'W', 'w', walden)
print(out)

out = re.sub(s2, s1, walden)
print(out)

I am confused, and I have already tried reading the manual.

walden is a str:

walden = "Waldenström"
This code replaces a character with a unicode string:

re.sub('W', u'w', walden)
The result should be u'w' + "aldenström". This is the part that fails.

To concatenate a str and a unicode, both must first be converted to unicode. The result is then unicode as well.

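The problem is that the interpreter does not know how to convert 'ö' to unicode, because it does not know which encoding was used. Without that information the result would be ambiguous, so Python 2 falls back to the ASCII codec for the implicit conversion. The bytes that make up 'ö' are not valid ASCII, and the conversion raises a UnicodeDecodeError.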
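
The failing step can be reproduced in isolation. This is a minimal sketch, assuming the source file was saved as UTF-8, so that 'ö' is stored as the two bytes 0xC3 0xB6. The names walden_bytes, walden_text and ascii_failed are illustrative and not part of the question. The explicit decode("ascii") call stands in for the implicit conversion Python 2 performs, and the snippet is written so that it runs on both Python 2.7 and Python 3.

```python
# Bytes a UTF-8 encoded source file would hold for "Waldenström"
# (assumption: the file was saved as UTF-8).
walden_bytes = b"Waldenstr\xc3\xb6m"

# Python 2 decodes str implicitly with the ASCII codec. Doing the same
# thing explicitly shows the failure: 0xC3 is not a valid ASCII byte.
try:
    walden_bytes.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Decoding with the codec that matches the file succeeds.
walden_text = walden_bytes.decode("utf-8")

print(ascii_failed)
print(len(walden_bytes))  # byte count: 'ö' takes two bytes in UTF-8
print(len(walden_text))   # character count after decoding
```

The difference between the two lengths is why the interpreter cannot guess: the same bytes mean different text under different encodings.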

The solution is to do the conversion yourself before doing the replacement:

re.sub('W', u'w', unicode(walden, encoding))
Here encoding should be the encoding you used when creating the file, for example:

re.sub('W', u'w', unicode(walden, 'utf-8'))
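
The same idea can be wrapped in a small helper. This is only a sketch under stated assumptions, not the answerer's code: sub_decoded is a hypothetical name, UTF-8 is assumed as the default encoding, and bytes.decode is used instead of unicode(...) so the snippet runs on both Python 2.7 and Python 3.

```python
import re


def sub_decoded(pattern, repl, raw, encoding="utf-8"):
    """Decode raw bytes to text first, then run re.sub on text only.

    sub_decoded is a hypothetical helper; encoding must match the
    encoding the bytes were actually written in (UTF-8 is assumed).
    """
    text = raw.decode(encoding)
    return re.sub(pattern, repl, text)


# Byte string as it would appear in a UTF-8 source file (assumption).
walden = b"Waldenstr\xc3\xb6m"

# Pattern, replacement and subject are all text, so nothing is
# converted implicitly and no UnicodeDecodeError can occur.
out = sub_decoded(u"W", u"w", walden)

# Encode back to bytes only at the boundary, e.g. when writing a file.
out_bytes = out.encode("utf-8")

print(out_bytes == b"waldenstr\xc3\xb6m")
```

Keeping everything as text inside the program and converting only at the input and output boundaries avoids mixing str and unicode in the first place.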

Thank you very much. This clears up a lot, and now the documentation makes more sense too!