Replace non-ASCII characters in Python, e.g. ' vs. ’

python python-2.7 unicode


I want you’ll to become you ll, not youll. This is what I am doing:

>>> clean = "you'll"
>>> import string
>>> clean = filter(lambda x: x in string.printable, clean)
>>> print clean
you'll

>>> clean = "you’ll" 
>>> clean = filter(lambda x: x in string.printable, clean)
>>> print clean
youll

This is what I tried:

>>> clean = "you'll"
>>> clean =clean.replace('\'',' ')
>>> print clean
you ll
>>> clean = "you’ll"
>>> clean =clean.replace('’',' ')
>>> print clean
you ll

This works fine in the interpreter, but when I put it in a script I get:

SyntaxError: Non-ASCII character '\xe2' in file sc.py on line 177, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

So I added this at the very top of the script:

# -*- coding: utf-8 -*-

But then I get:

clean =clean.replace('’',' ')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)

I'm a bit lost.

You can use replace() to replace the apostrophe with a space, like this:

print "you'll".replace("'", " ")

which prints

you ll
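Note that the one-liner above only touches the ASCII apostrophe, not the typographic one (U+2019) from the question. A minimal sketch that handles both, assuming the input is already a Unicode string; the name `replace_apostrophes` and the `APOSTROPHES` tuple are made up for illustration:

```python
# Hypothetical helper: replace both apostrophe variants with a space.
# Assumes `text` is a Unicode string (unicode in Python 2, str in Python 3).
APOSTROPHES = (u"'", u"\u2019")  # ASCII apostrophe, right single quotation mark


def replace_apostrophes(text, replacement=u" "):
    """Return `text` with every apostrophe variant replaced by `replacement`."""
    for mark in APOSTROPHES:
        text = text.replace(mark, replacement)
    return text
```

Writing the curly quote as the escape `\u2019` keeps the source file pure ASCII, so this particular snippet does not depend on an encoding declaration.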


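This may not be the best answer, but a simple workaround is to just handle the exception:

clean2 = ""
for ch in clean:
    try:
        # Replace the ASCII apostrophe with a space; keep every other character.
        clean2 += " " if ch == "'" else ch
    except UnicodeDecodeError:
        clean2 += 'vs.'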
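A character-by-character pass can also be written without relying on exceptions. The following is only a sketch of that idea, not the answerer's method: it keeps printable ASCII characters and substitutes a space for everything else. The function name `clean_chars` and the `fill` parameter are hypothetical, and byte input is assumed to be UTF-8.

```python
import string


def clean_chars(text, fill=u" "):
    """Keep printable ASCII characters; replace every other character with `fill`."""
    # Assumption: byte strings are UTF-8 encoded, so decode them first.
    if isinstance(text, bytes):
        text = text.decode("utf-8")
    allowed = set(string.printable)
    return u"".join(ch if ch in allowed else fill for ch in text)
```

Unlike the filter() call in the question, which drops the curly apostrophe and yields youll, this substitutes a space for it, so you’ll becomes you ll. The ASCII apostrophe is printable and is left untouched.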


You need to decode the string:

# -*- coding: utf-8 -*-
clean = "you’ll".decode('utf-8')
clean = clean.replace('’'.decode('utf-8'),' ')
print clean

This prints

you ll

as expected.
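For context: in Python 2 the UnicodeDecodeError typically appears when a unicode object and a non-ASCII byte string are mixed, because the byte string is implicitly decoded as ASCII. One way to avoid that is to decode once at the boundary and use Unicode literals afterwards. A minimal sketch under that assumption; `to_text` and `replace_curly_apostrophe` are illustrative names, and UTF-8 is assumed as the input encoding:

```python
RIGHT_SINGLE_QUOTE = u"\u2019"  # the typographic apostrophe


def to_text(value, encoding="utf-8"):
    """Return `value` as a Unicode string, decoding byte strings if needed."""
    # Assumption: byte input uses `encoding` (UTF-8 unless stated otherwise).
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def replace_curly_apostrophe(value, replacement=u" "):
    """Replace U+2019 with `replacement`, accepting bytes or Unicode input."""
    return to_text(value).replace(RIGHT_SINGLE_QUOTE, replacement)
```

Because both operands of replace() are Unicode here, no implicit ASCII decoding takes place, whether the input was read as bytes or was already decoded.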


I'm on 2.7.6 (default, Sep 9 2014, 15:04:36) and still get

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)

It works from the terminal (as shown), but it does not work in my script!

Your question made me realize my system is a mess: it took me 20 minutes to find a simple file! Luckily, I had not deleted it.

Why such a roundabout approach, bro? Clever, though!