Python 2.7: Unsupported characters in input (Python 2.7.9)

python-2.7 utf-8

A small newbie question. I'm trying to write a small function that randomly shuffles the contents of a piece of text.

#-*- coding: utf-8 -*-
import random

def glitch(text):
    new_text = ['']
    for x in text:
        new_text.append(x)
        random.shuffle(new_text)
    return ''.join(new_text)

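As you can see, it is very simple: given a plain string such as "How are you?", the output is a randomly shuffled sentence. However, when I try to paste in something like this: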

print glitch('Iàäï†n$§&0ñŒ≥Q¶µù`o¢y”—œº')

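...Python 2.7.9 returns "Unsupported characters in input". I have looked around the forums and tried the things I could understand, since I'm still new to coding in general, but to no avail.

Any suggestions?

Thanks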

#-*- coding: utf-8 -*-
import random

def glitch(text):

    new_text = ['']
    for x in text:
        new_text.append(x)
        random.shuffle(new_text)
    return ''.join(new_text)

print (glitch(u'Iàäï†n$§&0ñŒ≥Q¶µù`o¢y”—œº'))

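From a quick Google search of my own, I found that this should work: you have to prefix the string literal with the letter "u" to mark the text that follows as unicode.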
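To see what the u prefix changes, it helps to compare the same text as Unicode characters and as UTF-8 bytes. The snippet below is only an illustrative sketch: the variable names are mine, escape sequences are used so the result does not depend on how the file is saved, and reversing the bytes stands in for an unlucky shuffle so the outcome is the same on every run.

```python
# Illustrative sketch; names are hypothetical, not taken from the question.
text = u'I\xe0\xe4\xef'       # u'Iàäï' as a unicode string: 4 characters
raw = text.encode('utf-8')    # the same text as a UTF-8 byte string

n_chars = len(text)           # counts characters
n_bytes = len(raw)            # counts bytes; each accented letter takes 2

# Reordering bytes can split a multi-byte character apart. Reversal is
# used instead of random.shuffle so the result is deterministic.
try:
    raw[::-1].decode('utf-8')
    bytes_survived = True
except UnicodeDecodeError:
    bytes_survived = False

# Reordering characters of a unicode string is always safe.
reversed_text = text[::-1]

print(n_chars)
print(n_bytes)
print(bytes_survived)
```

Iterating over a byte string hands the shuffle individual bytes, while iterating over a unicode string hands it whole characters, which is why the prefixed version can be joined back together safely.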

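Your problem is with Python 2.x in general, not with a specific version of Python 2. Python 2.x uses ascii rather than Unicode as its default encoding (this changed in Python 3), and your string is (likely) encoded as utf-8. See below: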

import chardet
text = 'Iàäï†n$§&0ñŒ≥Q¶µù`o¢y”—œº'
print chardet.detect(text)['encoding'] # prints utf-8

If you download Python 3.x, your problem will probably go away.

If you're interested (or for the benefit of future 2.x users), you can do the following:

import random

def glitch(text):
    new_text = []
    for x in text:
        new_text.append(x)
    random.shuffle(new_text) #note you should just shuffle once - not every iteration.
    new_line = ''.join(new_text) # this line is where your encoding moves from `utf-8` to `ascii`
    # this becomes `ascii` because of the empty string you use to join your list.  it defaults to `ascii`
    # if you tried to make it `unicode` by doing `u''.join(list)` you would get a `UnicodeDecodeError`
    return new_line.decode("ascii", "ignore").encode("utf-8") # note the "ignore" argument: it bypasses decoding errors by dropping the offending bytes.
    # now your code will run and return a string of utf-8 characters 
    # (to which we encode new_line, and which is the default encoding of a string anytime you `decode()` it.)
    # note that you will return a shorter string, because (again) `ascii` can only represent
    # 128 characters (byte values 0-127), whereas the non-ASCII part of your `utf-8` string
    # is stored as bytes in the range 128-255, which "ignore" discards.

I hope this helps and makes sense. There is plenty of material online discussing this (including several questions of my own :)
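For comparison, here is a minimal sketch of a variant that shuffles characters rather than bytes, so nothing has to be discarded. The name glitch_unicode and its encoding parameter are hypothetical, and treating byte-string input as UTF-8 is an assumption about the data. The last lines show how much decode("ascii", "ignore") drops from part of the question's string.

```python
import random

def glitch_unicode(text, encoding='utf-8'):
    """Return the characters of `text` in random order, dropping none.

    Hypothetical helper: byte strings are decoded first, and UTF-8 is
    only an assumption about how they were encoded.
    """
    if isinstance(text, bytes):       # `bytes` is an alias of `str` on Python 2
        text = text.decode(encoding)
    chars = list(text)
    random.shuffle(chars)             # shuffle once, in place
    return u''.join(chars)

sample = u'I\xe0\xe4\xef\u2020n$\xa7&0\xf1'    # u'Iàäï†n$§&0ñ'
shuffled = glitch_unicode(sample)

# The "ignore" approach keeps only the ASCII characters.
lossy = sample.encode('utf-8').decode('ascii', 'ignore')

print(len(sample))
print(len(shuffled))
print(len(lossy))
```

The shuffled result has the same length as the input, whereas the lossy version keeps only the five ASCII characters.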

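Works fine for me in 2.7.5, both when printing from a script and after importing it in the console.

Is it possible that I'm missing some preference, or that I have to download some package to be able to use this kind of input? I'm on Mac OS X 10.10.1. I have tried switching the preference between its three options (locale, utf-8, none) several times, but nothing seems to have any effect.

Thanks, but putting the u in front doesn't seem to make any difference.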
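When the same script behaves differently on two setups, one thing that may help is printing the encodings Python is actually using in each environment. This is only a diagnostic sketch: it changes no settings, the variable names are mine, and whether these values explain the error above is not established in this thread.

```python
import locale
import sys

# Default codec for implicit byte/text conversions ('ascii' on Python 2).
default_encoding = sys.getdefaultencoding()

# Encoding of the output stream; can be None when output is piped on
# Python 2, and some replacement streams lack the attribute entirely.
stdout_encoding = getattr(sys.stdout, 'encoding', None)

# Encoding suggested by the user's locale settings.
locale_encoding = locale.getpreferredencoding()

print(default_encoding)
print(stdout_encoding)
print(locale_encoding)
```

Running it both from a script and from the interactive console makes it easy to compare the two.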

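print chardet.detect(text)['encoding'] shows "utf-8" because your text string was saved and encoded as utf-8. If I encode the file differently, I can make this show windows-1252. I rarely downvote, but I am doing so here because your answer is too misleading for other users. Python 2.x is just as capable when it comes to Unicode support. Your glitch() method contains many fallacies that will prevent users from properly understanding Unicode support.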
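The point about the file's encoding can be sketched without chardet. In the example below, the same four characters produce different bytes depending on the codec used to store them; the variable names are mine, and cp1252 is Python's codec name for windows-1252.

```python
text = u'\xe0\xe4\xef\u2020'          # u'àäï†'

as_utf8 = text.encode('utf-8')        # 2 + 2 + 2 + 3 bytes
as_cp1252 = text.encode('cp1252')     # 1 byte per character

# Decoding with the wrong codec does not fail here; it silently
# produces different text.
misread = as_utf8.decode('cp1252')

print(len(as_utf8))
print(len(as_cp1252))
print(misread == text)
```

A detector such as chardet only sees the bytes, so it reports whichever encoding the bytes happen to match, not a property of Python itself.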