Python UnicodeEncodeError: 'ascii' codec can't encode character u'\ufe0f', zip function

I have the following code. It works on Python 3.5, but when I try to run it with Python 2.7 it raises an error.

The code is as follows:

import codecs
import csv

import numpy as np

def load_data_and_labels():
    # Load data from files (clean_str is a helper that is not shown in the question)
    with codecs.open('./data/train.txt', encoding="utf8") as inf:
        reader = csv.reader(inf, delimiter='\t', quoting=csv.QUOTE_NONE)
        col = list(zip(*reader))  # <--- The error appeared here.
        x_text = col[2]
        colY = col[1]
    # Split by words
    x_text = [clean_str(sent) for sent in x_text]
    x_text = [s.split(" ") for s in x_text]
    # Generate labels
    y = [[1, 0] if int(x) == 1 else [0, 1] for x in colY]
    y = np.array(y)
    return [x_text, y]

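If you take the time to do a quick search on the differences between Python 2 and Python 3, you will see that one of the biggest changes is Unicode support: in Python 3, strings are Unicode by default.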

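So if you have a file containing Unicode characters and you try to get a byte-string representation of those characters in Python 2 without taking special care, it will fail, because the default conversion uses plain ASCII.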
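As a minimal sketch of that failure, the snippet below encodes a string containing U+FE0F (the character named in the error message) as ASCII. The sample text is made up for illustration and is not taken from the asker's file. In Python 2 the same exception is raised whenever a `unicode` object is implicitly converted to `str`.

```python
# Hypothetical sample text; U+FE0F is the variation selector from the error message.
text = u"Winter Weather \ufe0f"

try:
    # ASCII cannot represent U+FE0F, so this raises UnicodeEncodeError.
    text.encode("ascii")
except UnicodeEncodeError as exc:
    error_message = str(exc)
    print(error_message)

# Encoding explicitly as UTF-8 succeeds and round-trips without loss.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes.decode("utf-8") == text)
```

The explicit `encode("utf-8")` call is the "special care" mentioned above: you choose the codec yourself instead of relying on the ASCII default.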

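Combine that with the fact that, to quote the documentation of the csv module, "The csv module doesn't directly support reading and writing Unicode", and you will understand why this does not work.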
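One possible workaround, sketched under assumptions rather than taken from the original answer: in Python 2, give `csv.reader` raw bytes and decode each cell afterwards; in Python 3, let `io.open` do the decoding. The function name `read_tsv_rows` and its `encoding` parameter are hypothetical names introduced for this example.

```python
import csv
import io
import sys

PY2 = sys.version_info[0] == 2


def read_tsv_rows(path, encoding="utf-8"):
    """Return the rows of a tab-separated file as lists of Unicode strings."""
    if PY2:
        # Python 2: the csv module works on byte strings, so open the file
        # in binary mode and decode every cell after parsing.
        with open(path, "rb") as inf:
            reader = csv.reader(inf, delimiter="\t", quoting=csv.QUOTE_NONE)
            return [[cell.decode(encoding) for cell in row] for row in reader]
    # Python 3: the csv module works on text, so decode while reading.
    with io.open(path, encoding=encoding, newline="") as inf:
        reader = csv.reader(inf, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [row for row in reader]
```

With rows read this way, the transposition from the question becomes `col = list(zip(*read_tsv_rows('./data/train.txt')))`, and `col[2]` holds Unicode text under both interpreters.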


You can check it here:

This version of the csv module does not support Unicode input; see the note here:

3   1   Hey there! Nice to see you Minnesota/ND Winter Weather 
4   0   3 episodes left I'm dying over here
5   1   "I can't breathe!" was chosen as the most notable quote of the year