unicode, not str in Python (Python 2.7)

python python-2.7

I tried to run the following Python code:

with io.open(outfile, 'w' ) as processed_text, io.open(infile, 'r') as fin:
    for line in fin:
        processed_text.write(preprocess(line.rstrip())+'\n')

but got:

TypeError: must be unicode, not str

How can I fix this? I searched here for similar questions and found one that suggested trying:

with io.open(outfile, 'w', encoding="utf-8") as processed_text, io.open(infile, 'r') as fin:

But that did not work.

Try putting a u prefix in front of the strings you process, e.g. [u'blah'].

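The Python 2 documentation for the io module says: "Since this module has been designed primarily for Python 3.x, you have to be aware that all uses of 'bytes' in this document refer to the str type (of which bytes is an alias), and all uses of 'text' refer to the unicode type. Furthermore, those two types are not interchangeable in the io APIs."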
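The restriction can be seen without touching the filesystem. The sketch below uses io.StringIO as a stand-in for a text-mode file opened with io.open (an assumption made for brevity; both are text streams), and the helper name try_write is hypothetical.

```python
import io

def try_write(value):
    """Write value to an in-memory text stream; return the error name or None."""
    stream = io.StringIO()
    try:
        stream.write(value)
    except TypeError as exc:
        return type(exc).__name__
    return None

# A byte string (str on Python 2, bytes on Python 3) is rejected.
bytes_result = try_write(b"blah")
# A text string (unicode on Python 2, str on Python 3) is accepted.
text_result = try_write(u"blah")
```

The same split between byte strings and text strings is what triggers the error in the question when the value passed to write() is not unicode.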

Try putting this at the very top of your file:

from __future__ import unicode_literals

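Python 3.x uses unicode strings by default. This import makes string literals in a Python 2.x module behave the same way. Note that it affects only literals written in that file, not strings returned by other functions.

If the problem persists, you can convert the offending string manually: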

uni_string = unicode(my_string)
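One caveat with the manual conversion: on Python 2, unicode(my_string) without an encoding argument decodes with the default encoding, normally ASCII, so it fails on non-ASCII bytes. A minimal sketch of an explicit conversion follows; the helper name to_text and the choice of UTF-8 are assumptions about the data, not something stated in the question.

```python
def to_text(value, encoding="utf-8"):
    """Return value as a text string, decoding byte strings explicitly."""
    # bytes is an alias of str on Python 2, so this check works on 2.7 and 3.x.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

raw = b"caf\xc3\xa9"      # UTF-8 encoded bytes containing a non-ASCII character
text = to_text(raw)       # decoded with UTF-8, the assumed encoding

try:
    raw.decode("ascii")   # what an ASCII-only conversion would attempt
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

Passing the encoding explicitly keeps the behaviour independent of the interpreter's default encoding.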

When you open a file with io.open, make sure you write unicode strings. Something like this should do it:

with io.open(outfile, 'w' ) as processed_text, io.open(infile, 'r') as fin:
    for line in fin:
        s = preprocess(line.rstrip())
        if isinstance(s, str):
            s = s.decode('utf8')
        processed_text.write(s + u'\n')

Alternatively, modify preprocess so that it returns unicode strings.
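Wrapping preprocess is one way to guarantee the return type without editing its body. The following is a sketch only: preprocess here is a hypothetical stand-in (the real function is not shown in the question) that returns a byte string, and ensure_text is an illustrative decorator name. UTF-8 is assumed throughout.

```python
import functools
import io
import os
import tempfile

def ensure_text(func, encoding="utf-8"):
    """Wrap func so that a byte-string result is decoded to text."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        if isinstance(result, bytes):   # bytes is str on Python 2
            result = result.decode(encoding)
        return result
    return wrapper

@ensure_text
def preprocess(line):
    # Hypothetical stand-in: lower-cases the line and returns bytes,
    # imitating a function that hands back a Python 2 str.
    return line.lower().encode("utf-8")

workdir = tempfile.mkdtemp()
infile = os.path.join(workdir, "in.txt")
outfile = os.path.join(workdir, "out.txt")

# Create a small input file so the example is self-contained.
with io.open(infile, "w", encoding="utf-8") as f:
    f.write(u"Hello\nWORLD\n")

with io.open(outfile, "w", encoding="utf-8") as processed_text, \
        io.open(infile, "r", encoding="utf-8") as fin:
    for line in fin:
        processed_text.write(preprocess(line.rstrip()) + u"\n")

with io.open(outfile, "r", encoding="utf-8") as f:
    result = f.read()
```

With the decorator in place, the write loop from the question can stay as it was, apart from using u'\n' for the line ending.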

Thanks for your reply. Where should I write the u? In the file I want to process?

Please edit your question to include sample files (outfile and infile).

It looks like your preprocess function returns a str rather than unicode.

I edited the post to include preprocess.

It is still unclear: what does tokenize return? The bottom line is that when you use io.open you need to write unicode strings; if preprocess returns a str, just add .decode('utf8') to convert it to unicode.

Thanks for your reply. I tried your command, but the problem persists. Do you mean that upgrading my Python version would resolve this error?

str is bytes in Python 2 and unicode in Python 3. Evidently your preprocess() does something to the input that turns it into bytes, hence the error. Did you try my other suggestions?

Where does it fail? Find the input string that causes the problem and convert it to unicode, like this: unicode(mystring)
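To see where the failure originates, it can help to check the type that preprocess returns for each line before writing anything. A rough sketch, assuming the preprocessing function is passed in as an argument; find_non_text and sample_preprocess are illustrative names, not part of the original code.

```python
def find_non_text(lines, preprocess):
    """Return (line_number, type_name) pairs for results that are not text."""
    text_type = type(u"")   # unicode on Python 2, str on Python 3
    problems = []
    for number, line in enumerate(lines, 1):
        result = preprocess(line.rstrip())
        if not isinstance(result, text_type):
            problems.append((number, type(result).__name__))
    return problems

def sample_preprocess(line):
    # Hypothetical: returns a byte string only for lines containing a digit.
    if any(ch.isdigit() for ch in line):
        return line.encode("utf-8")
    return line

report = find_non_text([u"alpha\n", u"beta 2\n", u"gamma\n"], sample_preprocess)
```

Each entry in report gives the line number and the type name of a result that would make write() fail on a file opened with io.open.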