Using the unicode() and encode() functions in Python

python string sqlite unicode encoding

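I have a problem encoding the path variable and inserting it into an SQLite database. I tried to solve it with the encode("utf-8") function, but that did not help. Then I used the unicode() function, which gave me the unicode type.

I had forgotten to change the encoding of the fullFilePath variable, which has the same problem, but now I am confused. Should I use only unicode(), only encode("utf-8"), or both?

I cannot use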

fullFilePath = unicode(fullFilePath.encode("utf-8"))

because it raises this error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 32: ordinal not in range(128)

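The Python version is 2.7.2.

str is a text representation in bytes; unicode is a text representation in characters.

You decode text from bytes to unicode, and you encode unicode into bytes with some encoding.

That is: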

>>> 'abc'.decode('utf-8')  # str to unicode
u'abc'
>>> u'abc'.encode('utf-8') # unicode to str
'abc'

UPD Sep 2020: this answer was written when Python 2 was mostly used. In Python 3, str was renamed to bytes, and unicode was renamed to str.

>>> b'abc'.decode('utf-8') # bytes to str
'abc'
>>> 'abc'.encode('utf-8')  # str to bytes
b'abc'
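
The ASCII-only 'abc' examples hide the difference between characters and bytes, so here is a minimal Python 3 sketch using the two characters mentioned in the comments (\u0161 and \u0165). The variable names are illustrative only and do not come from the question.

```python
# Python 3: str holds characters, bytes holds encoded data.
text = "\u0161\u0165"               # two characters: s-caron and t-caron

utf8 = text.encode("utf-8")         # str -> bytes
utf16 = text.encode("utf-16-le")    # same characters, different bytes

print(len(text), len(utf8))         # number of characters vs. number of bytes
print(utf8)

# Decoding with the encoding that produced the bytes restores the text.
assert utf8.decode("utf-8") == text
assert utf16.decode("utf-16-le") == text
```

The same text produces different byte sequences under different encodings, which is why bytes read from a file or database can only be decoded correctly once their encoding is known.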

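You are using encode("utf-8") incorrectly. Python byte strings (the str type) have an encoding; Unicode strings do not. You can convert a Unicode string to a Python byte string with uni.encode(encoding), and you can convert a byte string to a Unicode string with s.decode(encoding) (or, equivalently, unicode(s, encoding)).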
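
In Python 2, calling encode() on a byte string first decodes it implicitly with the default ASCII codec, and unicode(s) without an encoding argument does the same, which is where the UnicodeDecodeError in the question comes from. The sketch below reproduces that hidden step explicitly in Python 3; the path "cesta/\u0161" is a made-up example, not the asker's data.

```python
# A hypothetical UTF-8 encoded path containing a non-ASCII character.
raw = "cesta/\u0161".encode("utf-8")    # the last character becomes bytes 0xc5 0xa1

error = None
try:
    # Python 2 performs this ASCII decode implicitly in unicode(raw)
    # and in raw.encode("utf-8"); here it is spelled out.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc
    print(exc)

# Naming the real encoding is what makes the conversion succeed.
fixed = raw.decode("utf-8")
print(fixed == "cesta/\u0161")
```

The failing byte is 0xc5, the first byte of the UTF-8 encoding of \u0161, which matches the byte reported in the question's error message.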

If fullFilePath and path are currently of type str, you should find out how they are encoded. For example, if the current encoding is utf-8, you would use:

path = path.decode('utf-8')
fullFilePath = fullFilePath.decode('utf-8')
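
Decoding a value that is already text is a mistake as well: in Python 2 it triggers an implicit ASCII encode and a UnicodeEncodeError like the one reported in the comments. One possible guard is to decode only values that are still bytes. The helper name to_text is hypothetical, and the sketch is written for Python 3, where the two types are bytes and str.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding only when it is still bytes.

    The default encoding is an assumption; pass the encoding the
    data was actually written with.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Bytes are decoded; text passes through unchanged.
print(to_text(b"\xc5\xa1") == "\u0161")
print(to_text("\u0161") == "\u0161")
```

Normalizing every input this way at the boundary of the program keeps str and unicode values from being mixed later, for example when a path from the filesystem is joined with a string read from the database.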

If this does not fix it, the actual issue may be that you are not using a Unicode string in your execute() call; try changing it to the following:

cur.execute(u"update docs set path = :fullFilePath where path = :path", locals())
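
To try the named-parameter form in isolation, here is a self-contained sketch against an in-memory SQLite database, written for Python 3. The docs table and path column follow the query above; the sample paths are invented, and an explicit dict is passed instead of locals() so that it is clear which values get bound.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table docs (path text)")
cur.execute("insert into docs (path) values (?)", ("old/\u0161",))

# Named placeholders (:name) are looked up by key in the mapping.
params = {"path": "old/\u0161", "fullFilePath": "/home/user/old/\u0161"}
cur.execute("update docs set path = :fullFilePath where path = :path", params)
conn.commit()

rows = cur.execute("select path from docs").fetchall()
print(rows)
conn.close()
```

Binding values as parameters rather than formatting them into the SQL text lets the sqlite3 module handle the text encoding itself.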

Make sure your locale settings are correct before running the script from the shell, e.g.

$ locale -a | grep "^en_.\+UTF-8"
en_GB.UTF-8
en_US.UTF-8
$ export LC_ALL=en_GB.UTF-8
$ export LANG=en_GB.UTF-8

Docs: man locale, man setlocale
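
To see which encodings a running interpreter actually picked up from those settings, a short check along these lines may help. This is a Python 3 sketch; the values printed depend entirely on the machine and environment.

```python
import locale
import sys

# Encoding implied by the current locale settings (LC_ALL, LANG, ...).
preferred = locale.getpreferredencoding(False)

# Encoding used to convert file names to and from bytes.
fs_encoding = sys.getfilesystemencoding()

# Encoding used when printing; may be None if stdout was replaced.
stdout_encoding = getattr(sys.stdout, "encoding", None)

print("preferred:", preferred)
print("filesystem:", fs_encoding)
print("stdout:", stdout_encoding)
```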

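Where is the code that raises the error?

Your exact question has already been answered: [1]

@newtover I edited the question.

Did you convert both of the variables you use to unicode?

Learning how Python 3 handles text and data helped me understand everything. It was then easy to apply that knowledge to Python 2.

The statement fullFilePath = fullFilePath.decode("utf-8") still raises the error UnicodeEncodeError: 'ascii' codec can't encode characters in position 32-34: ordinal not in range(128). fullFilePath is a combination of a str and a string taken from a text column of a db table, which should be utf-8 encoded. According to [...], though, it could be utf-8, utf-16BE or utf-16LE. Can I find out which?

@xralf, if you are combining different str objects, you may be mixing encodings. Can you show the result of print repr(fullFilePath)?

I can only show it before calling decode(). The problematic characters are \u0161 and \u0165.

@xralf - so it is already unicode? Try changing the execute call to use a unicode string: cur.execute(u"update docs set path = :fullFilePath where path = :path", locals())

Very good answer, straight to the point. I would add that unicode is about letters or symbols, or more generally runes, whereas str represents a byte string in a particular encoding, which you must decode (with the correct encoding, obviously) to get the specific runes.

Python 3.8 > 'str' object has no attribute 'decode'

Do you have documentation for the change from unicode to str? I can't find it.

@cikatomo This is one of the key changes in Python 3: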