Python: Why do I get a KeyError after using difflib on unicode strings?


python python-2.7 dictionary unicode


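I am trying to use difflib to compare words and sentences (in this case, something like a dictionary). When I try to look up the difflib output as a key in a dictionary, I get a KeyError. Can someone explain why this happens? Everything works fine when I don't use difflib.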

# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import difflib
import operator

lst = ['król']
word = 'król'

dct = {}
for order in lst:
    word_match_ratio = difflib.SequenceMatcher(None, word, order).ratio()

    dct[order] = word_match_ratio
    print order
    print('%s %s' % (order, word_match_ratio))


sorted_matching_words = sorted(dct.items(), key=operator.itemgetter(1))
sorted_matching_words = str(sorted_matching_words.pop()[:1])
x = len(sorted_matching_words) - 3
word = sorted_matching_words[3:x]

print word


def translate(someword):
    someword = trans_dct[someword]
    print(someword)
    return someword

trans_dct = {
    "król": 'king'
}
print trans_dct
word = translate(word)

Expected output: king

Instead, I get:

Traceback (most recent call last):
  File "D:/Python/Testing stuff.py", line 64, in <module>
    word = translate(word)
  File "D:/Python/Playground/Testing stuff.py", line 56, in translate
    someword = trans_dct[someword]
KeyError: 'kr\\xf3l'


The problem is not in difflib, but in how word is extracted:

sorted_matching_words = sorted(dct.items(), key=operator.itemgetter(1))
# sorted_matching_words == [(u'kr\xf3l', 1.0)]

sorted_matching_words = str(sorted_matching_words.pop()[:1])
# sorted_matching_words = "(u'kr\\xf3l',)"

x = len(sorted_matching_words) - 3
word = sorted_matching_words[3:x]
# word = 'kr\\xf3l'

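You should not convert sorted_matching_words with str(), because the popped item is a tuple. Each tuple element is converted to a string with its __repr__ method, which is why ó comes out as the escape \xf3 with a literal backslash. You only need to take the first element of the tuple: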
In [34]: translate(sorted_matching_words[-1][0])
king
Out[34]: u'king'
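To make the escaping visible, here is a minimal sketch of what the str() plus slicing step produces. It is written for Python 3, where repr() no longer escapes non-ASCII characters, so it uses the built-in ascii() as a stand-in for Python 2.7's repr() (an assumption made for illustration; the output also lacks the u prefix that 2.7 adds, which is why the slice starts at 2 rather than 3). The names best, as_text and sliced are hypothetical.

```python
# Python 3 sketch. ascii() escapes non-ASCII characters much like
# Python 2.7's repr() did for unicode strings (without the u prefix).
trans_dct = {'król': 'king'}

# Stands in for the (word, ratio) pair returned by sorted_matching_words.pop().
best = ('król', 1.0)

# Textual form of the 1-tuple: the characters  ('kr\xf3l',)
as_text = ascii(best[:1])

# Cutting off the parentheses and quotes leaves the escape sequence in place,
# so the result is 7 characters long instead of 4.
sliced = as_text[2:-3]

print(sliced)                  # kr\xf3l
print(sliced in trans_dct)     # False
print(trans_dct[best[0]])      # king
```

The sliced text only looks like the key. It contains a backslash, an x, an f and a 3 where the key has the single character ó, so the dictionary lookup cannot succeed.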


Do you possibly need to add u'…' to the beginning of the unicode string?
@MarkTolonen assert repr('kr\xf3l') == 'kr\\xf3l'
@TadhgMcDonald-Jensen, no, the OP's trick of using str() to extract the word from sorted_matching_words is the wrong thing to do there.
Specifically, change sorted_matching_words = str(sorted_matching_words.pop()[:1]) and the next two lines to just word = sorted_matching_words.pop()[0], instead of cutting off the tuple's parentheses.
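Applied to the question's script, the suggested change might look like the sketch below. It keeps the names from the question; best_word, best_ratio and translation are hypothetical names added for illustration. The code is written to run on Python 3. On Python 2.7 it should behave the same if the question's coding line and from __future__ import unicode_literals are kept at the top of the file.

```python
import difflib
import operator

lst = ['król']
word = 'król'

# Map each candidate to its similarity ratio against the query word.
dct = {}
for order in lst:
    dct[order] = difflib.SequenceMatcher(None, word, order).ratio()

# Sort the (candidate, ratio) pairs by ratio; the best match is the last pair.
sorted_matching_words = sorted(dct.items(), key=operator.itemgetter(1))

# Unpack the pair directly instead of converting it to text and slicing.
best_word, best_ratio = sorted_matching_words.pop()

trans_dct = {'król': 'king'}


def translate(someword):
    # The key is still the original string object, so the lookup succeeds.
    return trans_dct[someword]


translation = translate(best_word)
print(translation)  # king
```

If only the single best candidate is needed, max(dct, key=dct.get) would also return a key with the highest ratio without sorting the whole dictionary.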