Why isn't Python replacing the words from the dictionary correctly?


In my code, I replace the words in a sentence with their positions. It works, but the replacement is not correct. This is my code:

firstsentence=("an eye for an eye a tooth for a tooth")

def replace_all(firstsentence, stuff):
    for i, j in stuff.items():
        firstsentence = firstsentence.replace(i, j)
    return firstsentence
stuff = {"a": "1", "eye": "2", "for":"3", "tooth": "5", "an": "6"}
test=replace_all(firstsentence, stuff)
list(firstsentence)
list(test)




appendFile=open("task2.txt", "a")
appendFile.write(firstsentence+"\n")
appendFile.write(test+"\n")
appendFile.close()

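This is the output:

an eye for an eye a tooth for a tooth
1n 2 3 1n 2 1 5 3 1 5

It replaces the "a" in the word "an" with 1, ignoring the fact that "an" is a whole word. Why does it do this?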

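'an'.replace('a', '1') runs first, giving you '1n'. After that, '1n'.replace('an', '6') has nothing to replace, because the text no longer contains an.

Sort the replacements by key length so that longer matches are processed first, that is, loop over sorted(stuff.items(), key=lambda kv: len(kv[0]), reverse=True) instead of stuff.items().

The sorted() function sorts the (key, value) tuples produced by stuff.items() by the length of the key (the key lambda is passed one tuple, and kv[0] is the dictionary key). The sort order is reversed to put the longest key first. This way, all instances of an are replaced before any instance of a.

Note that this does not prevent partial replacements; if words such as animal or fortitude appear in the text, you will still see parts of them replaced. If you need to replace whole words only, either split the sentence on whitespace and look each word up in the dictionary directly, or use a regular expression with \b word-boundary anchors.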
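The whole-word option could be sketched with the standard re module roughly as follows. This is only a sketch: replace_whole_words is a hypothetical name, and it assumes every key is a plain word, so that \b can match at both ends of it.

```python
import re

def replace_whole_words(sentence, stuff):
    """Replace only whole-word occurrences of the keys in `stuff`."""
    # Longest keys first is a precaution; the \b anchors already force
    # the regex to reject "a" when it is only the start of "an".
    keys = sorted(stuff, key=len, reverse=True)
    # re.escape guards against keys that contain regex metacharacters.
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b")
    # Single pass: replaced text is never scanned again by a later key.
    return pattern.sub(lambda m: stuff[m.group(0)], sentence)

stuff = {"a": "1", "eye": "2", "for": "3", "tooth": "5", "an": "6"}
print(replace_whole_words("an eye for an eye a tooth for a tooth", stuff))
# 6 2 3 6 2 1 5 3 1 5
print(replace_whole_words("an animal for fortitude", stuff))
# 6 animal 3 fortitude
```

Because the substitution happens in one pass, the order of the dictionary no longer affects the result, and words such as animal are left alone.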
Demo of the length-sorted version:

>>> def replace_all(firstsentence, stuff):
...     for i, j in sorted(stuff.items(), key=lambda kv: len(kv[0]), reverse=True):
...         firstsentence = firstsentence.replace(i, j)
...     return firstsentence
...
>>> stuff = {"a": "1", "eye": "2", "for":"3", "tooth": "5", "an": "6"}
>>> firstsentence = "an eye for an eye a tooth for a tooth"
>>> replace_all(firstsentence, stuff)
'6 2 3 6 2 1 5 3 1 5'

First, you should tokenize the line and extract the words:

wordslist = sentence.split(' ')

Then build the set of unique words (you may want to lowercase them first):

uwords = list(set([_.lower() for _ in wordslist]))

After that, you can encode the text:

output = ' '.join([str(uwords.index(_.lower())+1) for _ in wordslist])

So the resulting function is:

def replace_all(sentence):
    wordslist = sentence.split(' ')
    uwords = list(set([_.lower() for _ in wordslist]))
    return ' '.join([str(uwords.index(_.lower())+1) for _ in wordslist])

Or, if you have a fixed word mapping, you can look each word up in that mapping instead.
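For the fixed-mapping case, a minimal sketch might look like the following. It assumes that the mapping is the stuff dictionary from the question, that words are separated by whitespace, and that words missing from the mapping should be left unchanged; encode_with_mapping is a hypothetical name.

```python
def encode_with_mapping(sentence, mapping):
    """Replace each whitespace-separated word using a fixed mapping."""
    words = sentence.split()
    # mapping.get(key, word) leaves words that are not in the mapping untouched.
    return " ".join(mapping.get(word.lower(), word) for word in words)

stuff = {"a": "1", "eye": "2", "for": "3", "tooth": "5", "an": "6"}
print(encode_with_mapping("an eye for an eye a tooth for a tooth", stuff))
# 6 2 3 6 2 1 5 3 1 5
```

Since each word is looked up as a whole, a key such as a can never match inside an or animal.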

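One caveat with the uwords approach: a set has no defined order, so the numbers it assigns are arbitrary and can differ between runs. If each word should instead be numbered by where it first appears in the sentence, a sketch along these lines may help; encode_by_first_position is a hypothetical name, and splitting on whitespace is an assumption.

```python
def encode_by_first_position(sentence):
    """Number each word by the order in which it first appears (1-based)."""
    words = [w.lower() for w in sentence.split()]
    positions = {}
    for w in words:
        # setdefault only assigns a number the first time a word is seen.
        positions.setdefault(w, len(positions) + 1)
    return " ".join(str(positions[w]) for w in words)

print(encode_by_first_position("an eye for an eye a tooth for a tooth"))
# 1 2 3 1 2 4 5 3 4 5
```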

Why do you call list(firstsentence) and list(test) but ignore the results? Both calls can be removed entirely without any effect.

It may be useful to add that 'an'.replace('a', '1') does not always run first. Because the hash parameters vary from run to run, the chance of the program returning the correct output is about 50%.
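To see why the order of the replacements matters, here is a small sketch that applies the same pairs in two explicit orders. It uses lists of tuples rather than a dictionary so that the order is fixed; replace_in_order is a hypothetical helper. Note also that dictionaries preserve insertion order since Python 3.7, so on current versions the stuff dictionary from the question would consistently process a before an.

```python
def replace_in_order(sentence, pairs):
    """Apply str.replace for each (old, new) pair, in the given order."""
    for old, new in pairs:
        sentence = sentence.replace(old, new)
    return sentence

sentence = "an eye for an eye a tooth for a tooth"

# Same replacements, different order.
short_first = [("a", "1"), ("eye", "2"), ("for", "3"), ("tooth", "5"), ("an", "6")]
long_first = [("tooth", "5"), ("eye", "2"), ("for", "3"), ("an", "6"), ("a", "1")]

print(replace_in_order(sentence, short_first))  # 1n 2 3 1n 2 1 5 3 1 5
print(replace_in_order(sentence, long_first))   # 6 2 3 6 2 1 5 3 1 5
```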