Python for loop that updates dictionary values but does not affect the output
I am trying to iterate over the sentences in a file and pick the "best" sentence, meaning the one that contains the most rare diphones (sounds). After a sentence is picked, I set the dictionary value of every diphone in that sentence to 0 so that those diphones are not picked again (I want to make sure every possible diphone gets selected).

I have written code for this, but I don't understand why it has no effect on the output, because when I check the value of one of the dictionary keys that was picked, at the start of the for loop, it is set to 0. My code is:
diphone_frequencies = {...}
diphone_frequencies_original = copy.deepcopy(diphone_frequencies)
line_score = {}
best_utts = []
for i in range(650):
    # Open the file and put all its lines in a list. Once.
    with open('recipe_diphone_utts.txt') as file:
        # Read the file lines one by one.
        for line in file:
            line = line.rstrip('\r\n')
            if line in best_utts:
                continue # Skip previously picked sentences.
            score = 0.0
            # Add a score to the line depending on its content.
            for word in line.split():
                score += float(diphone_frequencies[word])
            line_score[line] = score/len(line.split())
    # Sort each lines based on their score and get the best.
    best_sentence = max(line_score.keys(), key=(lambda k: line_score[k]))
    best_utts.append(best_sentence)
    print(best_sentence)
    # Each unique word of this iteration's best sentence has its score set to 0.
    for item in set(best_sentence.split()):
        diphone_frequencies[item] = 0
    if all(value == 0 for value in diphone_frequencies.values()):
        diphone_frequencies = diphone_frequencies_original
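Edit: This has been solved, but I cannot accept my own answer yet. The problem was that the for loop came after the file was opened; it worked once I placed
for i in range(600):
before
with open('recipe_diphone_utts.txt') as file: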
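Edit 2:
The main problem has been solved and I have changed the code, but the lines
if line in best_utts:
continue
are supposed to make sure that the same line is not picked more than once after the dictionary values are reset. Instead, the same sentence is picked as the best sentence over and over again, so I need some other way to prevent a sentence from being picked multiple times.

Currently
best_utts == [best_sentence]*600
because of the outer loop, where best_sentence is the sentence with the highest score compared with all the other sentences (lines) of the file.
To get the 600 best sentences, I would go about it like this: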
import copy

diphone_frequencies = {...}
diphone_frequencies_original = copy.deepcopy(diphone_frequencies)
line_score = {}
best_utts = []
# Open the file and put all its lines in a list. Once.
with open('recipe_diphone_utts.txt') as file:
    all_lines = file.readlines()
for i in range(600):
    print(diphone_frequencies['f_@@r'])
    # Go through the stored lines one by one.
    for line in all_lines:
        line = line.rstrip()
        if line in best_utts:
            line_score[line] = 0
            continue # Skip previously picked sentences.
        score = 0.0
        # Add a score to the line depending on its content.
        for word in line.split():
            score += float(diphone_frequencies[word])
        line_score[line] = score/len(line.split())
    # Find the line with the highest score.
    best_sentence = max(line_score.keys(), key=(lambda k: line_score[k]))
    best_utts.append(best_sentence)
    # Each unique word of this iteration's best sentence has its score set to 0.
    for item in set(best_sentence.split()):
        diphone_frequencies[item] = 0
    if all(value == 0 for value in diphone_frequencies.values()):
        # Reset from a fresh copy so the original values are never overwritten.
        diphone_frequencies = copy.deepcopy(diphone_frequencies_original)
print(best_utts)
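One detail worth checking in the reset step: in Python, assigning one dictionary name to another does not copy anything, so both names end up referring to the same object. The following is a minimal sketch of that behaviour; the keys and counts are made up for illustration and are not taken from the real diphone dictionary.

```python
import copy

original = {"a_b": 3, "b_c": 5}  # hypothetical diphone counts

# Plain assignment: both names refer to the same dict object.
alias = original
alias["a_b"] = 0
aliased_original_value = original["a_b"]  # the "backup" changed too

# Deep copy: an independent dict that can be zeroed safely.
original = {"a_b": 3, "b_c": 5}
working = copy.deepcopy(original)
working["a_b"] = 0
copied_original_value = original["a_b"]  # the backup is untouched

print(aliased_original_value, copied_original_value)
```

If the frequencies are reset by plain assignment, later zeroing would also wipe the saved original values, so resetting from a copy is the safer choice.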
Also, there is no need to call
file.close()
at the end, because you are using with open(...) as file rather than file = open(...).
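The point about the with statement can be checked directly. This is a small sketch that uses a throwaway temporary file (an assumption made only so the example is self-contained) instead of recipe_diphone_utts.txt.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    handle.write("first line\nsecond line\n")

with open(path) as file:
    all_lines = file.readlines()
    closed_inside = file.closed   # still open inside the block

closed_after = file.closed        # closed automatically on leaving the block
os.remove(path)
print(closed_inside, closed_after, len(all_lines))
```

The list all_lines stays usable after the block ends, which is why the file only needs to be read once.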
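Comments:

I found that the main mistake I made was having for i in range(600): after with open('recipe_diphone_utts.txt') as file. When I changed it so that with open... is inside the for loop, it worked.

Yes. What is diphone_frequencies? That variable is not initialised.

Sorry, I forgot to include it because I had some old code between the dictionary and the code I am using. I have added it here.

You can remove file.close(), since using with open will automatically close the file at the end of the with block.

You read the whole file in the first iteration of the outer loop. That does not seem to make much sense.

What would you suggest I do instead? I am not entirely sure what you mean.

Thanks for the response! But the fact that the code at the bottom does not affect the for loop is actually the problem I am struggling with most, because it contains the weighting code used to find sentences and has to run after each sentence is selected. Do you know how I can change the code to achieve this?

Yes, I will edit the answer to make it work. Is my understanding in the other comment correct?

Yes, just change the wording from "each sentence" to "the values of the selected words are set to 0".

Ok, I have changed the content and popped a for loop at the end, since you can use set() directly on a list.

Why do the frequencies need to be reset after all the words have been used up? We could stop the loop there.

It turned out that did not work either, so I made a set from the intersection() of the file in the database and the selected sentences.

See the updated answer from @Guimoute. He reads the whole file once and then loops over the contents 600 times.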
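The suggestions in the comments (read the lines once, rescore on every round, and stop once every diphone is used up) could be combined roughly as follows. This is only a sketch under assumptions: pick_sentences is a hypothetical helper name, the lines are passed in as a list rather than read from recipe_diphone_utts.txt, the demo data is invented, and unknown tokens are scored as 0 instead of raising a KeyError.

```python
def pick_sentences(lines, frequencies, max_picks):
    """Greedily pick up to max_picks lines, zeroing used diphones each round."""
    remaining = dict(frequencies)  # work on a copy; the caller's dict is untouched
    candidates = [line.strip() for line in lines if line.strip()]
    picked = []

    def score(line):
        # Average remaining frequency of the tokens in the line.
        tokens = line.split()
        return sum(remaining.get(t, 0) for t in tokens) / len(tokens)

    for _ in range(max_picks):
        # Stop once every diphone has been covered or no lines are left.
        if not candidates or all(v == 0 for v in remaining.values()):
            break
        # Scores are recomputed every round, so no stale score survives.
        best = max(candidates, key=score)
        picked.append(best)
        candidates.remove(best)  # a picked line cannot be chosen again
        for token in set(best.split()):
            remaining[token] = 0
    return picked


# Invented demo data, for illustration only.
demo_lines = ["a_b b_c", "a_b c_d", "c_d d_e"]
demo_freqs = {"a_b": 1, "b_c": 5, "c_d": 2, "d_e": 4}
result = pick_sentences(demo_lines, demo_freqs, max_picks=10)
print(result)  # ['a_b b_c', 'c_d d_e']
```

Removing a picked line from the candidate list, rather than keeping its old entry in a score dictionary, is one way to avoid the same sentence winning repeatedly. Whether stopping early is acceptable depends on whether a fixed number of sentences is actually required.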