Python: How can I stop a loop that iterates over an enumerated list?

python loops


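I have started using the enumerate() function in Python and would like to improve the keywords-in-context script that I first discussed in an earlier question.

Since the initial script only retrieved the first instance of each keyword and its subsequent words, I tried to write a script that iterates over the whole file and compares every word with a list of keywords.

What happens, however, is that I get a seemingly endless list of results that my Jupyter notebook cannot handle. I even tried to force a stop with break once the enumeration index i exceeds the number of words in the analysed text file. Unfortunately, that does not work either.

I suppose I have not fully grasped the logic behind enumerate() and would appreciate some advice.

This is my current script: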

# Find keywords and "n" subsequent words in txt file
# credits to @jasonharper and @xander for previous updates
# cf. forum discussion on https://stackoverflow.com/questions/66972612/how-to-match-value-in-enumeration-to-a-keyword

import string

# function to find keywords in context
def wordsafter(keyword, source):
        wordcount=len(source) # sample text has 5953 words in total
        print(wordcount)
        res_strings=[]    
        for i in range(0, wordcount):
            if i < wordcount:
                print(i) # prints correct range from 0 to 5952
                for i, val in enumerate(source):
                    if val == keyword:
                        res_str=(' '.join(source[i:i + 10]))  # show searchterm and subsequent n words
                        res_strings.append(res_str)
            if i > wordcount:
                break # how can I force function to check each word only once?
            
        return(res_strings) # returns endless (?) list of results?
    
# open input txt file from local path
with open('C:\\somefile.txt', 'r', encoding='utf-8', errors='ignore') as f:  # open file
    data1 = f.read()  # read content of file as string
    data2 = data1.translate(str.maketrans('', '', string.punctuation)).lower()  # remove punctuation
    data3 = " ".join(data2.split())  # remove additional whitespace from text
    indata = list(data3.split())  # convert string to list

# define searchterms and call function    
searchterms = ["proclamation"] 
for keyword in searchterms:
    result = wordsafter(keyword, indata)
    if result:
        print(result[600000]) # prints a valid string although whole file only has 5953 items
        with open('C:\\Users\\anotherfile.txt', 'w', encoding="utf-8-sig") as file:
            file.write(str(result)) # output file is so large it crashes when opened

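I am not sure what happened, but running the original script on a different machine produces entirely correct output, with no need to force a break: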

# Find keywords and the words that follow them
# Updated script with credits to @jasonharper and @xander
# cf. forum discussion on https://stackoverflow.com/questions/66972612/how-to-match-value-in-enumeration-to-a-keyword

import string

# function to find keywords in context
def wordsafter(keyword, source):
        wordcount=len(source) # shows number of words in sample text
        print(wordcount)
        res_strings=[]    
        for i, val in enumerate(source):
            if val == keyword:
                res_str=(' '.join(source[i:i + 10]))  # shows searchterm and subsequent "n" words
                res_strings.append(res_str)
            
        return(res_strings) # returns list of results
    
# open input txt file from local path
with open('C:\\Users\\input.txt', 'r', encoding='utf-8', errors='ignore') as f:  # open file
    data1 = f.read()  # read content of file as string
    data2 = data1.translate(str.maketrans('', '', string.punctuation)).lower()  # remove punctuation
    data3 = " ".join(data2.split())  # remove additional whitespace from text
    indata = list(data3.split())  # convert string to list

# define searchterms and call function    
searchterms = ["proclamation", "king"] 
for keyword in searchterms:
    result = wordsafter(keyword, indata)
    if result:
        print(result) 
        with open('C:\\Users\\output.txt', 'w', encoding="utf-8-sig") as file:
            file.write(str(result)) # write output to file
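
One thing to watch in the script above: the output file is opened in 'w' mode inside the loop over searchterms, so with two search terms each pass replaces what the previous one wrote. Below is a minimal sketch of one way around this, assuming it is acceptable to collect the results per keyword first and write them in a single pass. The helpers collect_contexts and write_contexts, the sample word list, the temporary output path, and the one-result-per-line layout are hypothetical choices, not part of the original script.

```python
import os
import tempfile


def wordsafter(keyword, source, n=10):
    """Return each occurrence of keyword joined with the words after it (n words in total)."""
    return [' '.join(source[i:i + n])
            for i, val in enumerate(source) if val == keyword]


def collect_contexts(searchterms, source):
    """Map every search term that occurs in source to its list of context strings."""
    results = {}
    for keyword in searchterms:
        found = wordsafter(keyword, source)
        if found:
            results[keyword] = found
    return results


def write_contexts(results, path):
    """Open the file once and write one context string per line."""
    with open(path, 'w', encoding='utf-8-sig') as file:
        for keyword, contexts in results.items():
            for context in contexts:
                file.write(f"{keyword}\t{context}\n")


# hypothetical sample in place of the real input file
indata = "a proclamation by the king and the king replied".split()
results = collect_contexts(["proclamation", "king", "queen"], indata)

out_path = os.path.join(tempfile.gettempdir(), "kwic_output.txt")  # placeholder path
write_contexts(results, out_path)
print(results)
```

Opening the file in 'a' (append) mode inside the loop would also keep earlier results, but it would keep results from earlier runs of the script as well, which may not be wanted.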

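It looks like you redefine i in the second loop. Why not use j or another variable? Also, if you loop over a list of words with a single for loop, rather than deliberately checking words several times, the function should check each word only once. Why do you think words are being checked multiple times? The index should be the same every time. Honestly, though, the whole wordcount part only really matters if a loop fails.

@matiss: Because I get a massive amount of output. The file has about 5900 words, yet I can still access result 6000 afterwards. I cannot print the whole output, because that causes an error in the Jupyter notebook, but the output I can print looks great. That is why I am puzzled. This is what result 6000 looks like:

proclamation for the apprehension of john glover alexander cutting william adye and

and I can even print result 600000:

proclamation of his excellency the mayor

so it never ends.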
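
A small reproduction may help explain the size of the output. The sketch below copies the nested loops from the first script onto a hypothetical eight-word list; the sample words and the names wordsafter_nested and wordsafter_single are made up for illustration. Rebinding i inside the inner loop does not affect the outer range() iteration, so the inner enumerate() pass runs once per word, and the result list ends up holding len(source) times the number of matches: large, but finite. The break is never reached, because i is wordcount - 1 after every inner pass.

```python
def wordsafter_nested(keyword, source):
    """Same loop structure as the first script, minus the print calls."""
    wordcount = len(source)
    res_strings = []
    for i in range(0, wordcount):              # outer loop: runs wordcount times
        if i < wordcount:                      # always true inside range(wordcount)
            for i, val in enumerate(source):   # inner loop rebinds i on every pass
                if val == keyword:
                    res_strings.append(' '.join(source[i:i + 10]))
        if i > wordcount:                      # never true: i is wordcount - 1 here
            break
    return res_strings


def wordsafter_single(keyword, source):
    """Single enumerate() pass, as in the original script: each word is checked once."""
    return [' '.join(source[i:i + 10])
            for i, val in enumerate(source) if val == keyword]


# hypothetical sample text, already lowercased and split into words
sample = "a proclamation by the king a proclamation ends".split()

nested = wordsafter_nested("proclamation", sample)
single = wordsafter_single("proclamation", sample)

print(len(sample), len(single), len(nested))  # 8 words, 2 matches, 8 * 2 = 16 results
```

By the same arithmetic, index 600000 being valid for a 5953-word file would imply that the keyword occurs at least 101 times, since 101 * 5953 = 601253. A single enumerate() loop without the outer range() loop already visits each word exactly once, so no break is needed.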