Python loop stops after the first item in the list


python list loops

places = []
persons = []
unknown = []
newlist = []
filename = 'file.html'
tree = etree.parse(filename)
input_file = open(filename, 'rU')

def extract(tree):
    return places
    return persons
    return unknown

def change_class():
    extract(tree)
    for line in input_file:
        for x in places:
            for z in unknown:
                if x + "" in line:
                    newline = line.replace('person', 'place')
                    newlist.append(newline)
                elif z + '' in line:
                    newline = line.replace("'person'", "'undefined'")
                    newlist.append(newline)
                else:
                    newlist.append(line)
                break
            break
    for x in newlist:
        print x

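I have an HTML file of this kind, which contains wrong class values (see the first HTML listing further down; its entries are New-York, John Doe, Paris and Jane Doe).

My script lets me reprint the same file, but it only replaces the class values for the first item of each of the two lists (places and unknown), as the second HTML listing further down shows.

It then stops iterating over the two lists and goes straight to the else step, appending everything that remains to newlist without any replacement. Python yields no errors, and the lists are extracted successfully by the extract() function; I checked.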

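Something like this might work:

# Example values taken from the question's sample file
known_places = ["New-York", "Paris"]  # list of known places
unknowns = ["John Doe"]  # list of unknown places and persons

newlist = []
for line in input_file:  # input_file is the open file from the question
    # any() checks every name in the list against the line
    if any(place in line for place in known_places):
        line = line.replace("person", "place")
    elif any(unknown in line for unknown in unknowns):
        line = line.replace("person", "undefined")
    newlist.append(line)

I deleted my other answer because it tried to solve a problem you don't have. I know you've already accepted an answer, but take a look at the BeautifulSoup solution as well:

from bs4 import BeautifulSoup

PLACES = ["New-York", "Paris"]  # etc
PEOPLE = ["John Doe", "Jane Doe"]  # etc

# Name the parser explicitly; "html.parser" ships with Python
with open('file.txt') as html_file:
    soup = BeautifulSoup(html_file, "html.parser")

paragraphs = soup("p")  # grabs all the <p>...</p> elements
for p in paragraphs:
    if p.dfn.string in PLACES:
        p['class'] = 'place'
    elif p.dfn.string in PEOPLE:
        p['class'] = 'person'

str(soup) is now your HTML document, modified as required.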
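The line-by-line suggestion above can be tried without a file on disk. The sketch below is an illustration, not the answerer's code: the function name `fix_classes`, the inline sample lines, and passing the name lists as parameters are all assumptions made for the example. It also replaces the quoted attribute value `'person'` rather than the bare word, so that other text on the line is left alone.

```python
def fix_classes(lines, known_places, unknowns):
    """Return a copy of lines with class 'person' corrected where a known name appears."""
    fixed = []
    for line in lines:
        # any() tests every name against the line, so no nested loops or breaks are needed
        if any(place in line for place in known_places):
            line = line.replace("'person'", "'place'")
        elif any(name in line for name in unknowns):
            line = line.replace("'person'", "'undefined'")
        fixed.append(line)
    return fixed


# Hypothetical sample lines modelled on the question's HTML
sample = [
    "<p class ='person'><dfn>New-York</dfn>",
    "<p class ='person'><dfn>John Doe</dfn>",
    "<p class ='person'><dfn>Paris</dfn>",
]
result = fix_classes(sample, ["New-York", "Paris"], ["John Doe"])
for line in result:
    print(line)
```

Each line is appended exactly once, whichever branch it takes, so the output always has as many lines as the input.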

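Why not use regular expressions?

You are unconditionally breaking out of both inner loops. So why would you expect more than one iteration?

Where should I use them, then? If I don't use the two breaks, it goes into an infinite loop.

Thanks for your answer! It looks interesting, I have never used BeautifulSoup before…

@elaine_blath BeautifulSoup is an amazing XML/HTML parser. I'm still learning it myself, but the applications are incredible! :)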
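The comment about the unconditional breaks can be checked with a small experiment. The sketch below is a simplified, hypothetical reduction of the question's loop structure (the function names and the sample lists are assumptions, not the asker's code): it records which pairs of items the inner loops ever reach.

```python
def compared_with_breaks(places, unknown):
    """Mimic the question's loops: a break at the end of each inner loop body."""
    seen = []
    for x in places:
        for z in unknown:
            seen.append((x, z))
            break  # unconditional: leaves the z loop after its first item
        break  # unconditional: leaves the x loop after its first item
    return seen


def compared_without_breaks(places, unknown):
    """Same loops with the breaks removed: every pair is visited exactly once."""
    seen = []
    for x in places:
        for z in unknown:
            seen.append((x, z))
    return seen


places = ["New-York", "Paris"]
unknown = ["John Doe", "Jane Doe"]
print(compared_with_breaks(places, unknown))          # only the first pair is reached
print(len(compared_without_breaks(places, unknown)))  # number of pairs visited
```

Both versions terminate, since a for loop over a finite list ends on its own. Removing the breaks from the original script would instead append each line once per pair of items, which is why testing each list with any() is the simpler fix.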

Input file, with wrong class values:

<html>
  <head></head>
  <body>
    <p class ='person'><dfn>New-York</dfn>
    <p class = 'place'><dfn>John Doe</dfn>
    <p class ='person'><dfn>Paris</dfn>
    <p class = 'place'><dfn>Jane Doe</dfn>
  </body>
</html>

Output produced by the script:

<html>
  <head></head>
  <body>
    <p class ='place'><dfn>New-York</dfn>
    <p class = 'unknown'><dfn>John Doe</dfn>
    <p class ='person'><dfn>Paris</dfn>
    <p class = 'place'><dfn>Jane Doe</dfn>
  </body>
</html>
