Python 2.7: How to parse HTML that doesn't have a class/id using BeautifulSoup

Tags: python-2.7, beautifulsoup

I have HTML code like this:

<div style="width:200px">
    <h2> My name1 </h2>
     DOB:17-6-1991
    <br>
    person details, person details,person details
    <div></div>
    <h2> My name2</h2>
     DOB:18-6-1991
    <br>
    person details, person details,person details
    <div></div>
    <h2> My name3 </h2>
     DOB:19-6-1991
    <br>
    person details, person details,person details
    <div></div>
    <h2> My name4 </h2>
     DOB:20-6-1991
    <br>
    person details, person details,person details
    <div></div>
    <h2> My name5 </h2>
     DOB:21-6-1991
    <br>
    person details, person details,person details
    <div></div>
</div>        

I want output like this:

My name1
17-6-1991
person details, person details,person details

My name2
18-6-1991
person details, person details,person details
... and so on

Please help me solve this problem.

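There are many ways to solve your problem. I chose to iterate over the h2 elements in one loop, and then over each one's following siblings in an inner loop. When I reach another h2, I break out of the inner loop. I did not strip the whitespace. You can use Python string methods such as strip(), lstrip(), and rstrip() to do that, and you can remove "DOB:" with str.replace().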

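from __future__ import print_function  # lets print() work on Python 2.7 and 3

from bs4 import BeautifulSoup, NavigableString

html = """your HTML here"""

# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(html, "html.parser")
for h in soup.find_all("h2"):
    print(h.text)
    # Walk the nodes that follow this <h2> until the next <h2> starts
    for sib in h.next_siblings:
        if isinstance(sib, NavigableString):
            # Text nodes hold the DOB and details lines (whitespace is kept)
            print(sib.string)
        elif sib.name == "h2":
            break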
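
The answer leaves whitespace stripping and the removal of "DOB:" to the reader. A minimal sketch of one way to do both is below. The function name extract_people, the (name, dob, details) tuple layout, and the shortened sample string are assumptions made for illustration, not part of the original answer. It also assumes that the first non-empty text node after each h2 is the DOB line and the second is the details line, as in the HTML from the question.

```python
from __future__ import print_function  # works on Python 2.7 and 3

from bs4 import BeautifulSoup, NavigableString, Tag


def extract_people(html):
    """Return a list of (name, dob, details) tuples, one per <h2>.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    soup = BeautifulSoup(html, "html.parser")
    people = []
    for h in soup.find_all("h2"):
        texts = []
        for sib in h.next_siblings:
            if isinstance(sib, Tag) and sib.name == "h2":
                break  # the next person's block starts here
            if isinstance(sib, NavigableString):
                text = sib.strip()
                if text:  # skip whitespace-only text nodes
                    texts.append(text)
        # Assumption: first text node is the DOB line, second is the details
        dob = texts[0].replace("DOB:", "") if texts else ""
        details = texts[1] if len(texts) > 1 else ""
        people.append((h.get_text().strip(), dob, details))
    return people


# Shortened copy of the HTML from the question (two people only)
sample = """
<div style="width:200px">
    <h2> My name1 </h2>
     DOB:17-6-1991
    <br>
    person details, person details,person details
    <div></div>
    <h2> My name2</h2>
     DOB:18-6-1991
    <br>
    person details, person details,person details
    <div></div>
</div>
"""

for name, dob, details in extract_people(sample):
    print(name)
    print(dob)
    print(details)
    print()
```

Collecting the text nodes into a list before printing keeps the cleanup in one place. If the real page has extra text between a header and its DOB line, the positional assumption above would need adjusting.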