Python: Parsing an HTML table with BeautifulSoup
I want to extract information like the following, especially the text that comes after each strong tag. How can I do this with BeautifulSoup? Thanks.
<span class=JamalsinRed>H.S. AHMED ALLY</span>
<hr align="left" width="400" color="#CCCCCC">
<strong>Address : </strong>217/7,Saleh Market,Adamjee Road,Saddar<br>
<strong>City : </strong>Rawalpindi<br>
<strong>Phone # : </strong>(92 51) 5511748, 5125396<br>
<strong>Fax : </strong>(92 51) 5511749<br>
<strong>E-mail : </strong><a class=b href='mailto:hsaforce@cyber.net.pk'>hsaforce@cyber.net.pk</a><br><strong>Web : </strong>
<a target=_blank href='http://www.hsahmedally.com'>www.hsahmedally.com</a><br>
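H.S. AHMED ALLY
Address : 217/7,Saleh Market,Adamjee Road,Saddar
City : Rawalpindi
Phone # : (92 51) 5511748, 5125396
Fax : (92 51) 5511749
E-mail : hsaforce@cyber.net.pk
Web : www.hsahmedally.com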
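Here is a small reusable function that takes a field name as an argument and returns the field value. It searches for the strong element whose text starts with the given field name, then returns that element's next_sibling: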
from bs4 import BeautifulSoup
data = """
<div>
<span class="JamalsinRed">H.S. AHMED ALLY</span>
<hr align="left" width="400" color="#CCCCCC">
<strong>Address : </strong>217/7,Saleh Market,Adamjee Road,Saddar<br>
<strong>City : </strong>Rawalpindi<br>
<strong>Phone # : </strong>(92 51) 5511748, 5125396<br>
<strong>Fax : </strong>(92 51) 5511749<br>
<strong>E-mail : </strong><a class=b href='mailto:hsaforce@cyber.net.pk'>hsaforce@cyber.net.pk</a><br><strong>Web : </strong>
<a target=_blank href='http://www.hsahmedally.com'>www.hsahmedally.com</a><br>
</hr>
</div>
"""
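def get_field_value(soup, field):
    # Find the <strong> whose text starts with the field name,
    # then return the node that immediately follows it.
    label = soup.find('strong', string=lambda s: s and s.startswith(field))
    return label.next_sibling

soup = BeautifulSoup(data, 'html.parser')
print(get_field_value(soup, 'Address'))
print(get_field_value(soup, 'City'))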
Prints:
217/7,Saleh Market,Adamjee Road,Saddar
Rawalpindi
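
One caveat: next_sibling returns only the single node directly after the label. For the E-mail field that node is the <a> tag itself, and for the Web field it is a whitespace text node, because the link starts on the next line. A possible way to handle every field uniformly is sketched below. The helper name extract_fields is hypothetical and not part of the answer above, and the sketch assumes each value ends at the next <br> (or the next <strong>), as it does in the sample markup.

```python
from bs4 import BeautifulSoup, Tag

# Shortened copy of the markup from the question.
html = """
<div>
<strong>Address : </strong>217/7,Saleh Market,Adamjee Road,Saddar<br>
<strong>City : </strong>Rawalpindi<br>
<strong>E-mail : </strong><a class=b href='mailto:hsaforce@cyber.net.pk'>hsaforce@cyber.net.pk</a><br><strong>Web : </strong>
<a target=_blank href='http://www.hsahmedally.com'>www.hsahmedally.com</a><br>
</div>
"""

def extract_fields(soup):
    """Map each <strong> label to the text between it and the next <br>.

    Hypothetical helper: assumes every value is terminated by a <br>
    or by the next <strong> label.
    """
    fields = {}
    for label in soup.find_all('strong'):
        # "Address : " -> "Address"
        key = label.get_text().strip().rstrip(':').strip()
        parts = []
        for node in label.next_siblings:
            # Stop at the end of this field.
            if isinstance(node, Tag) and node.name in ('br', 'strong'):
                break
            # Tags (such as <a>) contribute their text; text nodes are used as-is.
            parts.append(node.get_text() if isinstance(node, Tag) else str(node))
        fields[key] = ''.join(parts).strip()
    return fields

fields = extract_fields(BeautifulSoup(html, 'html.parser'))
print(fields['E-mail'])
print(fields['Web'])
```

Returning a dictionary means the document is walked once, and each field can then be looked up by name, including the ones whose value is wrapped in a link.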