Python: parsing website data with BeautifulSoup


python parsing


I want to extract a specific number from a site. The relevant HTML looks like this:

String1&nbsp;:&nbsp; <font style="color:#EE6564;" > 
112.674448 </font>&nbsp;handle <br/>

String2&nbsp;:&nbsp; <font style="color:#EE6564;" > 
60.90402 </font>&nbsp;handle  <br/>

String3&nbsp;:&nbsp; <font style="color:#EE6564;" > 
51.770428 </font>&nbsp;handle  <br/>

String4&nbsp;:&nbsp; <font style="color:#EE6564;" > 
182712 </font>&nbsp;handle  <br/>

but I haven't come up with anything better than this:

soup.findAll(text="String1")
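One likely reason the call above comes back empty: a plain string passed as the text filter is compared against the whole text node, and the node here is "String1", two non-breaking spaces, a colon and trailing whitespace, not just "String1". The following is a minimal sketch, assuming BeautifulSoup 4 (the bs4 package) and the standard-library html.parser. The variable names are illustrative, and string= is the BeautifulSoup 4 spelling of text=. It matches the label with a regular expression and then steps forward to the next font tag:

```python
import re

from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed

html = """
String1&nbsp;:&nbsp; <font style="color:#EE6564;" > 
112.674448 </font>&nbsp;handle <br/>

String2&nbsp;:&nbsp; <font style="color:#EE6564;" > 
60.90402 </font>&nbsp;handle  <br/>
"""

soup = BeautifulSoup(html, "html.parser")

# A plain string must equal the entire text node, so this finds nothing.
exact = soup.find_all(string="String1")

# A compiled pattern is applied with search(), so a partial match is enough.
label = soup.find(string=re.compile(r"String1\b"))

# The number sits in the first <font> tag that follows the label text.
value = label.find_next("font").get_text(strip=True)

print(exact)
print(value)
```

This approach needs no changes to the page itself, which matters when the HTML comes from a site you do not control.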

First, you can give the font tag a unique ID, and then you can do something like this:

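# BeautifulSoup 4 is imported from bs4; "from BeautifulSoup import ..." is the old Python 2-only 3.x package
from bs4 import BeautifulSoup

html = \
"""
String1&nbsp;:&nbsp; <font style="color:#EE6564;" > 
112.674448 </font>&nbsp;handle <br/>

String2&nbsp;:&nbsp; <font id="font1" style="color:#EE6564;" > 
60.90402 </font>&nbsp;handle  <br/>

String3&nbsp;:&nbsp; <font style="color:#EE6564;" > 
51.770428 </font>&nbsp;handle  <br/>

String4&nbsp;:&nbsp; <font style="color:#EE6564;" > 
182712 </font>&nbsp;handle  <br/>"""

# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(html, "html.parser")

# Returns a list of every <font> tag whose id is "font1" (findAll is the older spelling of find_all)
soup.find_all("font", id="font1")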


Works great! Sorry for being a noob, but how do I get the data that the tag holds?

for tag in soup.findAll("font", style="color:#EE6564;"):
    print(tag.contents)
    print(tag.contents[0])

Try the following:

soup.findAll('font', id='font1')[0].contents[0].strip()
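If the page cannot be edited to add an id, the same idea can be applied to every red font tag at once. The sketch below assumes BeautifulSoup 4 (bs4) with html.parser, and that each value is immediately preceded by its "StringN :" label as in the sample HTML. The values dictionary and loop variable names are illustrative only:

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed

html = """
String1&nbsp;:&nbsp; <font style="color:#EE6564;" > 
112.674448 </font>&nbsp;handle <br/>

String2&nbsp;:&nbsp; <font style="color:#EE6564;" > 
60.90402 </font>&nbsp;handle  <br/>

String3&nbsp;:&nbsp; <font style="color:#EE6564;" > 
51.770428 </font>&nbsp;handle  <br/>

String4&nbsp;:&nbsp; <font style="color:#EE6564;" > 
182712 </font>&nbsp;handle  <br/>"""

soup = BeautifulSoup(html, "html.parser")

values = {}
for font in soup.find_all("font", style="color:#EE6564;"):
    # The closest text node before the tag is its label, e.g. "String1 : ".
    label = font.find_previous(string=True)
    # Keep the part before the colon; strip() also removes non-breaking spaces.
    name = label.split(":")[0].strip()
    # get_text(strip=True) returns the tag's text without surrounding whitespace.
    values[name] = font.get_text(strip=True)

print(values["String2"])
```

The values are kept as strings here; wrapping them in float() is a natural next step if arithmetic is needed.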
