Python: Extracting a URL from a td element with Beautiful Soup
I am trying to extract URLs from an HTML table. Each URL sits inside an anchor tag within a td cell. The HTML looks like this:
<table width="100%" border="0" cellspacing="0" cellpadding="0" name="TabName" id="Tab" class="common-table">
<tr>
<td>Acme Company</a><br/><span class="f-10">07-11-2016</span></td>
<td><span>Vendor</span><br>
<td><a href="http://URL" title="Report Details">Details</a></td>
</tr>
</table>
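I have iterated all the way down to the last td tag, the one that holds the URL. How do I extract the URL from it?
Thanks in advance for your time.
Like this: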
url=each_td.a['href']
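One caveat with each_td.a['href']: for a cell that has no anchor, each_td.a is None and the subscript raises a TypeError. Below is a minimal sketch of a guarded version. The sample markup is a simplified copy of the table in the question, and the function name extract_detail_urls is made up for illustration.

```python
from bs4 import BeautifulSoup

# Simplified sample markup modelled on the question; "http://URL" is the
# placeholder used in the question, not a real address.
HTML = """
<table class="common-table">
  <tr>
    <td>Acme Company<br/><span class="f-10">07-11-2016</span></td>
    <td><span>Vendor</span></td>
    <td><a href="http://URL" title="Report Details">Details</a></td>
  </tr>
</table>
"""


def extract_detail_urls(html):
    """Return the href of the first anchor in every td that contains one."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": "common-table"})
    if table is None:
        return []
    urls = []
    for each_td in table.find_all("td"):
        # find() returns None when the cell has no <a> with an href attribute
        link = each_td.find("a", href=True)
        if link is not None:
            urls.append(link["href"])
    return urls


urls = extract_detail_urls(HTML)
print(urls)
```

Cells without a link are simply skipped, so the loop no longer depends on comparing the cell text with "Details".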
from bs4 import BeautifulSoup
import requests

r = requests.get('http://SourceURL')
soup = BeautifulSoup(r.content, "html.parser")

# Find the table
table = soup.find("table", {"class": "common-table"})

# Find all tr rows
tr = table.find_all("tr")
for each_tr in tr:
    # In each tr row, find every td cell
    td = each_tr.find_all('td')
    for each_td in td:
        print(each_td.text)
        if each_td.text == "Details":
            # The anchor inside this cell carries the URL (line from the answer)
            url = each_td.a['href']
            print(url)
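As an alternative to the nested loops, the same lookup can probably be written with a CSS selector through Beautiful Soup's select() method. This is only a sketch of a different technique, not what the answer proposes. The two-row table and the example.com addresses are invented for the demonstration.

```python
from bs4 import BeautifulSoup

# Invented sample data: two rows, each with a "Details" link in the last cell.
html = """
<table class="common-table">
  <tr>
    <td>Acme Company</td>
    <td><a href="http://example.com/report/1" title="Report Details">Details</a></td>
  </tr>
  <tr>
    <td>Other Company</td>
    <td><a href="http://example.com/report/2" title="Report Details">Details</a></td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Select every anchor with an href that sits inside a td of the target table
links = soup.select("table.common-table td a[href]")
detail_urls = [a["href"] for a in links]
print(detail_urls)
```

The selector only matches anchors that actually carry an href, so cells without a link never reach the list comprehension.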