Python: Selecting multiple values inside an HTML tag with BeautifulSoup

I have created an HTML page that contains several blocks like the following:

<div data-pnref="all" class="clearfix _5qo4">
<a data-hovercard="/ajax/hovercard/user.php?id=671948073&amp;extragetparams=%7B%22hc_location%22%3A%22friends_tab%22%7D" ... />

I want to retrieve the value of the data-hovercard attribute, and in particular the id in its URL: "671948073".

I have tried findAll and select from the BeautifulSoup module, but without success so far.

Find the div, then find the a inside it.


Yes, but that retrieves the whole block, and then I can't extract the id.
from urllib.parse import urlparse, parse_qs

from bs4 import BeautifulSoup

html = '<div data-pnref="all" class="clearfix _5qo4"><a data-hovercard="/ajax/hovercard/user.php?id=671948073&amp;extragetparams=%7B%22hc_location%22%3A%22friends_tab%22%7D"/></div>'
soup = BeautifulSoup(html, 'html.parser')

# Find the div, then the anchor inside it
div = soup.find('div')
anchor = div.find('a')

# Attribute values are read with dictionary-style access on the tag
data_hovercard = anchor['data-hovercard']

print(data_hovercard)
# /ajax/hovercard/user.php?id=671948073&extragetparams=%7B%22hc_location%22%3A%22friends_tab%22%7D

# Split the URL and parse its query string into a dict of lists
parsed = urlparse(data_hovercard)
parsed_dict = parse_qs(parsed.query)
hovercard_id = parsed_dict['id']

print(hovercard_id)
# ['671948073']
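
Since the page contains several such blocks, the same approach can probably be applied to every anchor at once. The following is only a minimal sketch, assuming Python 3 with the bs4 package installed. The function name extract_hovercard_ids is hypothetical, and the second block's id (123456789) is made up for illustration. It uses the CSS attribute selector a[data-hovercard] with select, which matches every anchor that carries the attribute.

```python
from urllib.parse import urlparse, parse_qs

from bs4 import BeautifulSoup

# Sample markup: the first block comes from the question,
# the second block and its id are invented for this example.
html = """
<div data-pnref="all" class="clearfix _5qo4">
  <a data-hovercard="/ajax/hovercard/user.php?id=671948073&amp;extragetparams=%7B%22hc_location%22%3A%22friends_tab%22%7D"></a>
</div>
<div data-pnref="all" class="clearfix _5qo4">
  <a data-hovercard="/ajax/hovercard/user.php?id=123456789&amp;extragetparams=%7B%22hc_location%22%3A%22friends_tab%22%7D"></a>
</div>
"""


def extract_hovercard_ids(markup):
    """Return the id query parameter of every a[data-hovercard] in markup."""
    soup = BeautifulSoup(markup, "html.parser")
    ids = []
    # select() returns all anchors that have a data-hovercard attribute
    for anchor in soup.select("a[data-hovercard]"):
        query = parse_qs(urlparse(anchor["data-hovercard"]).query)
        # parse_qs maps each key to a list; skip anchors without an id
        ids.extend(query.get("id", []))
    return ids


print(extract_hovercard_ids(html))
# ['671948073', '123456789']
```

Using query.get("id", []) rather than query["id"] means an anchor whose URL has no id parameter is skipped instead of raising a KeyError.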