Python: getting an item from a bs4.element.Tag


I have an element of type bs4.element.Tag:

<a class="nav-link match-link-stats" href="/football/matches/match867851_Kalteng_Putra-Arema-online/" title="Stat"><i class="icon-match-link"></i></a>

I want to get "/football/matches/match867851_Kalteng_Putra-Arema-online/" from this element. How can I do that?

tag.findChild("a")['href']

This grabs the "a" tag and then reads its "href" attribute.

This answer assumes you already have the tag element as an object. If not, use KunduK's answer.
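One detail worth illustrating: find() (and its older alias findChild()) searches the descendants of the tag it is called on. The sketch below assumes a hypothetical <li> wrapper around the link, which is not part of the original question; it shows the lookup working from a parent element and returning None when called on the <a> element itself.

```python
from bs4 import BeautifulSoup

# Hypothetical wrapper markup: the <li> parent is an assumption for illustration.
html = (
    '<li class="match">'
    '<a class="nav-link match-link-stats" '
    'href="/football/matches/match867851_Kalteng_Putra-Arema-online/" title="Stat">'
    '<i class="icon-match-link"></i></a>'
    '</li>'
)

soup = BeautifulSoup(html, "html.parser")
parent = soup.find("li")

# find() searches the descendants of `parent`, so it reaches the nested <a>.
link = parent.find("a")
href_from_parent = link["href"]
print(href_from_parent)

# Called on the <a> element itself, find("a") has no <a> descendant to return.
nested = link.find("a")
print(nested)
```

So if tag is already the <a> element, read the attribute from it directly rather than searching inside it.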


You can use tag.get('href') or tag['href']:

>>> tag.get('href')
'/football/matches/match867851_Kalteng_Putra-Arema-online/'
>>> tag['href']
'/football/matches/match867851_Kalteng_Putra-Arema-online/'

The difference is that if the attribute does not exist, tag.get('href') returns None, whereas tag['href'] raises a KeyError. This is the same behavior as with a dict.
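A minimal sketch of that difference, using shortened markup and a hypothetical attribute name ("data-id") that the element does not have. It also shows the optional default argument of get() and the has_attr() check, which are ways to avoid the exception.

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical markup; "data-id" is deliberately absent.
soup = BeautifulSoup('<a class="nav-link" href="/football/">Stat</a>', "html.parser")
tag = soup.find("a")

# get() returns None (or the supplied default) for a missing attribute.
missing_with_get = tag.get("data-id")
default_with_get = tag.get("data-id", "n/a")

# Indexing raises KeyError for a missing attribute, like a dict.
try:
    tag["data-id"]
    raised = False
except KeyError:
    raised = True

# has_attr() lets you test for the attribute before indexing.
has_href = tag.has_attr("href")

print(missing_with_get, default_with_get, raised, has_href)
```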

Full example:

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('<a class="nav-link match-link-stats" href="/football/matches/match867851_Kalteng_Putra-Arema-online/" title="Stat"><i class="icon-match-link"></i></a>', 'html.parser')
>>> tag = soup.find('a')
>>> type(tag)
<class 'bs4.element.Tag'>
>>> tag.get('href')
'/football/matches/match867851_Kalteng_Putra-Arema-online/'
>>> tag['href']
'/football/matches/match867851_Kalteng_Putra-Arema-online/'

Use a CSS selector and get the href attribute:

from bs4 import BeautifulSoup

data = '''<a class="nav-link match-link-stats" href="/football/matches/match867851_Kalteng_Putra-Arema-online/" title="Stat"><i class="icon-match-link"></i></a>'''

soup = BeautifulSoup(data, 'html.parser')
print(soup.select_one('.match-link-stats')['href'])

Output:

/football/matches/match867851_Kalteng_Putra-Arema-online/

Also, the old non-conventional name findChild() should no longer be used; it is now simply called find().
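The CSS selector approach also extends to pages with several such links. The following is a rough sketch under assumed markup: the second and third <a> elements are invented for illustration. It uses select() with an attribute selector to collect every matching href, and shows that select_one() returns None when nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the first link comes from the question.
data = '''
<a class="nav-link match-link-stats" href="/football/matches/match867851_Kalteng_Putra-Arema-online/" title="Stat"></a>
<a class="nav-link match-link-stats" href="/football/matches/example-match/" title="Stat"></a>
<a class="nav-link">no href here</a>
'''
soup = BeautifulSoup(data, "html.parser")

# a.match-link-stats[href] matches only <a> tags with that class and an href.
hrefs = [a["href"] for a in soup.select("a.match-link-stats[href]")]
print(hrefs)

# select_one() returns None when nothing matches, so check before indexing.
missing = soup.select_one(".no-such-class")
print(missing)
```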