Python: accessing attributes with BeautifulSoup and printing them


I want to scrape a site and extract the title attribute found inside its h2 tags:

     <h2 class="1"><a href="http://example.it/Titanic_Caprio.html" title="Titanic Caprio">Titanic_Caprio</a></h2>

Using findAll('h2', attrs={'title'}) returns no results. What am I doing wrong? And how can I print the whole list of titles to a file?

You need to pass key-value pairs in attrs, that is, a dict mapping attribute names to values:

findAll('h2', attrs = {"key":"value"})
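As a rough illustration of the key-value form, here is a minimal sketch run against the snippet from the question. It assumes the bs4 package (where find_all is the current spelling of findAll) and the built-in "html.parser"; the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

html = ('<h2 class="1"><a href="http://example.it/Titanic_Caprio.html" '
        'title="Titanic Caprio">Titanic_Caprio</a></h2>')
soup = BeautifulSoup(html, "html.parser")

# attrs maps an attribute name to the value it must have.
by_class = soup.find_all("h2", attrs={"class": "1"})

# A value of True matches any tag that has the attribute at all.
# The <h2> itself has no title attribute, so this list is empty.
with_title = soup.find_all("h2", attrs={"title": True})

print(len(by_class), len(with_title))
```

The empty second result is the point: even with a well-formed attrs dict, searching the h2 tags for a title finds nothing here, because the attribute sits on the nested a tag.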

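The problem is that title is not an attribute of the h2 tag but of a tag nested inside it. So you first have to find the h2 tags, and then search each one for descendant tags that have a title attribute: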

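titles = []
# Collect, from every <h2>, the descendant tags that carry a title attribute.
for h2 in soup.findAll('h2'):
    titles.extend(h2.findAll(lambda tag: tag.has_attr('title')))
# Each entry is a tag; read the attribute text with tag['title'].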

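This works because BeautifulSoup accepts a function as a search filter: the function is called for each tag, and the tag is kept when it returns True.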
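The question also asks how to get the list into a file. One possible end-to-end sketch, assuming bs4 with the built-in "html.parser": the second h2 in the sample markup and the file name titles.txt are made up for illustration, and has_title is just a named version of the lambda.

```python
from bs4 import BeautifulSoup

# First <h2> is from the question; the second is a hypothetical one without a title.
html = """
<h2 class="1"><a href="http://example.it/Titanic_Caprio.html" title="Titanic Caprio">Titanic_Caprio</a></h2>
<h2 class="1"><a href="http://example.it/no_title.html">No title here</a></h2>
"""
soup = BeautifulSoup(html, "html.parser")


def has_title(tag):
    # Called once per candidate tag; the tag is kept when this returns True.
    return tag.has_attr("title")


titles = []
for h2 in soup.find_all("h2"):
    for tag in h2.find_all(has_title):
        # Store the attribute text rather than the tag object.
        titles.append(tag["title"])

# Write one title per line; "titles.txt" is a placeholder file name.
with open("titles.txt", "w", encoding="utf-8") as out:
    for title in titles:
        print(title, file=out)
```

Passing file=out to print keeps the familiar print call while sending the output to the file instead of the console.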

Could I have a URL to test this against? Thanks for taking the time! The lambda is a nice solution too.
