Python: Scraping the parent id tag with Beautiful Soup?


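I scraped a bunch of links from a website and printed them as a list, but to make the list more readable I need to grab each link's parent tag, and I don't know how to do that.

The page I'm scraping looks like this: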

<div id=bunch_of_links_1>
<a href=link 1>
<a href=link 2>
<a href=link etc> 
</div>
<div id=another_bunch_of_links_1>
<a href=another_link 1>
<a href=another_link 2>
<a href=another_link etc> 
</div>

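I then print them with a for loop. How can I get the div id for each link and print it along with the link?

Edit: I'm not sure where to put [(l, l.parent.get('id')) for l in links].

Here is my code: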

import re

# soup is the BeautifulSoup object built from the scraped page
links = soup.findAll(href=re.compile("javascript"))

for link in links:
    full_link = link.get('href')
    names = link.contents[0]
    print(names + ", " + full_link)

I'd like to be able to print the id tag along with the rest.

Edit 2

I put this in my for loop:

 idtag = link.parent.get('id')

When I print the idtag variable I don't get any error, but it returns None.

Every element in BeautifulSoup has a .parent attribute that points to its parent element. Use that here:

[(l, l.parent.get('id')) for l in links]
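Since the question prints the links in a for loop, the same lookup could also sit inside that loop. The following is only a minimal sketch: the markup, the link texts, and the `rows` list are made up for illustration and are not taken from the asker's page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, loosely modeled on the structure in the question.
html = """
<div id="bunch_of_links_1">
  <a href="link_1">first</a>
  <a href="link_2">second</a>
</div>
<div id="another_bunch_of_links_1">
  <a href="another_link_1">third</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

rows = []
for link in soup.find_all("a"):
    # .parent is the tag that directly encloses the link;
    # .get('id') returns None when that tag has no id attribute.
    parent_id = link.parent.get("id")
    rows.append((parent_id, link.get_text(), link.get("href")))

for parent_id, name, href in rows:
    print(parent_id, name, href)
```

Collecting the tuples first and printing afterwards keeps the lookup separate from the output format, so the print line can be changed freely.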

Demo:

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('''\
... <div id=bunch_of_links_1>
... <a href=link 1>
... <a href=link 2>
... <a href=link etc> 
... </div>
... <div id=another_bunch_of_links_1>
... <a href=another_link 1>
... <a href=another_link 2>
... <a href=another_link etc> 
... </div>
... ''')
>>> 
>>> links = soup.find_all('a')
>>> [(l, l.parent.get('id')) for l in links]
[(<a href="link">
</a>, 'bunch_of_links_1'), (<a href="link">
</a>, 'bunch_of_links_1'), (<a etc="" href="link">
</a>, 'bunch_of_links_1'), (<a href="another_link">
</a>, 'another_bunch_of_links_1'), (<a href="another_link">
</a>, 'another_bunch_of_links_1'), (<a etc="" href="another_link">
</a>, 'another_bunch_of_links_1')]

@user3332151: Your post gives too little detail to comment on that. Bottom line: you can use .parent on a tag object to get its parent object. How you use that in your code is up to you.
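On the None result mentioned in the question's second edit: one possible cause (an assumption, since the real page is not shown) is that the link's immediate parent is some wrapper tag without an id, with the div that carries the id further up the tree. In that case `find_parent(id=True)` can walk up to the nearest ancestor that has an id. The markup and the `nearest_id` helper below are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the first link is nested inside <ul><li>, so its
# direct parent has no id; the second link has no ancestor with an id at all.
html = """
<div id="bunch_of_links_1">
  <ul><li><a href="link_1">first</a></li></ul>
</div>
<div>
  <a href="link_2">second</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def nearest_id(tag):
    """Return the id of the closest ancestor that has one, or None."""
    ancestor = tag.find_parent(id=True)  # id=True matches any tag with an id attribute
    return ancestor.get("id") if ancestor is not None else None


# For each link: (href, id of the direct parent, id of the nearest ancestor with an id)
results = [
    (a.get("href"), a.parent.get("id"), nearest_id(a))
    for a in soup.find_all("a")
]

for row in results:
    print(row)
```

If the id is always on the direct parent, plain `.parent.get('id')` is enough and this fallback is unnecessary.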