Python: extracting sibling nodes between two nodes with BeautifulSoup


I have a document like this:

<p class="top">I don't want this</p>

<p>I want this</p>
<table>
   <!-- ... -->
</table>

<img ... />

<p> and all that stuff too</p>

<p class="end">But not this and nothing after it</p>

The node.nextSibling attribute is your solution:
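# BeautifulSoup 3 import, as in the original answer; with bs4 this is `from bs4 import BeautifulSoup`
from BeautifulSoup import BeautifulSoup

# `html` is assumed to hold the document shown above as a string
soup = BeautifulSoup(html)

# Start at the <p class="top"> marker
nextNode = soup.find('p', {'class': 'top'})
while True:
    # process nextNode here
    nextNode = nextNode.nextSibling
    # Stop once the <p class="end"> marker is reached
    if getattr(nextNode, 'name', None) == 'p' and nextNode.get('class', None) == 'end':
        break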

The compound condition ensures that you are reading the attributes of an HTML tag and not of a string node.
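
To illustrate why that guard matters, here is a minimal sketch using the bs4 package (the successor to the BeautifulSoup 3 module) with the standard-library html.parser backend. The two-paragraph sample markup and the variable names are made up for this illustration. The newline between the two tags shows up as a text node among the siblings, and only real tags are instances of Tag:

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString, Tag

# Hypothetical sample: two paragraphs separated by a newline
sample = '<div><p class="top">a</p>\n<p class="end">b</p></div>'
soup = BeautifulSoup(sample, "html.parser")

top = soup.find("p", class_="top")
first = top.next_sibling      # the "\n" text node between the tags
second = first.next_sibling   # the <p class="end"> tag

# A text node is a NavigableString, not a Tag
first_is_text = isinstance(first, NavigableString) and not isinstance(first, Tag)
second_is_tag = isinstance(second, Tag)

# getattr with a default never raises, whatever kind of node it is given
first_name = getattr(first, "name", None)
second_name = getattr(second, "name", None)

# In bs4, class is a multi-valued attribute and comes back as a list
second_classes = second.get("class")
```

Note that under bs4 the class attribute is returned as a list, so a comparison such as `node.get('class') == 'end'` would need to become something like `'end' in node.get('class', [])`.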
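I know this answer is old, but it is outdated. I also find it inelegant, even dangerous. If the site structure changes for some reason and there is no longer any tag with class end, then once the next sibling becomes None ...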
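
One way to address the concern about a missing end marker is to iterate over next_siblings, which simply runs out when there are no more siblings. The following is a sketch under assumptions, not the original answerer's code: it uses bs4 with html.parser, and the function name siblings_between, its parameters, and the sample markup are hypothetical.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag


def siblings_between(soup, start_class="top", end_class="end"):
    """Collect the siblings after <p class=start_class> up to <p class=end_class>.

    Hypothetical helper: if the end marker is missing, the loop ends when
    the siblings run out instead of failing on None.
    """
    start = soup.find("p", class_=start_class)
    if start is None:
        return []
    collected = []
    for node in start.next_siblings:
        # Only real tags have a name and attributes; text nodes are skipped by this check
        if (isinstance(node, Tag) and node.name == "p"
                and end_class in node.get("class", [])):
            break
        collected.append(node)
    return collected


# Sample markup modelled on the question (the table body and img are left out)
doc = (
    '<p class="top">I don\'t want this</p>'
    '<p>I want this</p>'
    '<table></table>'
    '<p> and all that stuff too</p>'
    '<p class="end">But not this and nothing after it</p>'
)
wanted = siblings_between(BeautifulSoup(doc, "html.parser"))
wanted_names = [node.name for node in wanted]
```

With this design a document that lacks the end marker yields every sibling after the start marker. A caller who would rather treat that as an error could track whether the break was reached and raise otherwise.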