Python: BS4 class with a heading tag [python, beautifulsoup, python-3.8]


How can I parse a class to get only the text outside the heading tag, or get both pieces of text in a list?

<div class="footballMatchSummaryDef"><h1>Burnley v Aston Villa</h1>English Premier League at Turf Moor</div>

Burnley v Aston Villa English Premier League at Turf Moor

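I considered using a regular expression for the extraction, but figured Beautiful Soup must be able to handle it.

There are many solutions. One is to get the full text and then split it on a separator: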

from bs4 import BeautifulSoup

txt = '''<div class="footballMatchSummaryDef"><h1>Burnley v Aston Villa</h1>English Premier League at Turf Moor</div>'''

soup = BeautifulSoup(txt, 'html.parser')

lst = soup.select_one('.footballMatchSummaryDef').get_text(separator='|').split('|')
print(lst)

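Or use the bs4 navigation functions:

# Text inside the <h1> heading
print( soup.h1.text )
# First text node following the <h1> as a sibling
# (Beautiful Soup 4.4.0+ also accepts string=True, the newer name for text=True)
print( soup.h1.find_next_sibling(text=True) )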

Prints:

['Burnley v Aston Villa', 'English Premier League at Turf Moor']

Burnley v Aston Villa
English Premier League at Turf Moor
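
If the text itself might contain the `|` character, splitting on it would give wrong results. A minimal sketch of a variation that avoids the separator, assuming the same markup as above (the variable names `div`, `both`, and `outside` are only illustrative):

```python
from bs4 import BeautifulSoup

txt = '''<div class="footballMatchSummaryDef"><h1>Burnley v Aston Villa</h1>English Premier League at Turf Moor</div>'''

soup = BeautifulSoup(txt, 'html.parser')
div = soup.select_one('.footballMatchSummaryDef')

# Every text fragment inside the div, in document order, whitespace-stripped
both = list(div.stripped_strings)

# Only text nodes that are direct children of the div, i.e. outside the <h1>
outside = [s.strip() for s in div.find_all(string=True, recursive=False) if s.strip()]

print(both)
print(outside)
```

The first list holds both the heading and the trailing text; the second holds only the text outside the heading tag.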

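Thanks Andrej, the navigation functions are exactly what I was looking for and work well. I have only just started using Python, so coming from PHP it is all new to me. This next-sibling call is the part I needed: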

print( soup.h1.find_next_sibling(text=True) )
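
When scraping many pages, the heading or the trailing text may be missing on some of them, and the one-liner would then raise an `AttributeError` or return `None`. A hedged sketch of a small wrapper follows; the function name `text_after_heading` and its parameters are hypothetical, and it uses `string=True`, the newer name for the `text=True` argument in Beautiful Soup 4.4.0 and later:

```python
from bs4 import BeautifulSoup


def text_after_heading(html, selector):
    """Return the first text node following the <h1> inside `selector`, or None."""
    soup = BeautifulSoup(html, 'html.parser')
    container = soup.select_one(selector)
    # Guard against a missing container or a container without an <h1>
    if container is None or container.h1 is None:
        return None
    sibling = container.h1.find_next_sibling(string=True)
    # find_next_sibling returns None when no text node follows the heading
    return sibling.strip() if sibling is not None else None


html = '<div class="footballMatchSummaryDef"><h1>Burnley v Aston Villa</h1>English Premier League at Turf Moor</div>'
print(text_after_heading(html, '.footballMatchSummaryDef'))
```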