Python: finding strings with Beautiful Soup in a class
I am working on a class assignment. We have to collect information from an online book list that looks like the following:
<p class="css-38z03z"><strong>1. <a data-link-name="in body link" href="https://www.theguardian.com/books/2016/feb/01/100-best-nonfiction-books-of-all-time-the-sixth-extinction-elizabeth-kolbert">The Sixth Extinction by Elizabeth Kolbert (2014)</a> </strong><br/> An engrossing account of the looming catastrophe caused by ecology’s “neighbours from hell” – mankind.</p>
I have tried several different sibling tags, but without success. What should I do?

Just use .next:
from bs4 import BeautifulSoup
html = '''<p class="css-38z03z"><strong>1. <a data-link-name="in body link" href="https://www.theguardian.com/books/2016/feb/01/100-best-nonfiction-books-of-all-time-the-sixth-extinction-elizabeth-kolbert">The Sixth Extinction by Elizabeth Kolbert (2014)</a> </strong><br/> An engrossing account of the looming catastrophe caused by ecology’s “neighbours from hell” – mankind.</p>
'''
soup = BeautifulSoup(html, "html.parser")
print(soup.select_one('.css-38z03z br').next)
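A slightly more defensive variant of the .next lookup above, as a sketch only. It uses .next_sibling (the node that follows <br/> inside the same <p>) instead of .next, which as far as I know is a legacy alias for .next_element, and it strips the leading space. The helper name description_after_br and the shortened sample HTML are made up for illustration.

```python
from bs4 import BeautifulSoup, NavigableString

# Shortened stand-in for one list entry (hypothetical URL and text).
html = (
    '<p class="css-38z03z"><strong>1. '
    '<a href="https://example.com/book">Some Title by Some Author (2014)</a> '
    '</strong><br/> A short description of the book.</p>'
)


def description_after_br(p_tag):
    """Return the text that directly follows the first <br> in p_tag, or None."""
    br = p_tag.find("br")
    if br is None:
        return None
    node = br.next_sibling
    # Only accept a plain text node; another tag would mean a different layout.
    if isinstance(node, NavigableString):
        return node.strip()
    return None


soup = BeautifulSoup(html, "html.parser")
print(description_after_br(soup.select_one("p.css-38z03z")))
```

Returning None instead of raising makes it easier to spot paragraphs that do not follow the expected layout.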
This works for this particular example, but I am not sure how stable it is across the full set of pages you are working with:
from bs4 import BeautifulSoup
html = """
<p class="css-38z03z">
<strong>1.
<a data-link-name="in body link" href="https://www.theguardian.com/books/2016/feb/01/100-best-nonfiction-books-of-all-time-the-sixth-extinction-elizabeth-kolbert">The Sixth Extinction by Elizabeth Kolbert (2014)
</a>
</strong>
<br/> An engrossing account of the looming catastrophe caused by ecology’s “neighbours from hell” – mankind.
</p>"""
soup = BeautifulSoup(html, 'html.parser')
element_all = soup.find('p').text
element_unwanted = soup.find('strong').text
if element_unwanted in element_all:
    # Remove the numbered title text, then trim the surrounding whitespace
    element = element_all.replace(element_unwanted, '').strip()
    print(element)
Output:
An engrossing account of the looming catastrophe caused by ecology’s “neighbours from hell” – mankind.
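One thing to watch with the replace approach: it works on plain strings, so if the <strong> text happened to appear again inside the description, it would be removed there as well. A possible alternative, sketched here on a shortened and hypothetical sample, removes the <strong> tag from the parsed tree and reads what is left. Note that decompose() modifies the soup, so this is best done on a copy or a fresh parse if the title is still needed.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for one list entry (hypothetical URL and text).
html = """
<p class="css-38z03z">
<strong>1.
<a href="https://example.com/book">Some Title by Some Author (2014)
</a>
</strong>
<br/> A short description of the book.
</p>"""

soup = BeautifulSoup(html, "html.parser")
p = soup.find("p")

# Drop the numbered title block from the tree (this mutates `soup`).
strong = p.find("strong")
if strong is not None:
    strong.decompose()

# What remains is the text after <br/>; strip=True trims the whitespace.
description = p.get_text(strip=True)
print(description)
```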
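Since the assignment covers a whole list, the single-paragraph lookup can be wrapped in a loop. A rough sketch, assuming every entry is a <p class="css-38z03z"> shaped like the sample (an assumption; generated class names like this one may change between page versions). The three-paragraph HTML and the parse_entries helper are invented for illustration.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical page fragment: two book entries and one unrelated paragraph.
html = """
<p class="css-38z03z"><strong>1. <a href="https://example.com/one">First Title by First Author (2014)</a> </strong><br/> First description.</p>
<p class="css-38z03z"><strong>2. <a href="https://example.com/two">Second Title by Second Author (2010)</a> </strong><br/> Second description.</p>
<p class="css-38z03z">A paragraph without a link or line break.</p>
"""


def parse_entries(soup):
    """Collect title text, link, and description for each matching <p>."""
    entries = []
    for p in soup.select("p.css-38z03z"):
        link = p.select_one("strong a")
        br = p.find("br")
        if link is None or br is None:
            continue  # skip paragraphs that do not look like a book entry
        # Join everything after <br/>, so inline tags in the description are kept.
        description = "".join(
            node.get_text() if isinstance(node, Tag) else str(node)
            for node in br.next_siblings
        ).strip()
        entries.append(
            {
                "title": link.get_text(strip=True),
                "url": link.get("href"),
                "description": description,
            }
        )
    return entries


entries = parse_entries(BeautifulSoup(html, "html.parser"))
for entry in entries:
    print(entry["title"], "|", entry["description"])
```

Skipping paragraphs that lack a link or a <br/> keeps unrelated text that shares the same class out of the results.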