How to get an href link by its text in Python


This is part of the web page's HTML content:

<a href="https://www.cnbeta.com/articles/science/1062069.htm"><strong>阅读全文</strong></a>

You might want to try BeautifulSoup.

For example:

from bs4 import BeautifulSoup

sample_html = """
<a href="https://www.cnbeta.com/articles/science/1062069.htm"><strong>阅读全文</strong></a>
<a href="https://www.cnbeta.com/articles/science/1062068.htm"><strong>RANDOM TEXT!</strong></a>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# Keep only the <a> tags whose text starts with "阅"
links = soup.find_all(lambda t: t.name == "a" and t.text.startswith("阅"))

print([a["href"] for a in links])


Output:

['https://www.cnbeta.com/articles/science/1062069.htm']

Do you have any code? How are you parsing the HTML?
Does this answer your question?
No, that would probably find every <a> tag; I only want to find one. @Tomerikoo
The simplest way is to loop over all <a> tags and stop as soon as you find the one that contains this text.
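The loop-and-stop approach suggested in the comments could look roughly like the sketch below. The helper name `first_href_by_text` is hypothetical (it does not come from the thread), and matching by substring on the stripped link text is an assumption about what "contains this text" should mean.

```python
from bs4 import BeautifulSoup

sample_html = """
<a href="https://www.cnbeta.com/articles/science/1062069.htm"><strong>阅读全文</strong></a>
<a href="https://www.cnbeta.com/articles/science/1062068.htm"><strong>RANDOM TEXT!</strong></a>
"""


def first_href_by_text(html, text):
    """Return the href of the first <a> whose text contains `text`, or None.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    # href=True skips anchors that have no href attribute at all.
    for a in soup.find_all("a", href=True):
        if text in a.get_text(strip=True):
            return a["href"]  # stop at the first match
    return None


print(first_href_by_text(sample_html, "阅读全文"))
# prints https://www.cnbeta.com/articles/science/1062069.htm
```

If a single result is all that is needed, `soup.find(...)` with the same lambda as in the answer should also work, since `find` returns only the first matching tag (or `None` when nothing matches).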