Python: How do I extract a single piece of text from an <a> tag with Beautiful Soup?

python web-scraping

There are three pieces of text inside one tag, but I only need to extract one of them. Here is the code I wrote:

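import requests
from bs4 import BeautifulSoup

source = requests.get('eg.com')  # placeholder URL from the question
soup = BeautifulSoup(source.text, 'lxml')  # parse the response body, not the Response object
article = soup.find('div', class_='content')  # class_ because class is a reserved word in Python
b = article.li.a.text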

This returns all the text inside the tag. Output:

Apple 2 items

But I only need the first text, i.e. Apple. The HTML is as follows:

<li class ="iteam">
   <a href="eg.com">
      " Apple "
       <span class ="count">
          ::before
          "2"
          <span class ="countlabel">items</span>
          ::after
       </span>
   </a>
</li>

You can use .contents[0] to get the first child of the <a> tag (in this case, the text " Apple "):

from bs4 import BeautifulSoup


html_doc = """
<li class ="iteam">
<a href="eg.com">
" Apple "
<span class ="count">
::before
"2"
<span class ="countlabel">
items</span>
::after
</span>
</a>
</li>"""

soup = BeautifulSoup(html_doc, "html.parser")

# .contents[0] is the first child of <a>: the text node that precedes the <span>
txt = soup.find(class_="iteam").a.contents[0].strip()
print(txt)
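
As a complement to .contents[0], here is a minimal sketch of two other ways to get the same result, shown next to get_text() to make clear why .text returned everything. It assumes the quotes and the ::before / ::after markers in the question are browser dev-tools display artifacts rather than part of the real HTML, so the sample markup below is simplified accordingly; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Simplified markup (assumption: the real page has no literal quotes or ::before/::after text)
html_doc = """
<li class="iteam">
  <a href="eg.com">
    Apple
    <span class="count">2 <span class="countlabel">items</span></span>
  </a>
</li>"""

soup = BeautifulSoup(html_doc, "html.parser")
a_tag = soup.find("li", class_="iteam").a

# .text / get_text() joins every descendant string, so the nested <span> text is included
all_text = a_tag.get_text(" ", strip=True)

# Only look at direct children of <a> and take the first string among them
own_text = a_tag.find(string=True, recursive=False).strip()

# Alternatively, take the first non-empty string in document order
first_text = next(a_tag.stripped_strings)

print(all_text)
print(own_text)
print(first_text)
```

The recursive=False variant may be the more robust choice if the link text could ever appear after a nested element, since it only considers strings that belong to <a> itself; .contents[0] and stripped_strings both depend on the wanted text coming first.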