Python: Extracting tag values with BeautifulSoup

python parsing tags

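Can someone tell me how to extract the value of a tag using BeautifulSoup? I read the documentation, but I found it hard to navigate. For example, if I have: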

<span title="Funstuff" class="thisClass">Fun Text</span>

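How would I pull "Funstuff" out of it with BeautifulSoup/Python?

Edit: I am using version 3.2.1.

You need something that identifies the element you are looking for, and it is hard to tell from the question what that is.

For example, both of the following print "Funstuff" in BeautifulSoup 3. The first navigates to the span element and reads its title attribute; the second finds a span with the given class and reads its title. There are many other valid ways to get there.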

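# BeautifulSoup 3 runs on Python 2 only, hence the print statements
import BeautifulSoup
soup = BeautifulSoup.BeautifulSoup('<html><body><span title="Funstuff" class="thisClass">Fun Text</span></body></html>')
# Navigate to the first <span> and read its title attribute
print soup.html.body.span['title']
# Find the <span> by its class, then read its title attribute
print soup.find('span', {"class": "thisClass"})['title']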
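
For readers on Python 3, the same two lookups can be sketched with BeautifulSoup 4. This is only an illustration, assuming the bs4 package is installed; the variable names are made up for the example, and the question itself targets version 3.2.1.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 (bs4) is installed

html = '<html><body><span title="Funstuff" class="thisClass">Fun Text</span></body></html>'
soup = BeautifulSoup(html, "html.parser")  # standard-library parser, no extra dependency

# Navigate to the first <span> and read its title attribute
title_by_path = soup.html.body.span["title"]

# Search by tag name and CSS class, then read the title attribute
title_by_class = soup.find("span", class_="thisClass")["title"]

# .get() returns None instead of raising KeyError when the attribute is absent
missing = soup.span.get("data-missing")

print(title_by_path, title_by_class, missing)  # Funstuff Funstuff None
```

Indexing with ['title'] raises KeyError if the attribute is missing, so .get() is the safer choice when the markup is not under your control.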

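A tag's children are available through .contents. In your case, you can find the tag by its CSS class and then extract its content. Note that this returns the text inside the tag ("Fun Text"); the title attribute itself is read with ['title'].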

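from bs4 import BeautifulSoup
# Name the parser explicitly; bs4 warns when it has to guess one
soup = BeautifulSoup('<span title="Funstuff" class="thisClass">Fun Text</span>', 'html.parser')
# First child of the first element with class "thisClass"
print(soup.select('.thisClass')[0].contents[0])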
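
Since the question asks for the attribute while this answer returns the text, a short side-by-side may help. This is a minimal sketch, assuming BeautifulSoup 4 is installed; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 (bs4) is installed

soup = BeautifulSoup(
    '<span title="Funstuff" class="thisClass">Fun Text</span>', "html.parser"
)

tag = soup.select_one(".thisClass")  # first match for the CSS selector, or None

text = tag.get_text()  # the text between the opening and closing tags
title = tag["title"]   # the value of the title attribute
attrs = tag.attrs      # all attributes as a dict; class is stored as a list

print(text)
print(title)
print(attrs["class"])
```

The class attribute comes back as a list because HTML allows several class names on one element.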

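Aren't all the details needed? Is this BeautifulSoup 3 or BeautifulSoup 4?

Question: my import statement for BeautifulSoup is from BeautifulSoup import BeautifulSoup, CData. However, the code above only seems to work if I use import BeautifulSoup. Any idea why?

That is just how Python imports work. If you import the class directly with

from BeautifulSoup import BeautifulSoup

then the name BeautifulSoup refers to the class rather than the module, so change the line from

soup = BeautifulSoup.BeautifulSoup(...

to

soup = BeautifulSoup(...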
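
The difference between the two import forms can be tried without BeautifulSoup installed. The sketch below uses the standard-library datetime module as a stand-in, because it also contains a class with the same name as the module; the aliases are invented for the illustration.

```python
import inspect

# "import X" binds the module; the class is reached as an attribute of it
import datetime as dt_module

# "from X import X" binds the class itself, so no module prefix is used
from datetime import datetime as dt_class

via_module = dt_module.datetime(2020, 1, 2)
via_class = dt_class(2020, 1, 2)

print(inspect.ismodule(dt_module))      # True
print(inspect.isclass(dt_class))        # True
print(dt_module.datetime is dt_class)   # True: both names reach the same class
print(via_module == via_class)          # True
```

By the same logic, BeautifulSoup.BeautifulSoup(...) only works after import BeautifulSoup, and a bare BeautifulSoup(...) only works after from BeautifulSoup import BeautifulSoup.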