Python: Extracting the text before the first child tag with BeautifulSoup

Given this HTML source:

<div class="category_link">
  Category:
  <a href="/category/personal">Personal</a>
</div>

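I want to extract the text "Category:". I expected this text node to be available as the first child of the div. Any suggestions on how to solve this?

I'm fairly sure the following does what you want: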

parsed.find('a').previousSibling # or something like that

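This returns a NavigableString instance, which behaves almost exactly like a unicode instance; you can call unicode() on it to get a plain unicode object.

I'll see if I can test this and let you know.

Edit: I just confirmed that it works: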

>>> from BeautifulSoup import BeautifulSoup
>>> soup = BeautifulSoup('<div class=a>Category: <a href="/">a link</a></div>')
>>> soup.find('a')
<a href="/">a link</a>
>>> soup.find('a').previousSibling
u'Category: '
>>>
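The session above uses the old BeautifulSoup 3 import on Python 2. As a rough sketch of the same idea on Python 3, assuming the `bs4` package is installed and using the built-in `html.parser` backend (both are assumptions, not part of the original answer), the attribute is spelled `previous_sibling` and `str()` takes the place of `unicode()`:

```python
from bs4 import BeautifulSoup, NavigableString

# The HTML from the question, on one line.
html = ('<div class="category_link"> Category: '
        '<a href="/category/personal">Personal</a> </div>')

# "html.parser" is the stdlib-backed parser; other parsers may treat
# whitespace differently.
soup = BeautifulSoup(html, "html.parser")

# The text node sitting directly before the first <a> tag.
before = soup.find("a").previous_sibling

# NavigableString is a str subclass; convert and trim the whitespace.
label = str(before).strip()
print(label)

# The same node should also be reachable as the first child of the div.
first_child = soup.div.contents[0]
print(isinstance(first_child, NavigableString))
```

Stripping is needed here because the text node keeps the whitespace that surrounds "Category:" in the source.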

parsed.div.contents[0]
