Python: Isolating a div in Beautiful Soup (tags: python, beautifulsoup)


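I'm having a bit of trouble with BeautifulSoup and Python.

I'm trying to isolate the title from a link like this:

<a href="">TITLE</a>

This works perfectly, except that one div causes a problem. The usual one is:

<div class='box'>

The awkward one is:

<div class='box sponsored'>

How can I select only the first box and not the sponsored one?

Thanks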

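BeautifulSoup treats class as a multi-valued attribute. As its documentation puts it:

"Remember that a single tag can have multiple values for its 'class' attribute. When you search for a tag that matches a certain CSS class, you're matching against any of its CSS classes."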
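To make the multi-valued behaviour concrete, here is a minimal sketch using the two divs from the question. The variable names (html, classes, matched) are illustrative only, and the built-in html.parser is assumed as the parser.

```python
from bs4 import BeautifulSoup

# Sample markup modelled on the question (the text content is made up)
html = """
<div class='box'>test1</div>
<div class='box sponsored'>test2</div>
"""

soup = BeautifulSoup(html, "html.parser")

# bs4 stores the class attribute as a list of strings, one per class
classes = [div["class"] for div in soup.find_all("div")]
print(classes)

# Searching for class_="box" matches every div that has "box" among
# its classes, so the sponsored div is returned as well
matched = [div.get_text(strip=True) for div in soup.find_all("div", class_="box")]
print(matched)
```

This is why a plain class search returns both boxes: the sponsored div also carries the box class.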

One way to force BeautifulSoup to match only div elements whose class is exactly box is to use the following:

soup.select('div[class=box]')

Demo:

>>> from bs4 import BeautifulSoup
>>> 
>>> data = """
... <div class='box'>
...     test1
... </div>
... <div class='box sponsored'>
...     test2
... </div>
... """
>>> 
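>>> # name the parser explicitly so results do not depend on what is installed
>>> soup = BeautifulSoup(data, 'html.parser')
>>> 
>>> # [class=box] matches only when the whole class attribute equals "box"
>>> for div in soup.select('div[class=box]'):
...     print(div.text.strip())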
... 
test1
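For comparison, two other ways to get the same result are sketched below. These are not part of the answer above. They assume a reasonably recent bs4 release whose select() supports the :not() pseudo-class, and the variable names are illustrative.

```python
from bs4 import BeautifulSoup

data = """
<div class='box'>
    test1
</div>
<div class='box sponsored'>
    test2
</div>
"""

soup = BeautifulSoup(data, "html.parser")

# Option 1: keep divs that have the box class but not the sponsored class.
# Unlike [class=box], this still matches a div with other extra classes.
not_sponsored = [div.get_text(strip=True)
                 for div in soup.select("div.box:not(.sponsored)")]

# Option 2: compare the parsed class list directly, so only divs whose
# class attribute is exactly "box" are kept.
exact_box = [div.get_text(strip=True)
             for div in soup.find_all(
                 lambda tag: tag.name == "div" and tag.get("class") == ["box"])]

print(not_sponsored)
print(exact_box)
```

The two options differ in intent: the first excludes one specific class, while the second, like div[class=box], accepts only the exact single-class case.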