What is the difference between find() and find_all() in Beautiful Soup (Python)?
I am doing web scraping, but I am stuck and confused about find() and find_all(). For example, where should I use find_all() and where should I use find()? Also, where can these methods be used, such as in a for loop or on a ul/li list? Here is the code I tried:
from bs4 import BeautifulSoup
import requests
URL = "https://www.flipkart.com/offers-list/latest-launches?screen=dynamic&pk=..."
source = requests.get(URL)
soup = BeautifulSoup(source.content, 'html.parser')
divs = soup.find_all('div', class_='MDGhAp')
names = divs.find_all('a')
full_name = names.find_all('div', class_='iUmrbN').text
print(full_name)
and I got this error:
File "C:/Users/ASUS/Desktop/utube/sunil.py", line 9, in <module>
names = divs.find_all('a')
File "C:\Users\ASUS\AppData\Local\Programs\Python\Python38-32\lib\site-packages\bs4\element.py", line 1601, in __getattr__
raise AttributeError(
AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?
So, can anyone explain where I should use find() and where I should use find_all()?
This example may make it clearer:
from bs4 import BeautifulSoup
html = """
<ul>
<li>First</li>
<li>Second</li>
<li>Third</li>
</ul>
"""
soup = BeautifulSoup(html, 'html.parser')
for n in soup.find('li'):
    # find() returns only the first matching <li>; iterating over that
    # single Tag yields its children, here the text "First"
    print(n)
for n in soup.find_all('li'):
    # find_all() returns every matching <li> element
    print(n)
Result:
First
<li>First</li>
<li>Second</li>
<li>Third</li>
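Applied to the code in the question, the same distinction suggests looping over the ResultSet and calling find() on each individual Tag. This is a minimal sketch, assuming hypothetical inline markup: only the class names MDGhAp and iUmrbN come from the question, and the real page structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the product page in the question;
# only the class names MDGhAp and iUmrbN come from the original code.
html = """
<div class="MDGhAp"><a href="/p/1"><div class="iUmrbN">Phone A</div></a></div>
<div class="MDGhAp"><a href="/p/2"><div class="iUmrbN">Phone B</div></a></div>
<div class="MDGhAp"><a href="/p/3"><span>no name here</span></a></div>
"""

soup = BeautifulSoup(html, "html.parser")

full_names = []
# find_all() returns a ResultSet (a list of Tags), so loop over it ...
for div in soup.find_all("div", class_="MDGhAp"):
    # ... and call find() on each individual Tag to get one descendant.
    name_div = div.find("div", class_="iUmrbN")
    # find() returns None when nothing matches, so guard before using .text.
    if name_div is not None:
        full_names.append(name_div.text)

print(full_names)
```

Calling find_all() directly on the ResultSet, as in the question, is what raises the AttributeError; the ResultSet has to be iterated (or indexed) first.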
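find() - It returns the first match if the searched element is found in the page, and None otherwise. The return type for a match is <class 'bs4.element.Tag'>.
find_all() - It returns all the matches (i.e. it scans the entire document and returns every result). The return type is <class 'bs4.element.ResultSet'>.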
Output:
<class 'bs4.element.Tag'> <h3><a href="https://stackoverflow.com">current community</a>
</h3>
<class 'bs4.element.ResultSet'> [<h3><a href="https://stackoverflow.com">current community</a>
</h3>, <h3>
your communities </h3>, <h3><a href="https://stackexchange.com/sites">more stack exchange communities</a>
</h3>, <h3 class="w90 mx-auto ta-center p-ff-roboto-slab-bold fs-headline2 mb24">Questions are everywhere, answers are on Stack Overflow</h3>, <h3 class="w90 mx-auto ta-center p-ff-roboto-slab-bold fs-headline2 mb24">Learn and grow with Stack Overflow</h3>, <h3 class="mx-auto w90 wmx12 p-ff-roboto-slab-bold fs-headline2 mb24 lg:ta-center">Looking for a job?</h3>]
Iterating the Resultset
0 <h3><a href="https://stackoverflow.com">current community</a>
</h3>
1 <h3>
your communities </h3>
2 <h3><a href="https://stackexchange.com/sites">more stack exchange communities</a>
</h3>
3 <h3 class="w90 mx-auto ta-center p-ff-roboto-slab-bold fs-headline2 mb24">Questions are everywhere, answers are on Stack Overflow</h3>
4 <h3 class="w90 mx-auto ta-center p-ff-roboto-slab-bold fs-headline2 mb24">Learn and grow with Stack Overflow</h3>
5 <h3 class="mx-auto w90 wmx12 p-ff-roboto-slab-bold fs-headline2 mb24 lg:ta-center">Looking for a job?</h3>
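The same return types can be checked without a network request. This is a small sketch on an assumed inline HTML snippet, not the page that produced the output above; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup
from bs4.element import ResultSet, Tag

# Assumed sample markup, used only for illustration.
html = "<h3>first</h3><h3>second</h3><p>text</p>"
soup = BeautifulSoup(html, "html.parser")

first = soup.find("h3")      # a single Tag (the first match)
every = soup.find_all("h3")  # a ResultSet holding all matching Tags
print(type(first), first)
print(type(every), every)

# When nothing matches, find() gives None and find_all() gives an empty ResultSet.
missing_one = soup.find("table")
missing_all = soup.find_all("table")
print(missing_one, missing_all)

# A ResultSet behaves like a list, so it can be iterated or indexed.
for index, tag in enumerate(every):
    print(index, tag.text)
```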
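I found this in the Beautiful Soup documentation. If you are scraping something more specific from an <a> or <span>, try find(); if you are scraping something more general from an <a> or <span>, try find_all().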
soup.find_all('a')
# [<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>,
# <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>,
# <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]
soup.find(id="link3")
# <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>
Hope this helps.
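One more way to see the relationship: find() behaves like find_all() with limit=1, except that it returns the Tag itself rather than a one-element list. The sketch below assumes a shortened version of the three-link markup that the snippet above appears to be run against; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Assumed markup matching the three "sister" links shown above.
html = """
<p class="story">
<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>,
<a class="sister" href="http://example.com/lacie" id="link2">Lacie</a> and
<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>
</p>
"""
soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching Tag directly ...
first_link = soup.find("a")
# ... while find_all(limit=1) returns a list containing that same Tag.
limited = soup.find_all("a", limit=1)
print(first_link == limited[0])

# General query: collect something from every link.
hrefs = [a["href"] for a in soup.find_all("a")]
print(hrefs)

# Specific query: one element identified by its id.
tillie = soup.find(id="link3")
print(tillie.text)
```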