Python: Can't get all the data when scraping with BeautifulSoup

python selenium web-scraping

I am practicing using Selenium and BeautifulSoup to scrape my local e-commerce website. But when I run the code, I only get some of the values and cannot get the rest.

import os
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
import requests, time

option = Options()
option.add_argument("--disable-infobars")
option.add_argument("--disable-extensions")
# block any notifications
option.add_experimental_option("prefs", {
    "profile.default_content_setting_values.notifications": 2
})

driver = webdriver.Chrome(options=option, executable_path=os.path.abspath('chromedriver'))
driver.get('https://www.tokopedia.com/')
# click the 'action figure' category
driver.find_element_by_css_selector('.css-15j6m2y > div:nth-child(5) > div:nth-child(1) > div:nth-child(1) > a:nth-child(1)').click()
page = driver.page_source
driver.quit()

soup = BeautifulSoup(page, 'html.parser')
container = soup.find_all('div', attrs={'class': 'css-bk6tzz e1nlzfl3'})
for count, action in enumerate(container, 1):
    name = action.find('span', class_='css-1bjwylw').text
    price = action.find('span', class_='css-o5uqvq').text
    print('----')
    print(count)
    print(f'nama: {name}')
    print(f'harga: {price}')

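I don't get any errors, but somehow the script finishes without getting all the data. Is there something wrong with my approach?

This is my first question on StackOverflow; am I doing it right? Thanks in advance.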

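I have found the problem. Thanks to PApostol for asking about the class name.

The problem is that the page does not load all the products until the user scrolls down. It only shows the first 10 items. After I scrolled down, I got all the values.

Add this line to scroll to the bottom of the page: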

driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
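If the page keeps loading more products each time the bottom is reached, a single scroll may not be enough. The following is a minimal sketch of one way to handle that, not something the site is confirmed to need: scroll repeatedly until document.body.scrollHeight stops growing. The helper name scroll_until_stable and the parameters pause and max_rounds are hypothetical; driver is assumed to be the Selenium WebDriver created in the question.

```python
import time


def scroll_until_stable(driver, pause=2.0, max_rounds=20):
    """Scroll to the bottom repeatedly until the page height stops growing.

    `driver` is assumed to be a Selenium WebDriver; any object exposing
    execute_script() works. Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for round_no in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazily loaded products time to render
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return round_no  # this scroll loaded nothing new
        last_height = new_height
    return max_rounds  # gave up after max_rounds scrolls
```

Calling scroll_until_stable(driver) right before reading driver.page_source would take the place of the single scroll. The max_rounds cap is there so that an endlessly growing page cannot keep the script running forever.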

My code now looks like this:

import os
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
import requests, time

option = Options()
option.add_argument("--disable-infobars")
option.add_argument("--disable-extensions")
# block any notifications
option.add_experimental_option("prefs", {
    "profile.default_content_setting_values.notifications": 2
})

driver = webdriver.Chrome(options=option, executable_path=os.path.abspath('chromedriver'))
driver.get('https://www.tokopedia.com/')
# click the 'action figure' category
driver.find_element_by_css_selector('.css-15j6m2y > div:nth-child(5) > div:nth-child(1) > div:nth-child(1) > a:nth-child(1)').click()
# scroll to the bottom of the page
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
time.sleep(10)
page = driver.page_source

soup = BeautifulSoup(page, 'html.parser')
container = soup.find_all('div', attrs={'class': 'css-bk6tzz e1nlzfl3'})
for count, action in enumerate(container, 1):
    name = action.find('span', class_='css-1bjwylw').text
    price = action.find('span', class_='css-o5uqvq').text
    print('----')
    print(count)
    print(f'nama: {name}')
    print(f'harga: {price}')

I changed driver.quit() to time.sleep(20), because it takes time to load all the products.
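A fixed sleep either wastes time or is too short on a slow connection. As a rough sketch of an alternative, the helper below polls a condition until it holds or a timeout expires. The name wait_until, its parameters, and the threshold of 20 products in the usage comment are all hypothetical and not taken from the site. Selenium's own WebDriverWait covers the same need and may be the better fit in a real script.

```python
import time


def wait_until(condition, timeout=20.0, poll=0.5):
    """Call condition() until it returns a truthy value or timeout elapses.

    Returns the truthy value; raises TimeoutError if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Hypothetical usage with the driver from the answer: wait until the page
# source mentions the product container class at least 20 times.
# wait_until(lambda: driver.page_source.count("css-bk6tzz") >= 20)
```

The script then continues as soon as the products are present instead of always waiting the full 10 or 20 seconds.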

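Are you sure the class css-bk6tzz e1nlzfl3 exists in the page source of the website you are scraping? Most likely container is empty; can you print what is inside it?

@PApostol I can, because I searched for it in Inspect Element. The container is not empty, you can take a look at this. When I tried to reconfirm the class name, I found the problem: the page does not finish loading until the user scrolls to the bottom. It only shows the first 10 items. Thanks for your answer.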