Beautiful Soup finds nothing on Sainsbury's

pandas, beautifulsoup

Similar to an earlier question, but for a different site:

I tried running:

import os
from selenium import webdriver
from bs4 import BeautifulSoup as soup

url = 'https://www.sainsburys.co.uk/gol-ui/SearchDisplayView?filters[keyword]=banana'

# configure driver
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--headless")
chrome_driver = os.getcwd() + "\\chromedriver.exe"  # IF NOT IN SAME FOLDER CHANGE THIS PATH
driver = webdriver.Chrome(options=chrome_options, executable_path=chrome_driver)
driver.get(url)

page = driver.page_source
page_soup = soup(page, 'html.parser')

container_tag1 = 'pt__content'
containers = page_soup.findAll("div", {"class": container_tag1})
# print(containers)
print(len(containers))

to no avail.

I also tried it without Selenium, but that failed as well.
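
For reference, a no-Selenium attempt typically looks like the sketch below; the asker's exact code is not shown, so the use of requests and the User-Agent header are assumptions:

import requests
from bs4 import BeautifulSoup

url = 'https://www.sainsburys.co.uk/gol-ui/SearchDisplayView?filters[keyword]=banana'
response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"})
page_soup = BeautifulSoup(response.text, "html.parser")

# Likely prints 0: the raw HTML is only an app shell, and the product
# tiles ("pt__content") are filled in later by JavaScript.
print(len(page_soup.find_all("div", {"class": "pt__content"})))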


Any suggestions?

You have to wait for the page to fully render before passing the HTML to BeautifulSoup. One option is to use the sleep method from the built-in time module:

from time import sleep
from selenium import webdriver
from bs4 import BeautifulSoup

URL = "https://www.sainsburys.co.uk/gol-ui/SearchDisplayView?filters[keyword]=banana"

driver = webdriver.Chrome(r"c:\path\to\chromedriver.exe")
driver.get(URL)
sleep(5)  # <-- Wait for the page to fully render

soup = BeautifulSoup(driver.page_source, "html.parser")
print(soup.find_all("div", {"class": "pt__content"}))

Asker's comment: I added the sleep(5), and also tried a 15-second sleep, but it still doesn't work for me.
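
If a fixed sleep is not enough, an explicit wait is a common alternative: block until the product tiles actually appear instead of guessing a delay. This is a sketch, not part of the original answer; the 15-second timeout is an assumption and the pt__content class comes from the question:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup

URL = "https://www.sainsburys.co.uk/gol-ui/SearchDisplayView?filters[keyword]=banana"

driver = webdriver.Chrome(r"c:\path\to\chromedriver.exe")
driver.get(URL)

# Wait (up to 15 seconds) until at least one product tile is present,
# then hand the rendered HTML to BeautifulSoup as before.
WebDriverWait(driver, 15).until(
    EC.presence_of_element_located((By.CLASS_NAME, "pt__content"))
)

soup = BeautifulSoup(driver.page_source, "html.parser")
print(len(soup.find_all("div", {"class": "pt__content"})))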