Python "Max retries exceeded with url"
So I want to loop through an array of URLs and open each of them for web scraping with Selenium. The problem is that as soon as I hit the second browser.get(url), I get "Max retries exceeded with url" and "Failed to establish a new connection because the target machine actively refused it".

EDIT: Added the rest of the code, although it's just BeautifulSoup stuff.
from bs4 import BeautifulSoup
import time
from selenium import webdriver
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
import json

chrome_options = Options()
chromedriver = webdriver.Chrome(executable_path='C:/Users/andre/Downloads/chromedriver_win32/chromedriver.exe', options=chrome_options)

urlArr = ['https://link1', 'https://link2', '...']

for url in urlArr:
    with chromedriver as browser:
        browser.get(url)
        time.sleep(5)
        # Click a button
        chromedriver.find_elements_by_tag_name('a')[7].click()
        chromedriver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(2)
        for i in range(0, 2):
            chromedriver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            time.sleep(5)
        html = browser.page_source
        page_soup = BeautifulSoup(html, 'html.parser')
        boxes = page_soup.find("div", {"class": "rpBJOHq2PR60pnwJlUyP0"})
        videos = page_soup.findAll("video", {"class": "_1EQJpXY7ExS04odI1YBBlj"})
Other posts here say this happens when you use too many pages at once and the server locks you out, but that's not my problem. The error above appears whenever I call browser.get(url) more than once.
What's going on? Thanks.

Solved it. You have to recreate the webdriver:
from bs4 import BeautifulSoup
import time
from selenium import webdriver
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
import json

urlArr = ['https://link1', 'https://link2', '...']

for url in urlArr:
    chrome_options = Options()
    chromedriver = webdriver.Chrome(executable_path='C:/Users/andre/Downloads/chromedriver_win32/chromedriver.exe', options=chrome_options)
    with chromedriver as browser:
        browser.get(url)
        time.sleep(5)
        # Click a button
        chromedriver.find_elements_by_tag_name('a')[7].click()
        chromedriver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(2)
        for i in range(0, 2):
            chromedriver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            time.sleep(5)
        html = browser.page_source
        page_soup = BeautifulSoup(html, 'html.parser')
        boxes = page_soup.find("div", {"class": "rpBJOHq2PR60pnwJlUyP0"})
        videos = page_soup.findAll("video", {"class": "_1EQJpXY7ExS04odI1YBBlj"})
You solved nothing. You are now opening a new browser for every url (and never closing them).
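The comment is right: recreating the driver works around the symptom but leaks one browser per URL. The likely root cause in the question's code is the per-URL `with chromedriver as browser:` block, since WebDriver's context-manager exit quits the driver, so the second `browser.get(url)` talks to a dead session. A minimal sketch of the pattern the comment implies, create one driver, reuse it, quit exactly once (`scrape_all`, `make_driver`, and `scrape` are hypothetical helper names, not Selenium API):

```python
# Create ONE driver, reuse it for every URL, and quit() exactly once at the
# end, even if scraping raises. `make_driver` could be, e.g.,
# lambda: webdriver.Chrome(options=chrome_options).
def scrape_all(make_driver, urls, scrape):
    driver = make_driver()
    try:
        results = []
        for url in urls:
            driver.get(url)              # the same live session handles every URL
            results.append(scrape(driver))
        return results
    finally:
        driver.quit()                    # close the browser exactly once
```

With this shape, the question's loop body becomes the `scrape` callback, and neither the "Max retries" error nor the leaked-browser problem can occur.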