Python: cannot read a web page into BeautifulSoup from its URL

python selenium


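I am using Python and Selenium to try to scrape all of the links from the results page of a certain search page. No matter what I search for on the previous screen, the URL of the results page is always the same. If I run the search automatically with Selenium and then try to read this URL into BeautifulSoup, I get HTTPError: HTTP Error 404: Not Found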

Here is my code:

from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
from urllib.request import urlopen
from bs4 import BeautifulSoup
import csv


# create a new Firefox session
driver = webdriver.Firefox()
# implicit wait: element lookups are retried for up to 3 seconds
driver.implicitly_wait(3)

# navigate to ChemIDPlus Website
driver.get("https://chem.nlm.nih.gov/chemidplus/")
#raise the implicit wait to 10 seconds so the drop-down menu has time to load
driver.implicitly_wait(10)

#open drop-down menu QV7 ("Route:")
select=Select(driver.find_element(By.NAME, "QV7"))
#select "inhalation" in QV7
select.select_by_visible_text("inhalation")
#identify submit button
search="/html/body/div[2]/div/div[2]/div/div[2]/form/div[1]/div/span/button[1]"

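I suspect this has something to do with a proxy server, and that Python is not receiving the information it needs to identify the page, but I am not sure how to fix it.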

Thanks in advance.

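As a workaround for identifying the correct results page, I used Selenium to read the new URL: url1=driver.current_url. Next, I used requests to fetch the content and fed it to BeautifulSoup. In summary, I added: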

#Added to the top of the script
import requests
...
#identify the current search page with Selenium
url1=driver.current_url
#scrape the content of the results page
r=requests.get(url1)
soup=BeautifulSoup(r.content, 'html.parser')
...
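
Since the browser session already holds the rendered results page, another option is to hand driver.page_source to BeautifulSoup directly instead of downloading the page a second time. Below is a minimal sketch of the link extraction, assuming the links of interest are ordinary <a href> elements. The helper extract_links, the sample HTML, and the example.org URLs are made up for illustration.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return the absolute URL of every <a href="..."> in html, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        # urljoin resolves relative hrefs against the URL of the page
        links.append(urljoin(base_url, anchor["href"]))
    return links


# Stand-in HTML for illustration only; in the script above this would be
# driver.page_source, and the base URL would be driver.current_url.
sample_html = """
<html><body>
  <a href="/item/1">first</a>
  <a href="detail?id=2">second</a>
  <a href="https://example.com/other">third</a>
  <a name="no-href">skipped, it has no href</a>
</body></html>
"""
sample_links = extract_links(sample_html, "https://example.org/search/")
for link in sample_links:
    print(link)
```

In the script above this would be called as extract_links(driver.page_source, driver.current_url) once the results page has loaded. Whether that page exposes its results as plain anchors has not been checked here.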
