Python: How to rotate the IP address of a Selenium web browser

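I have a Python script that visits a website every 30 seconds, and I need a different IP address each time.

What is the best / most time-efficient solution?

  • Scraping free proxies online? Do you know of a Python script that gathers proxies from many sources?

  • Using the Tor browser to get a different IP each time? (I am running Selenium on an AWS EC2 instance; do you know of a tutorial on using the Tor browser on an Ubuntu server?)

  • Other methods?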


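To rotate the IP address, a robust approach is to open the website through a fresh, active proxy each time, picked from a free proxy list (the code below scrapes https://sslproxies.org/):

  • Code block: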

    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
    # executable_path is the Selenium 3 style; Selenium 4 expects a Service object instead.
    CHROMEDRIVER = r'C:\WebDrivers\chromedriver.exe'
    TABLE = "//table[@class='table table-striped table-bordered dataTable']"
    
    # Step 1: scrape the IP and port columns of the proxy table.
    options = webdriver.ChromeOptions()
    options.add_argument("start-maximized")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    driver = webdriver.Chrome(options=options, executable_path=CHROMEDRIVER)
    driver.get("https://sslproxies.org/")
    driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, TABLE + "//th[contains(., 'IP Address')]"))))
    ips = [my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, TABLE + "//tbody//tr[@role='row']/td[position() = 1]")))]
    ports = [my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, TABLE + "//tbody//tr[@role='row']/td[position() = 2]")))]
    driver.quit()
    
    # Pair each IP with its port as "ip:port".
    proxies = [ip + ':' + port for ip, port in zip(ips, ports)]
    print(proxies)
    
    # Step 2: try each proxy in turn until one loads the proxy-check page.
    for proxy in proxies:
        print("Proxy selected: {}".format(proxy))
        options = webdriver.ChromeOptions()
        options.add_argument('--proxy-server={}'.format(proxy))
        driver = webdriver.Chrome(options=options, executable_path=CHROMEDRIVER)
        try:
            driver.get("https://www.whatismyip.com/proxy-check/?iref=home")
            # Test the element's text, not the WebElement object itself.
            if "Proxy Type" in WebDriverWait(driver, 5).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p.card-text"))).text:
                print("Proxy Invoked")
                break  # keep this driver open; it is using a working proxy
        except Exception:
            pass  # dead or slow proxy: fall through and try the next one
        driver.quit()
    else:
        print("No working proxy found")
    
  • Console output:

    ['190.7.158.58:39871', '175.139.179.65:54980', '186.225.45.146:45672', '185.41.99.100:41258', '43.230.157.153:52986', '182.23.32.66:30898', '36.37.160.253:31450', '93.170.15.214:56305', '36.67.223.67:43628', '78.26.172.44:52490', '36.83.135.183:3128', '34.74.180.144:3128', '206.189.122.177:3128', '103.194.192.42:55546', '70.102.86.204:8080', '117.254.216.97:23500', '171.100.221.137:8080', '125.166.176.153:8080', '185.146.112.24:8080', '35.237.104.97:3128']
    
    Proxy selected: 190.7.158.58:39871
    
    Proxy selected: 175.139.179.65:54980
    
    Proxy selected: 186.225.45.146:45672
    
    Proxy selected: 185.41.99.100:41258
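
The answer above stops at the first working proxy, while the question asks for a different IP on every visit. As a rough sketch of that rotation step, assuming you already have a list of verified `ip:port` strings, something like the following could hand out the next proxy for each visit. The names `proxy_cycle`, `chrome_proxy_argument` and `visit_with_rotation` are made up for this illustration, and it assumes chromedriver can be found without an explicit path.

```python
import itertools
import random
import time


def proxy_cycle(proxies, shuffle=True, seed=None):
    """Return an endless iterator over the distinct proxies in `proxies`."""
    pool = list(dict.fromkeys(proxies))  # drop duplicates, keep first-seen order
    if not pool:
        raise ValueError("proxy list is empty")
    if shuffle:
        random.Random(seed).shuffle(pool)
    return itertools.cycle(pool)


def chrome_proxy_argument(proxy):
    """Build the same Chrome switch the answer passes to add_argument()."""
    return "--proxy-server={}".format(proxy)


def visit_with_rotation(url, proxies, visits, delay=30):
    """Open `url` `visits` times, each time through the next proxy in the pool."""
    # Imported lazily so the two helpers above can be used without Selenium.
    from selenium import webdriver

    rotation = proxy_cycle(proxies)
    for n in range(visits):
        options = webdriver.ChromeOptions()
        options.add_argument(chrome_proxy_argument(next(rotation)))
        # Assumption: chromedriver is discoverable (on PATH or via Selenium Manager).
        driver = webdriver.Chrome(options=options)
        try:
            driver.get(url)
        finally:
            driver.quit()  # always release the browser, even if the proxy fails
        if n < visits - 1:
            time.sleep(delay)
```

With at least two distinct proxies in the pool, two consecutive visits never share a proxy. Free proxies tend to die quickly, so in practice the pool would probably need to be re-scraped and re-checked from time to time.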
    

Just a heads-up: some websites block known Tor exit node IPs. We tried using Tor for this purpose and found that some sites behind services such as CloudFlare blocked us, although it did work for other sites. Also, if memory serves, we did not use the Tor browser; we used a Tor plugin for Chrome on Chromedriver.

Thank you for your time and your reply. It works well for me! I have started looking into an open-source project called ProxyBroker. It has a "serve" feature that runs a local proxy server, which distributes incoming requests to external proxies. Here is the code. I think it would be faster to grab a different working proxy from a pool each time I open a Selenium browser, but I don't know how to do that in a robust way. (I am using Firefox, by the way.) Tim
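
Since the last comment mentions Firefox and a local proxy server, here is a minimal sketch of pointing Selenium's Firefox at a proxy through Firefox's manual-proxy preferences. The helper names `firefox_proxy_prefs` and `firefox_with_proxy` are hypothetical, the address `127.0.0.1:8888` is only a placeholder for wherever a local rotating proxy might listen, and geckodriver is assumed to be installed and discoverable.

```python
def firefox_proxy_prefs(proxy):
    """Map a "host:port" string to Firefox manual-proxy preferences."""
    host, _, port = proxy.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError("expected 'host:port', got {!r}".format(proxy))
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.http": host,
        "network.proxy.http_port": int(port),
        "network.proxy.ssl": host,  # proxy used for HTTPS traffic
        "network.proxy.ssl_port": int(port),
    }


def firefox_with_proxy(proxy):
    """Start Firefox with every HTTP/HTTPS request routed through `proxy`."""
    # Imported lazily so firefox_proxy_prefs() can be used without Selenium.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    for name, value in firefox_proxy_prefs(proxy).items():
        options.set_preference(name, value)
    # Assumption: geckodriver is discoverable (on PATH or via Selenium Manager).
    return webdriver.Firefox(options=options)


if __name__ == "__main__":
    # Placeholder address for a local proxy server; adjust to your setup.
    print(firefox_proxy_prefs("127.0.0.1:8888"))
```

If the browser always talks to one local address and that local server spreads requests over external proxies, the Selenium side never has to change; passing a different `ip:port` from a pool on each launch would be the other option.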