How to use Selenium with Python in a multithreaded way


Hey guys, I am trying to use threading together with selenium. My code is:

import threading  as th
import time
import base64
import mysql.connector as mysql
import requests
from bs4 import BeautifulSoup
from seleniumwire import webdriver
from selenium.webdriver.chrome.options import Options
from functions import *

options = Options()
prefs = {'profile.default_content_setting_values': {'images': 2,'popups': 2, 'geolocation': 2, 
                            'notifications': 2, 'auto_select_certificate': 2, 'fullscreen': 2, 
                            'mouselock': 2, 'mixed_script': 2, 'media_stream': 2, 
                            'media_stream_mic': 2, 'media_stream_camera': 2, 'protocol_handlers': 2, 
                            'ppapi_broker': 2, 'automatic_downloads': 2, 'midi_sysex': 2, 
                            'push_messaging': 2, 'ssl_cert_decisions': 2, 'metro_switch_to_desktop': 2, 
                            'protected_media_identifier': 2, 'app_banner': 2, 'site_engagement': 2, 
                            'durable_storage': 2}}
print('Crawling process started')
options.add_experimental_option('prefs', prefs)
driver = webdriver.Chrome(executable_path='chromedriver.exe', options=options)
driver.set_page_load_timeout(50000)
urls='https://google.com https://youtube.com'
def getinf(url_):
    driver.get(url_)
    soup=BeautifulSoup(driver.page_source, 'html5lib')
    print(soup.select('title'))
for url in urls.split():
    t=th.Thread(target=getinf, args=(url,))
    t.start()

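When the script runs, the tabs do not open all at once (from the threads) as I expected. Instead, the pages are processed one after another, and only the title of the last URL is printed. When I tried multiprocessing, the program crashed many times. I am building a web crawler, and some websites (such as Twitter) require JavaScript to display their content, so I cannot use requests or urllib. What is the solution to this problem? Suggestions for any other libraries are welcome.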

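Try creating the chromedriver inside the thread code. Otherwise you have only one driver, and every thread changes the URL of that same driver. Instead, create a separate chromedriver for each thread.

Note: I have not tried this code; it is only a suggestion.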

import threading  as th
import time
import base64
import mysql.connector as mysql
import requests
from bs4 import BeautifulSoup
from seleniumwire import webdriver
from selenium.webdriver.chrome.options import Options
from functions import *

options = Options()
prefs = {'profile.default_content_setting_values': {'images': 2,'popups': 2, 'geolocation': 2, 
                            'notifications': 2, 'auto_select_certificate': 2, 'fullscreen': 2, 
                            'mouselock': 2, 'mixed_script': 2, 'media_stream': 2, 
                            'media_stream_mic': 2, 'media_stream_camera': 2, 'protocol_handlers': 2, 
                            'ppapi_broker': 2, 'automatic_downloads': 2, 'midi_sysex': 2, 
                            'push_messaging': 2, 'ssl_cert_decisions': 2, 'metro_switch_to_desktop': 2, 
                            'protected_media_identifier': 2, 'app_banner': 2, 'site_engagement': 2, 
                            'durable_storage': 2}}
print('Crawling process started')
options.add_experimental_option('prefs', prefs)
urls='https://google.com https://youtube.com'
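def getinf(url_):
    # Create a separate driver inside each thread so the threads do not share one browser
    driver = webdriver.Chrome(executable_path='chromedriver.exe', options=options)
    try:
        driver.set_page_load_timeout(50000)
        driver.get(url_)
        soup = BeautifulSoup(driver.page_source, 'html5lib')
        print(soup.select('title'))
    finally:
        # Always close the browser, even if the page load fails
        driver.quit()

threads = [th.Thread(target=getinf, args=(url,)) for url in urls.split()]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every page to finish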
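
With more than a handful of URLs, starting one thread (and one browser) per URL can get heavy. A minimal sketch of the same one-driver-per-task idea with a bounded thread pool might look like the following. The names `make_chrome_driver`, `fetch_title`, and `crawl` are made up for illustration, the headless flag is an assumption, and plain `selenium` is used here instead of `seleniumwire`. Like the code above, it is a suggestion rather than something tested against real sites.

```python
from concurrent.futures import ThreadPoolExecutor


def make_chrome_driver():
    """Build a fresh Chrome driver (hypothetical helper, not from the original post)."""
    # Imported lazily so this module can be loaded without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")  # assumption: no visible window is needed
    return webdriver.Chrome(options=options)


def fetch_title(url, driver_factory=make_chrome_driver):
    """Open url in its own driver and return (url, page title)."""
    driver = driver_factory()  # one driver per task, never shared between threads
    try:
        driver.get(url)
        return url, driver.title
    finally:
        driver.quit()  # release the browser even if the page load fails


def crawl(urls, driver_factory=make_chrome_driver, max_workers=4):
    """Fetch all titles with at most max_workers browsers open at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input urls
        return list(pool.map(lambda u: fetch_title(u, driver_factory), urls))
```

Calling `crawl(['https://google.com', 'https://youtube.com'])` would return a list of `(url, title)` pairs. Passing the driver factory as a parameter also makes it possible to swap in a different browser or a stub without touching the crawling logic.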

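Both YouTube and Twitter have Python APIs.

I am using the selenium driver because I don't want to develop separate software for YouTube, Twitter, etc. to extract data. I want a single complete solution. How would I do that?

If it has to be Python, there is Pyppeteer; otherwise Puppeteer is the better choice.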
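
For anyone curious what the Pyppeteer route might look like, here is a rough, untested sketch. It uses asyncio instead of threads and opens one page per URL inside a single browser. The function names `fetch_title` and `crawl` and the `launcher` parameter are invented for this example; only `launch`, `newPage`, `goto`, `title`, and `close` are Pyppeteer calls.

```python
import asyncio


async def fetch_title(browser, url):
    """Open url in a new page of the shared browser and return (url, title)."""
    page = await browser.newPage()
    try:
        await page.goto(url)
        return url, await page.title()
    finally:
        await page.close()


async def crawl(urls, launcher=None):
    """Load all urls concurrently; launcher is a hypothetical hook for swapping the browser."""
    if launcher is None:
        # Imported lazily so the module loads even if pyppeteer is not installed.
        from pyppeteer import launch as launcher
    browser = await launcher(headless=True)
    try:
        # gather runs the page loads concurrently and keeps the input order
        return await asyncio.gather(*(fetch_title(browser, u) for u in urls))
    finally:
        await browser.close()
```

It would be run with something like `asyncio.run(crawl(['https://google.com', 'https://youtube.com']))`.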