Python web scraping: I have a website with select lists. How do I extract the text in those lists?
Tags: python, selenium, web-scraping
The link is in the code below. I need to get each profession and its corresponding specialties, but my code only works for pulling the professions:
import requests, bs4

r = requests.get('https://www.doximity.com/sign_ups/9e016f85-d589-4cdf-8240-09c356d4434f/edit?sign_up[user_attributes][firstname]=Jian&sign_up[user_attributes][lastname]=Cui')
soup = bs4.BeautifulSoup(r.text, 'lxml')
spec = soup.find_all('select')
for sub in spec:
    print(sub.text)
Please give me some ideas.
Check the code below, and let me know if you run into any problems:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select
import time

driver = webdriver.Chrome()
url = 'https://www.doximity.com/sign_ups/9e016f85-d589-4cdf-8240-09c356d4434f/edit?sign_up[user_attributes][firstname]=Jian&sign_up[user_attributes][lastname]=Cui'
driver.get(url)

# The first dropdown lists the professions.
spec = driver.find_element(By.ID, "sign_up_user_attributes_credential_id")
for sub in spec.find_elements(By.XPATH, './option | ./optgroup/option'):
    # Skip placeholder options that have no value.
    if sub.get_attribute('value') != '':
        print(sub.text)
        # Select the profession so that its specialties become available.
        selected_spec = Select(driver.find_element(By.ID, "sign_up_user_attributes_credential_id"))
        selected_spec.select_by_visible_text(sub.text)
        time.sleep(0.5)  # give the specialty dropdown time to update
        occup = driver.find_element(By.XPATH, '//select[@id="sign_up_user_attributes_user_professional_detail_attributes_specialty_id"]')
        for oc in occup.find_elements(By.XPATH, './option'):
            if oc.text != '' and oc.get_attribute('value') != '':
                print(oc.text)
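You need Selenium to do this. BeautifulSoup is not designed for interacting with dynamic websites, which is the case here: you have to select a profession before you can get its specialties.
Got it. I'll give it a try.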
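To illustrate why a static fetch only yields professions, here is a minimal sketch using only the standard library. The HTML string is a hypothetical stand-in for what a plain HTTP request might return (the option labels `MD` and `DO` and the `optgroup` label are invented for illustration; only the two `id` values come from the code above). It assumes the specialty `<select>` is empty in the static markup and is filled in by the page only after a profession is chosen.

```python
from html.parser import HTMLParser

# Hypothetical static markup: the profession <select> ships with options,
# while the specialty <select> is assumed to be empty until a profession
# is chosen in the browser.
STATIC_HTML = """
<select id="sign_up_user_attributes_credential_id">
  <option value="">Select profession</option>
  <optgroup label="Physicians">
    <option value="1">MD</option>
    <option value="2">DO</option>
  </optgroup>
</select>
<select id="sign_up_user_attributes_user_professional_detail_attributes_specialty_id">
</select>
"""


class SelectOptionParser(HTMLParser):
    """Collect option texts per <select> id, skipping options with no value."""

    def __init__(self):
        super().__init__()
        self.options = {}        # select id -> list of option texts
        self._select_id = None   # id of the <select> currently open, if any
        self._in_option = False  # True while inside an option that has a value

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select":
            self._select_id = attrs.get("id")
            self.options[self._select_id] = []
        elif tag == "option" and self._select_id is not None:
            # Placeholder options have an empty or missing value attribute.
            self._in_option = bool(attrs.get("value"))
            if self._in_option:
                self.options[self._select_id].append("")

    def handle_endtag(self, tag):
        if tag == "option":
            self._in_option = False
        elif tag == "select":
            self._select_id = None

    def handle_data(self, data):
        if self._in_option:
            self.options[self._select_id][-1] += data


def options_by_select(html):
    """Return {select id: [option text, ...]} for the given HTML string."""
    parser = SelectOptionParser()
    parser.feed(html)
    return {sid: [t.strip() for t in texts] for sid, texts in parser.options.items()}


if __name__ == "__main__":
    for select_id, texts in options_by_select(STATIC_HTML).items():
        print(select_id, texts)
```

Under these assumptions the specialty list comes back empty, which matches the behaviour described in the question; a browser-driven tool such as Selenium is what triggers the page to populate it.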