Python Selenium: select the first element from a drop-down list


The code below clicks the "File" menu on a page that contains an Excel worksheet.

from selenium import webdriver
driver = webdriver.PhantomJS()
driver.set_window_size(1120, 550)
driver.get(r"foo%20Data%20235.xlsx&DefaultItemOpen=3") # dummy link
driver.find_element_by_css_selector('#jewel-button-middle > span').click() # responsible for clicking the file menu
driver.quit()
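I don't know how to click the first option in the pop-up menu, which is the "Download a Snapshot" option. I can't inspect the elements of the pop-up or drop-down menu. I want to download the xlsx file.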


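Elements like these (drop-downs that close on their own) are easier to inspect in Firefox: open the developer tools, select the option on the Firebug toolbar (marked with a red box in the picture), and then hover the mouse cursor over the element.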

As for the question, the locator you are looking for is
('[id*="downloadsnashot"] > span')


I observed that the File menu does not show any options until the Excel workbook has fully loaded. So wait for the workbook to load first.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from time import sleep
from selenium.webdriver.common.action_chains import ActionChains

browser = webdriver.PhantomJS()
browser.maximize_window()
browser.get('http://www.cbe.org.eg/en/EconomicResearch/Publications/_layouts/xlviewer.aspx?id=/MonthlyStatisticaclBulletinDL/External%20Sector%20Data%20235.xlsx&DefaultItemOpen=1#')

wait = WebDriverWait(browser, 10)
element = wait.until(EC.visibility_of_element_located((By.XPATH, "//td[@data-range='B59']")))
element = wait.until(EC.element_to_be_clickable((By.ID, 'jewel-button-middle')))
element.click()
eleDownload = wait.until(EC.element_to_be_clickable((By.XPATH,"//span[text()='Download a Snapshot']")))
eleDownload.click()
sleep(5)
browser.quit()
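
The waits in the code above all follow one pattern: poll a condition until it returns a truthy value or a timeout expires. The sketch below reimplements that pattern in plain Python purely as an illustration. `wait_until` and its parameters are hypothetical names, not part of Selenium, and Selenium's own `WebDriverWait` additionally ignores selected exceptions between polls.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed. Hypothetical helper, not Selenium's implementation.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


# Stand-in for "the workbook has finished loading": truthy on the third poll.
polls = []


def workbook_loaded():
    polls.append(1)
    return len(polls) >= 3


wait_until(workbook_loaded, timeout=5, poll_interval=0.1)
print("ready after %d polls" % len(polls))
```

Raising the timeout (for example from 10 to 20 seconds) only changes how long the loop keeps polling; a condition that becomes true early still returns immediately.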

Find the elements by id or tag, check the options in a loop, pick the one you need, and click it.
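
A minimal sketch of that loop, assuming each menu entry exposes `.text` and `.click()` the way Selenium elements do. The helper `click_first_match`, the `FakeElement` stand-in, and the "Open in Excel" entry are all made up for illustration so the example runs without a browser.

```python
def click_first_match(elements, wanted_text):
    """Click the first element whose visible text contains `wanted_text`.

    `elements` is any iterable of objects with `.text` and `.click()`.
    Returns the clicked element, or None if nothing matched.
    """
    for element in elements:
        if wanted_text in element.text:
            element.click()
            return element
    return None


class FakeElement:
    """Stand-in for a Selenium element (hypothetical, for this demo only)."""

    def __init__(self, text):
        self.text = text
        self.clicked = False

    def click(self):
        self.clicked = True


# Invented menu contents; only "Download a Snapshot" comes from the question.
menu = [FakeElement("Open in Excel"), FakeElement("Download a Snapshot")]
chosen = click_first_match(menu, "Download a Snapshot")
print(chosen.text, chosen.clicked)
```

With a real driver, the list passed in would be whatever the element lookup returns for the menu entries, and the lookup should run only after the menu has been opened and the workbook has loaded.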

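The approach is to load the page with PhantomJS, wait for the workbook contents to load, and collect every parameter needed for the request to the file-download handler endpoint. With those parameters we can fetch the file using requests.

Complete working solution: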

import json

import requests
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

WORKBOOK_TYPE = "PublishedItemsSnapshot"

driver = webdriver.PhantomJS()
driver.maximize_window()
driver.get('http://www.cbe.org.eg/en/EconomicResearch/Publications/_layouts/xlviewer.aspx?id=/MonthlyStatisticaclBulletinDL/External%20Sector%20Data%20235.xlsx&DefaultItemOpen=1#')

wait = WebDriverWait(driver, 10)
wait.until(EC.presence_of_element_located((By.ID, "ctl00_PlaceHolderMain_m_excelWebRenderer_ewaCtl_rowHeadersDiv")))

# get workbook uri
hidden_input = wait.until(EC.presence_of_element_located((By.ID, "ctl00_PlaceHolderMain_m_excelWebRenderer_ewaCtl_m_workbookContextJson")))
workbook_uri = json.loads(hidden_input.get_attribute('value'))['EncryptedWorkbookUri']

# get session id
session_id = driver.find_element_by_id("ctl00_PlaceHolderMain_m_excelWebRenderer_ewaCtl_m_workbookId").get_attribute("value")

# get workbook filename
workbook_filename = driver.find_element_by_xpath("//h2[contains(@class, 's4-mini-header')]/span[contains(., '.xlsx')]").text

driver.close()

print("Downloading workbook '%s'..." % workbook_filename)
response = requests.get("http://www.cbe.org.eg/en/EconomicResearch/Publications/_layouts/XlFileHandler.aspx", params={
    'id': workbook_uri,
    'sessionId': session_id,
    'workbookFileName': workbook_filename,
    'workbookType': WORKBOOK_TYPE
})
with open(workbook_filename, 'wb') as f:
    for chunk in response.iter_content(chunk_size=1024):
        if chunk: # filter out keep-alive new chunks
            f.write(chunk)
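
To make the final request easier to inspect, the sketch below assembles the same query string with the standard library instead of sending it. The function name `build_download_url` and the placeholder values are assumptions for illustration; the handler path and the four parameter names are taken from the solution above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

HANDLER_URL = ("http://www.cbe.org.eg/en/EconomicResearch/Publications/"
               "_layouts/XlFileHandler.aspx")


def build_download_url(workbook_uri, session_id, workbook_filename,
                       workbook_type="PublishedItemsSnapshot"):
    """Return the handler URL with the four query parameters percent-encoded."""
    params = {
        "id": workbook_uri,
        "sessionId": session_id,
        "workbookFileName": workbook_filename,
        "workbookType": workbook_type,
    }
    return HANDLER_URL + "?" + urlencode(params)


# Placeholder values; the real ones are read from the loaded page.
url = build_download_url("ENCRYPTED-URI", "SESSION-ID",
                         "External Sector Data 235.xlsx")
print(url)

# Decoding the query again shows that the values survive the encoding.
print(parse_qs(urlsplit(url).query)["workbookFileName"])
```

Printing the URL before downloading is a cheap way to check that the session id and the encrypted workbook URI were actually picked up from the page.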

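If possible, could you share the actual link? Or could you dump driver.page_source after opening this page and add the relevant part of the HTML (the menu with the "Download a Snapshot" link) to the question? Thanks.

@alecxe Hi.. here is the link

Note that the download takes 1-2 seconds, so you may need a delay before driver.quit()

I tried taking a screenshot after the driver clicked the "File" menu, but the drop-down does not show up in the screenshot.

So it should wait up to 10 seconds until the element is visible, right?

@AvinashRaj. You can read more about waits.

@AvinashRaj I edited my answer. If your internet connection is slow, you need to wait for the page to load before clicking the File tab; increase the timeout from 10 seconds to 20 seconds. I noticed that the Excel workbook takes time to load.

oki, cool.. but I want to do this through webdriver.PhantomJS

Just replace the line webdriver.Firefox() with webdriver.PhantomJS(); the rest is the same.

Your code seems to work with the Firefox web driver, but on phantom it throws an exception on line 13.. How do I download the file? Note that I am using phantomjs..

Use the following answer to save files automatically in the Firefox browser. In PhantomJS this seems to be a problem.

Hi @alecxe. Thanks for your answer. Is there a way to use phantomjs only for the last step, i.e. when phantomjs clicks the Download a Snapshot button, it gets the download url and hands that url to the requests library?

@AvinashRaj Glad it worked for you. I believe PhantomJS cannot save downloads automatically. The download link is built dynamically, and I'm not sure there is an easy way to get it other than building it manually as we are doing now..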