Is there a way to make this Python Selenium code work in headless mode?

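I asked this question earlier and got that part working. I eventually realized that the code wasn't working because it was running in headless mode.

In my previous post I also mentioned that I would try using requests to fetch the file, but in this case there doesn't seem to be a link that points to the csv file.

The code is basically below: it clicks the "All Years" button and then the "Download Historical Data" button. Selenium then tries to save the file after the click.

But as I said, it only downloads the file when I run in normal mode; in headless mode it doesn't seem to work. Is there a reason for this? Is there a way to make it work in headless mode? I've been looking around but couldn't find an answer.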


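from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
import time

start_time = time.time()
options = Options()
# downloads work only while this line stays commented out
#options.add_argument("--headless")
options.add_argument("--disable-gpu")
options.add_argument("--disable-extensions")
options.add_experimental_option("prefs", {
  "download.default_directory": r"'/home/Documents/testing/macrotrends'",
  "download.prompt_for_download": False,
  "download.directory_upgrade": True,
  "safebrowsing.enabled": False
})
driver = webdriver.Chrome(executable_path=r'/home/chromedriver/chromedriver', options=options)
driver.get('https://www.macrotrends.net/1476/copper-prices-historical-chart-data')
time.sleep(5)
# the chart and its buttons live inside an iframe
iframe = driver.find_element_by_xpath("//iframe[@id='chart_iframe']")
driver.switch_to.frame(iframe)
xpath = "//a[text()='All Years']"
driver.find_element_by_xpath(xpath).click()
xpath = "//button[@id='dataDownload']"
driver.find_element_by_xpath(xpath).click()
# give the download time to finish before closing
time.sleep(10)
driver.close()
print("--- %s seconds ---" % (time.time() - start_time))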

In headless mode, downloads are disabled by default. You can allow them by executing a DevTools command, like this:

from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options

options = Options()
options.headless = True 
driver = Chrome(options=options)
params = {'behavior': 'allow', 'downloadPath': '/path/for/download'}
driver.execute_cdp_cmd('Page.setDownloadBehavior', params)
# downloads are now enabled for this driver instance
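
The same command can be wrapped in a small helper that creates the download directory and resolves it to an absolute path before handing it to the driver. This is only a sketch: the name `enable_headless_downloads` is hypothetical, and it assumes a driver object that exposes `execute_cdp_cmd`, as the Chrome driver in the snippet above does.

```python
import os


def enable_headless_downloads(driver, download_dir):
    """Allow downloads for a headless Chrome driver (hypothetical helper).

    Assumes `driver` exposes execute_cdp_cmd, like selenium's Chrome driver.
    """
    # Resolve to an absolute path and make sure the directory exists.
    path = os.path.abspath(os.path.expanduser(download_dir))
    os.makedirs(path, exist_ok=True)
    params = {'behavior': 'allow', 'downloadPath': path}
    # Same DevTools command as in the answer above.
    driver.execute_cdp_cmd('Page.setDownloadBehavior', params)
    return params
```

It would be called once, right after the driver is created and before the download button is clicked, for example `enable_headless_downloads(driver, '/path/for/download')`.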

You can use the module pyvirtualdisplay to create a virtual display. Chrome or Firefox (running without headless) will use it automatically, and the window stays hidden.

Chrome:

from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
import time

from pyvirtualdisplay import Display

display = Display(visible=0, size=(1920,1080))
display.start()

start_time = time.time()

options = Options()

###options.add_argument("--headless")
options.add_argument("--disable-gpu")
options.add_argument("--disable-extensions")

options.add_experimental_option("prefs", {
  "download.default_directory": "/home/Documents/testing/macrotrends", # without `r` and `' '`, only `" "`
  "download.prompt_for_download": False,
  "download.directory_upgrade": True,
  "safebrowsing.enabled": False
})

driver = webdriver.Chrome(executable_path=r'/home/chromedriver/chromedriver',options=options)
#driver = webdriver.Chrome(options=options) # I have chromedriver's folder in PATH so I don't have to use `executable_path`

driver.get('https://www.macrotrends.net/1476/copper-prices-historical-chart-data')
print('[INFO] loaded', time.time() - start_time)
time.sleep(5)

iframe = driver.find_element_by_xpath("//iframe[@id='chart_iframe']")
driver.switch_to.frame(iframe)
print('[INFO] switched', time.time() - start_time)

xpath = "//a[text()='All Years']"
driver.find_element_by_xpath(xpath).click()
xpath = "//button[@id='dataDownload']"
driver.find_element_by_xpath(xpath).click()
print('[INFO] clicked', time.time() - start_time)
time.sleep(10)

print('[INFO] closing', time.time() - start_time)
driver.close()
display.stop()
print('[INFO] end', time.time() - start_time)
Firefox:

from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.firefox.options import Options
import time

from pyvirtualdisplay import Display

display = Display(visible=0, size=(1920,1080))
display.start()

start_time = time.time()

options = Options()

###options.add_argument("--headless")
options.add_argument("--disable-gpu")
options.add_argument("--disable-extensions")

options.set_preference("browser.download.folderList", 2)
options.set_preference("browser.download.dir", "/home/Documents/testing/macrotrends") # without `r` and `' '`, only `" "` 
options.set_preference("browser.download.useDownloadDir", True)
options.set_preference("browser.helperApps.neverAsk.saveToDisk", "text/csv")

driver = webdriver.Firefox(executable_path="...", options=options)
#driver = webdriver.Firefox(options=options) # I have geckodriver's folder in PATH so I don't have to use `executable_path`

driver.get('https://www.macrotrends.net/1476/copper-prices-historical-chart-data')
print('[INFO] loaded', time.time() - start_time)
time.sleep(5)

iframe = driver.find_element_by_xpath("//iframe[@id='chart_iframe']")
driver.switch_to.frame(iframe)
print('[INFO] switched', time.time() - start_time)

xpath = "//a[text()='All Years']"
driver.find_element_by_xpath(xpath).click()
xpath = "//button[@id='dataDownload']"
driver.find_element_by_xpath(xpath).click()
print('[INFO] clicked', time.time() - start_time)
time.sleep(10)

print('[INFO] closing', time.time() - start_time)
driver.close()
display.stop()

print('[INFO] end', time.time() - start_time)
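
Both scripts wait a fixed `time.sleep(10)` after the click, which may be too short for a slow download or needlessly long for a fast one. One possible alternative is to poll the download directory until a finished file shows up. This is a sketch under assumptions: `wait_for_download` is a hypothetical helper, and it relies on Chrome and Firefox keeping in-progress downloads in temporary `.crdownload` and `.part` files.

```python
import os
import time


def wait_for_download(directory, timeout=30.0, poll=0.5, suffix=".csv"):
    """Return the path of a finished file ending in `suffix`, or None on timeout."""
    # Temporary file extensions used while a download is still in progress.
    temp_suffixes = (".crdownload", ".part")
    deadline = time.time() + timeout
    while True:
        names = sorted(os.listdir(directory))
        in_progress = any(name.endswith(temp_suffixes) for name in names)
        finished = [name for name in names if name.endswith(suffix)]
        if finished and not in_progress:
            return os.path.join(directory, finished[0])
        if time.time() >= deadline:
            return None
        time.sleep(poll)
```

In the scripts above this would stand in for `time.sleep(10)`, for example `wait_for_download("/home/Documents/testing/macrotrends")`. It assumes the directory holds no matching file before the click; otherwise the older file is returned immediately.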

After some checking, I found that the csv file isn't stored anywhere; it is generated and exported by JS. You could try Firefox headless.

This might help.

I just tried opening this page in Chrome (manually, without selenium), but the download doesn't work for me at all... There is an error in the console: "Uncaught TypeError: Cannot read property 'goal' of undefined at HTMLButtonElement.document.getElementById.onclick". I'm using Chrome 77 (beta on Linux).

I added a screenshot of the site in Chrome to my post; you can take a look. I'm on Linux, using Version 76.0.3809.132 (Official Build) (64-bit) with the latest chromedriver.

Searching Google for "headless chrome doesn't download" turns up this problem. It seems Chrome has (or had) an issue with downloading in headless mode. I tested Chrome and Firefox on Linux Mint, and neither of them downloads in headless mode.

Hey, thanks for this method too. I've never used pyvirtualdisplay before; I'll give it a try.