Python: Using Selenium and Firefox version 40, how do I download a file?

Tags: python, selenium, firefox, download, rselenium

The old way of downloading files through Selenium no longer seems to work.

My code is:

    fp = webdriver.FirefoxProfile()
    fp.set_preference("browser.download.dir", os.getcwd())
    fp.set_preference("browser.download.folderList", 2)
    fp.set_preference("browser.download.manager.showWhenStarting", False)
    fp.set_preference("browser.helperApps.neverAsk.saveToDisk",
                      "application/pdf")

    self.driver = webdriver.Firefox(firefox_profile=fp)
    self.longMessage = True
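
However, the file dialog still appears. I have toggled quite a few preferences on and off, but after some digging I found that there is no difference between the prefs.js file of the default Firefox profile generated by Selenium and the prefs.js file of a profile in which I manually checked "Do this automatically for files like this from now on" in the download dialog.

The mimeTypes.rdf file does change, though. Specifically, the following lines are added: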

    <RDF:Description RDF:about="urn:mimetype:handler:application/pdf"
                   NC:alwaysAsk="false"
                   NC:saveToDisk="true"
                   NC:handleInternal="false">
    <NC:externalApplication RDF:resource="urn:mimetype:externalApplication:application/pdf"/>

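However, I don't know how to set a custom mimeTypes.rdf file when creating a new Firefox profile. Does anyone know how?

To pre-empt anyone suggesting that I just cURL the download URL: the file is generated for the user, and I specifically need to verify that the .pdf file is downloaded to the drive.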
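
For the "verify that the .pdf is on the drive" part, a minimal sketch of a polling check follows. It is not from the original post: the function name `wait_for_download` and its parameters are hypothetical, and it assumes Firefox leaves a `*.part` file in the download directory while a download is still being written.

```python
import os
import time


def wait_for_download(directory, suffix=".pdf", timeout=30.0, poll=0.5):
    """Return the path of a finished file ending in `suffix`, or None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        names = sorted(os.listdir(directory))
        # Assumption: a leftover "*.part" file means the browser is still writing.
        busy = any(name.endswith(".part") for name in names)
        if not busy:
            for name in names:
                path = os.path.join(directory, name)
                # Skip empty placeholder files that have not received data yet.
                if name.endswith(suffix) and os.path.getsize(path) > 0:
                    return path
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)
```

In a test this could be called right after clicking the button, for example `assert wait_for_download(os.getcwd()) is not None`, with the directory matching whatever `browser.download.dir` was set to.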

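You can take a different approach to downloading a file from the internet: download it by its link.

My C# code example:

As you can see, in this code I take the src attribute from the image element and download it outside the browser to get an exact Bitmap image (after that, I can save it to the HDD).
In the same way you can download any file from links =).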
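
The same idea can be sketched in Python using only the standard library. This is an illustration, not the answerer's code: `download_from_link` and the fallback name `download.bin` are hypothetical, and the request does not carry over the browser's cookies or session, so it only suits links that are reachable without them.

```python
import os
import shutil
import urllib.parse
import urllib.request


def download_from_link(url, dest_dir):
    """Fetch `url` outside the browser and save it under `dest_dir`."""
    # Derive a file name from the URL path; fall back to a fixed (hypothetical) name.
    name = os.path.basename(urllib.parse.urlparse(url).path) or "download.bin"
    target = os.path.join(dest_dir, name)
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target


# With Selenium, the URL would come from the element itself, e.g.:
#   url = element.get_attribute("src")   # or "href" for a link
#   saved_path = download_from_link(url, os.getcwd())
```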

I am an R user, so I am posting my solution in R. If you can't translate it into Python, let me know and I will give you some hints.

    library(RSelenium)

    # MIME types that should be saved without showing a prompt
    known_formats <- c("application/vnd.ms-excel",
                       "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")

    firefox_profile.me <- makeFirefoxProfile(list(
      marionette = TRUE,
      # this is for certificate issues [may be ignored]
      webdriver_accept_untrusted_certs = TRUE,
      webdriver_assume_untrusted_issuer = TRUE,
      # download-related settings
      browser.download.folderList = 2L,
      browser.download.manager.showWhenStarting = FALSE,
      # give the path as C:\\DirOfYourChoice, not C:\\DirOfYourChoice\\ (a trailing \\ will not work)
      browser.download.dir = normalizePath("TestDL"),
      browser.helperApps.alwaysAsk.force = FALSE,
      browser.helperApps.neverAsk.openFile = paste0(known_formats, collapse = ","),
      browser.helperApps.neverAsk.saveToDisk = paste0(known_formats, collapse = ","),
      # this is for marionette and related security
      "browser.tabs.remote.force-enable" = TRUE,
      pdfjs.disabled = TRUE
    ))

    remDr <- remoteDriver(remoteServerAddr = "localhost",
                          port = 4444,
                          browserName = "firefox",
                          extraCapabilities = firefox_profile.me)

    remDr$open()

    # click the first search result for an .xlsx sample, then for an .xls sample
    remDr$navigate("https://www.google.com/search?q=sample+xlsx")
    remDr$findElement(using = "css selector", value = ".g:nth-child(1) a")$clickElement()

    remDr$navigate("https://www.google.com/search?q=test+xls")
    remDr$findElement(using = "css selector", value = ".g:nth-child(1) a")$clickElement()

I hope you are able to port it to Python. Just try to use the same preferences in Firefox that I used in makeFirefoxProfile.
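
A possible Python port of those preferences is sketched below. The helper names `firefox_download_prefs` and `apply_prefs` are hypothetical; the preference keys are the ones from the R answer. Note that `pdfjs.disabled` is not set in the question's code, and since Firefox's built-in PDF viewer can intercept `application/pdf`, that may be part of why the file is not saved silently.

```python
import os


def firefox_download_prefs(download_dir, mime_types):
    """Build the download-related preferences used in the R answer."""
    joined = ",".join(mime_types)
    return {
        "browser.download.folderList": 2,  # 2 = use browser.download.dir
        "browser.download.dir": os.path.abspath(download_dir),
        "browser.download.manager.showWhenStarting": False,
        "browser.helperApps.alwaysAsk.force": False,
        "browser.helperApps.neverAsk.openFile": joined,
        "browser.helperApps.neverAsk.saveToDisk": joined,
        "pdfjs.disabled": True,  # keep the built-in PDF viewer from opening the file
    }


def apply_prefs(profile, prefs):
    """Copy prefs onto any object exposing set_preference(name, value)."""
    for name, value in prefs.items():
        profile.set_preference(name, value)
    return profile


# With Selenium installed, following the question's own code:
#   from selenium import webdriver
#   fp = apply_prefs(webdriver.FirefoxProfile(),
#                    firefox_download_prefs(os.getcwd(), ["application/pdf"]))
#   driver = webdriver.Firefox(firefox_profile=fp)
```

Keeping the preferences in a plain dictionary makes it easy to print or diff them against a profile that works when configured by hand.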



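I was trying to head off this answer in my original post. The file is generated when the user clicks a button. I have no way of getting the download link ahead of time :(

@Staunch, you can click the button, get the link, and download it. What is the problem?

How do you get the link? It is a javascript method and an ajax request.

I am also struggling to find a solution for this. I found that I am using Firefox version 50.1.0, and RSelenium fails to download when the prompt appears. However, in some cases it worked. I will write a reply about this.
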
Versions reported:

    Firefox 50.1.0 [while I'm writing this post]
    Selenium [3.0.1]
    R [3.3.2 (2016-10-31)]