Running RSelenium in parallel using Docker

Tags: r, docker, rselenium, doparallel, selenium-remotedriver

I am currently trying to use the doParallel package to parallelize an RSelenium web scraper that runs on Docker. I found this post and have copied the answer provided by @hdharrison here:

library(RSelenium)
library(rvest)
library(magrittr)
library(foreach)
library(doParallel)

# using  docker run -d -p 4445:4444 selenium/standalone-chrome:3.5.3
# in windows
URLsPar <- c("https://stackoverflow.com/", "https://github.com/", 
             "http://www.bbc.com/", "http://www.google.com", 
             "https://www.r-project.org/", "https://cran.r-project.org",
             "https://twitter.com/", "https://www.facebook.com/")

appHTML <- c()

(cl <- (detectCores() - 1) %>%  makeCluster) %>% registerDoParallel
# open a remoteDriver for each node on the cluster
clusterEvalQ(cl, {
  library(RSelenium)
  remDr <- remoteDriver(remoteServerAddr = "192.168.99.100", port = 4445L, 
                        browserName = "chrome")
  remDr$open()
})
ws <- foreach(x = 1:length(URLsPar), 
              .packages = c("rvest", "magrittr", "RSelenium"))  %dopar%  {
                print(URLsPar[x])
                remDr$navigate(URLsPar[x])
                remDr$getTitle()[[1]]
              }
> ws
[[1]]
[1] "Stack Overflow - Where Developers Learn, Share, & Build Careers"

[[2]]
[1] "The world's leading software development platform · GitHub"

[[3]]
[1] "BBC - Homepage"

[[4]]
[1] "Google"

[[5]]
[1] "R: The R Project for Statistical Computing"

[[6]]
[1] "The Comprehensive R Archive Network"

[[7]]
[1] "Twitter. It's what's happening."

[[8]]
[1] "Facebook - Log In or Sign Up"     


# close browser on each node
clusterEvalQ(cl, {
  remDr$close()
})

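# cl was created explicitly with makeCluster(), so stop it explicitly;
# stopImplicitCluster() only stops clusters that registerDoParallel() created itself
stopCluster(cl)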
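I know I have to open a new browser for each core, but I think that is where the problem lies: as soon as I reduce the number of cores, fewer errors are produced.

Please let me know if I can provide more details! Thanks in advance.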
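
The observation that fewer cores produced fewer errors hints that the number of simultaneous browser sessions may be a factor. Below is a minimal sketch, assuming the Selenium container only accepts a limited number of concurrent sessions. `cap_workers` and `max_sessions` are hypothetical names, and the limit of 2 is purely illustrative, not a value taken from the post.

```r
library(parallel)

# Hypothetical helper (not from the original post): pick a worker count that
# leaves one core free and never exceeds an assumed browser-session limit.
cap_workers <- function(n_cores, max_sessions) {
  if (is.na(n_cores)) n_cores <- 2L  # detectCores() can return NA
  max(1L, min(as.integer(n_cores) - 1L, as.integer(max_sessions)))
}

# max_sessions = 2L is an illustrative assumption, not a measured limit.
n_workers <- cap_workers(detectCores(), max_sessions = 2L)
# cl <- makeCluster(n_workers)  # then register and open one remDr per node as before
```

If the errors disappear once the worker count matches what the server can actually serve, that would support the session-limit explanation; if not, the cause probably lies elsewhere.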

In the meantime, I figured out how to fix my error. I am leaving a note here in case anyone else runs into the same problem. I cannot explain the logic behind it, but my code runs as expected when I replace

remDr <- remoteDriver(remoteServerAddr = "192.168.99.100", port = 4445L, 
                      browserName = "chrome")

with

remDr <- remoteDriver(port = 4445L)

that is, when I use the Firefox browser instead of Chrome.
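
For readers wondering what the shorter call `remoteDriver(port = 4445L)` actually requests: `remoteDriver()` defaults to `remoteServerAddr = "localhost"` and `browserName = "firefox"`, so it changes both the browser and the host. The sketch below just spells those defaults out. `firefox_driver_args` and `open_firefox_session` are hypothetical helper names, and the Docker command in the comment is an assumption about the setup, since the post does not say which container was running.

```r
# Hypothetical helper (not from the original post): make explicit the arguments
# that remoteDriver(port = 4445L) otherwise fills in from its defaults.
firefox_driver_args <- function(port = 4445L, host = "localhost") {
  list(remoteServerAddr = host, port = as.integer(port), browserName = "firefox")
}

# Assumes a Firefox Selenium container is listening on the mapped port,
# e.g. started with: docker run -d -p 4445:4444 selenium/standalone-firefox
open_firefox_session <- function(args = firefox_driver_args()) {
  remDr <- do.call(RSelenium::remoteDriver, args)
  remDr$open(silent = TRUE)  # starts one browser session on the server
  remDr
}
```

Inside `clusterEvalQ(cl, { ... })`, each node would then call `remDr <- open_firefox_session()` so that every worker keeps its own session.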