How to properly use a cluster plan in the R future (furrr) package
I am currently using furrr to organize how my models are executed. I use a data.frame to pass parameters to a function in an orderly way, and then use furrr::future_map() to map the function across all parameters. The function works perfectly on my local machine (OSX) with both sequential and multicore futures.

Now I want to test the code on a cluster of AWS instances that I set up myself (as described in the linked article).

I created a function using the code from the linked article:
make_cluster_ec2 <- function(public_ip){
  ssh_private_key_file <- Sys.getenv('PEM_PATH')
  github_pac <- Sys.getenv('PAC')

  cl_multi <- future::makeClusterPSOCK(
    workers = public_ip,
    user = "ubuntu",
    rshopts = c(
      "-o", "StrictHostKeyChecking=no",
      "-o", "IdentitiesOnly=yes",
      "-i", ssh_private_key_file
    ),
    rscript_args = c(
      "-e", shQuote("local({p <- Sys.getenv('R_LIBS_USER'); dir.create(p, recursive = TRUE, showWarnings = FALSE); .libPaths(p)})"),
      "-e", shQuote("install.packages('devtools')"),
      "-e", shQuote(glue::glue("devtools::install_github('user/repo', auth_token = '{github_pac}')"))
    ),
    dryrun = FALSE)

  return(cl_multi)
}
But when I run the code with a cluster plan:
plan(list(tweak(cluster, workers = cls), multisession))

parameter_df %>%
  mutate(model_traj = furrr::future_pmap(list('lat' = latitude,
                                              'lon' = longitude,
                                              'height' = stack_height,
                                              'name_source' = facility_name,
                                              'id_source' = facility_id,
                                              'duration' = duration,
                                              'days' = seq_dates,
                                              'daily_hours' = daily_hours,
                                              'direction' = 'forward',
                                              'met_type' = 'reanalysis',
                                              'met_dir' = here::here('met'),
                                              'exec_dir' = here::here("Hysplit4/exec"),
                                              'cred' = list(creds)),
                                         dirtywind::hysplit_trajectory,
                                         .progress = TRUE)
  )
I get the following error:
Error in file(temp_file, "a") : cannot open the connection
In addition: Warning message:
In file(temp_file, "a") :
cannot open file '/var/folders/rc/rbmg32js2qlf4d7cd4ts6x6h0000gn/T//RtmpPvdbV3/filecf23390c093.txt': No such file or directory
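I don't know what is happening under the hood, and I can't traceback() either. I have tested the connection with the examples from the article, and everything seems to run fine. I wonder why a temporary file in tempdir is being opened during execution. What am I missing?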
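(This has also come up in the furrr repo.)

Disable the progress bar, i.e. do not specify .progress = TRUE.

The error is caused by .progress = TRUE. Judging by the error message, the progress bar relies on a temporary file created by the main R session, and the workers must be able to open that file. That is typically only possible when you parallelize on the same machine.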
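To illustrate why a temp-file path from the main session is fragile, here is a minimal sketch. It assumes two local multisession workers as a stand-in for remote machines, and the variable names are illustrative only. Each R process gets its own tempdir(), and a file created by the main session is visible to a worker only because both share one filesystem here.

```r
library(future)

# Assumption: local background workers stand in for remote nodes.
plan(multisession, workers = 2)

# Temp directory and temp file owned by the main R session.
main_tmp <- tempdir()
tmp_file <- tempfile(fileext = ".txt")
file.create(tmp_file)

# Each worker is a separate R process with its own temp directory.
worker_tmp <- value(future(tempdir()))

# The worker can see the main session's file only because it runs on
# the same machine; a worker on another host would typically not.
visible_to_worker <- value(future(file.exists(tmp_file)))

print(main_tmp)
print(worker_tmp)
print(visible_to_worker)

# Shut down the background workers.
plan(sequential)
```

With a worker on another host, the same file.exists() check would typically fail, which matches the "No such file or directory" warning in the question.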
A smaller example of this error is:
library(future)
## Set up a cluster with one worker running on another machine
cl <- makeClusterPSOCK(workers = "node2")
plan(cluster, workers = cl)
y <- furrr::future_map(1:2, identity, .progress = FALSE)
str(y)
## List of 2
## $ : int 1
## $ : int 2
y <- furrr::future_map(1:2, identity, .progress = TRUE)
## Error in file(temp_file, "a") : cannot open the connection
## In addition: Warning message:
## In file(temp_file, "a") :
## cannot open file '/tmp/henrik/Rtmp1HkyJ8/file4c4b864a028ac.txt': No such file or directory
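If progress reporting is still wanted, one possible alternative is the progressr package, which the furrr documentation recommends over the .progress argument. The following is a minimal sketch rather than a tested fix for the setup above. It assumes progressr is installed and uses local multisession workers as a stand-in; with a real cluster you would keep plan(cluster, workers = cl) instead.

```r
library(future)
library(furrr)
library(progressr)

# Assumption: local workers stand in for the remote cluster.
plan(multisession, workers = 2)

with_progress({
  # One progress step per element being mapped over.
  p <- progressor(steps = 2)
  y <- future_map(1:2, function(x) {
    p()  # signal progress back to the main session
    x
  })
})

str(y)

# Shut down the background workers.
plan(sequential)
```

Here progress updates are signalled as conditions relayed through the future framework, so the workers do not need access to a file on the main machine.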