Cluster memory usage when calling parLapply multiple times


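If parLapply is called multiple times, is it OK to call makeCluster and stopCluster just once, or should they be called before and after each parLapply call? How does this affect memory usage?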

Here is a toy example:

library(parallel)

my_g1 <- function(list_element) {
    return(sum(list_element))
}

my_g2 <- function(list_element, my_parameter) {
    return(max(list_element) + my_parameter)
}

my_fn <- function(large_list, max_iterations=10, my_parameter=123) {
    stopifnot(max_iterations >= 1)
    iteration <- 1
    while(TRUE) {
        message("iteration ", iteration)
        ## Use the large_list argument rather than the global my_large_list
        list_of_sums <- lapply(large_list, my_g1)
        list_of_max_plus_parameter <- lapply(large_list, my_g2, my_parameter=my_parameter)
        stopifnot(list_of_max_plus_parameter[[1]] == max(large_list[[1]]) + my_parameter)
        ## Pretend there's work to do with list_of*: check for convergence; if converged, break
        iteration <- iteration + 1
        if(iteration > max_iterations) break  # Stop after max_iterations passes
    }
    return(1)  # Pretend this has something to do with the work done in the loop
}

my_large_list <- list(seq(1, 10),
                      seq(99, 157),
                      seq(27, 54),
                      seq(1001, 1041))  # Pretend this takes up lots of memory, want to avoid copying

unused <- my_fn(my_large_list)
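With stopCluster outside the loop (as in the parallelized version below), is my_large_list copied multiple times, with the memory not freed until stopCluster is called? In other words, will the memory used for my_large_list be on the order of 2*max_iterations? Or is it constant with respect to max_iterations?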

my_fn_parallelized <- function(large_list, max_iterations=10, my_parameter=123) {
    stopifnot(max_iterations >= 1)
    cluster <- makeCluster(2)  # Two cores
    iteration <- 1
    while(TRUE) {
        message("iteration ", iteration)
        ## Use the large_list argument rather than the global my_large_list
        list_of_sums <- parLapply(cluster, large_list, my_g1)
        list_of_max_plus_parameter <- parLapply(cluster, large_list, my_g2,
                                                my_parameter=my_parameter)
        stopifnot(list_of_max_plus_parameter[[1]] == max(large_list[[1]]) + my_parameter)
        ## Pretend there's work to do with list_of*: check for convergence; if converged, break
        iteration <- iteration + 1
        if(iteration > max_iterations) break  # Stop after max_iterations passes
    }
    stopCluster(cluster)  # With stopCluster here, is my_large_list copied 2*max_iterations times?
    return(1)  # Pretend this has something to do with the work done in the loop
}

unused <- my_fn_parallelized(my_large_list)
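
One way to check this empirically rather than guess is to ask each worker how much memory its R session reports after every parLapply call. The following is only a sketch under some assumptions: worker_mem_mb and measure_worker_memory are hypothetical helper names (not part of the parallel package), big_list is stand-in data, and the figures come from gc() on each worker, which reports what R has allocated and not necessarily what the operating system shows for the process.

```r
library(parallel)

# Hypothetical helper: memory in Mb that each worker's R session reports
# as in use, measured right after a garbage collection on that worker.
worker_mem_mb <- function(cluster) {
    unlist(clusterEvalQ(cluster, sum(gc()[, 2])))
}

# Hypothetical helper: one cluster, many parLapply calls, with one row of
# per-worker memory readings recorded after each call.
measure_worker_memory <- function(large_list, iterations = 5, n_workers = 2) {
    cluster <- makeCluster(n_workers)
    on.exit(stopCluster(cluster))  # Shut the workers down even on error
    usage <- matrix(NA_real_, nrow = iterations + 1, ncol = n_workers)
    usage[1, ] <- worker_mem_mb(cluster)  # Baseline before any data is sent
    for (i in seq_len(iterations)) {
        sums <- parLapply(cluster, large_list, sum)
        usage[i + 1, ] <- worker_mem_mb(cluster)
    }
    usage
}

# Stand-in data: four numeric vectors of one million doubles each (about 8 Mb per vector)
big_list <- replicate(4, runif(1e6), simplify = FALSE)
usage <- measure_worker_memory(big_list)
print(round(usage, 1))  # Rows: baseline, then one per iteration; columns: workers
```

If the rows stay roughly flat after the first iteration, the workers are not accumulating copies between calls; if they grow by about the size of the data on every row, they are. Swapping in my_large_list and the real worker functions should give the answer for the actual workload.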