Freeing memory in R

r memory-management

I created a variable, and once I am done with it I no longer need it, so I want to delete it and free the memory. However, rm() does not seem to help:

memory.size()
30.69
tmp=matrix(rnorm(6e5*20),6e5,20)
memory.size()
207.64
rm(tmp)
memory.size()
207.64

Does this mean that tmp was deleted but the memory was not freed?
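
memory.size() is only available on Windows. As a rough, portable sketch of the same measurement, the "used (Mb)" column of the matrix returned by gc() can be summed instead. The helper name used_mb is hypothetical (it is not a base R function), and taking a reading this way also triggers a garbage collection, so the last reading is taken after a collection has run.

```r
# Hypothetical helper: memory currently used by R objects, in Mb,
# taken from the second column ("used (Mb)") of gc()'s summary matrix.
# Note that calling gc() also runs a garbage collection.
used_mb <- function() sum(gc()[, 2])

before <- used_mb()

# Same object as in the question: 1.2e7 doubles, roughly 90 Mb
tmp <- matrix(rnorm(6e5 * 20), 6e5, 20)
print(object.size(tmp), units = "Mb")
during <- used_mb()

rm(tmp)             # removes the binding; the memory is reclaimed at the next collection
after <- used_mb()  # the gc() call inside used_mb() performs that collection

cat("before:", before, "during:", during, "after:", after, "\n")
```

On most systems the third reading should fall back close to the first, although the operating system may still report a larger footprint for the R process.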

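I use gc() to free RAM between operations. gc() and memory management during an R session are discussed in more detail elsewhere; below is an example of how I use it in a loop: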

# load library
library(topicmodels)

# get data
data("AssociatedPress")

# set number of topics to start with
k <- 20

# set model options
control_LDA_VEM <-
list(estimate.alpha = TRUE, alpha = 50/k, estimate.beta = TRUE,
verbose = 0, prefix = tempfile(), save = 0, keep = 0,
seed = as.integer(100), nstart = 1, best = TRUE,
var = list(iter.max = 10, tol = 10^-6),
em = list(iter.max = 10, tol = 10^-4),
initialize = "random")


# create the sequence that stores the number of topics to 
# iterate over
sequ <- seq(20, 300, by = 20)

# basic loop to iterate over different topic numbers with gc
# after each run to empty out RAM
# pre-allocate one list slot per entry of sequ and fill it by position,
# so the list is not grown inside the loop
lda <- vector(mode = "list", length = length(sequ))
for (i in seq_along(sequ)) {
  lda[[i]] <- LDA(AssociatedPress[1:20, ], sequ[i], method = "VEM", control = control_LDA_VEM)
  gc() # here's where I put the garbage collection to free up memory before the next round of the loop
}

# convert list output to dataframe (suggestions for a simpler method are welcome!)
best.model.logLik <- data.frame(logLik = as.matrix(lapply(lda, logLik)), ntopic = sequ)

# plot
with(best.model.logLik, plot(ntopic, logLik, type = 'l', xlab="Number of topics", ylab="Log likelihood"))

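# print ordered dataframe to see which number of topics has the highest log likelihood
(best.model.logLik.sort <- best.model.logLik[order(-as.numeric(best.model.logLik$logLik)), ])
    logLik       ntopic
2  -17904.12     40
3  -18105.48     60
1  -18181.84     20
4   -18569.7     80
5  -19736.94    100
6   -21919.6    120
7  -23785.08    140
8  -24914.23    160
9  -25493.76    180
10 -25837.64    200
11 -25964.23    220
12 -26061.01    240
13 -26117.92    260
14 -26149.44    280
15 -26168.91    300

What happens after gc()?

Great! gc() is exactly what I needed! Sorry, I have one more question: as my program runs, the memory it uses keeps growing, so do I need to add some gc() calls to the program? I mean inserting gc() between pieces of code, like #code# gc() #code# gc() #code#. Would that help?

No, that is not necessary; gc is called by a background process at certain intervals. The best way to avoid memory problems is to split your code into many smaller functions that return only the elements you need. Everything else inside a function should then be disposed of automatically the next time the R process garbage collects.

@Hansi That is not always true, as discussed elsewhere.

This is a classic case of the second circle of the R Inferno: never grow a vector (or an object such as lda) in a loop.

@mnel Thanks for the instructive and detailed comment! I have edited my answer accordingly. I should add that starting with 20 topics for 20 documents is a bit silly. I should have started with fewer topics than documents, for example by adjusting sequ.

On my computer gc() frees some memory, but not all of it. If I load a large object, do something with it, remove it and call gc(), I do not get back the same amount of free memory I had at the start. The more I do, the more memory I cannot recover. Eventually, after many operations on large objects, I may run out of memory. I am using Windows 10 x64 with 16 GB of RAM.

This question was useful to me. I think @skan's last comment bears repeating. skan, if you have found an answer, please post it here.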
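
The advice in the comments above (do the heavy work inside small functions that return only what is needed, and pre-allocate rather than grow objects in a loop) can be sketched as follows. This is only an illustration under stated assumptions: column_means, n_runs and results are hypothetical names, and the matrix of random numbers stands in for whatever large intermediate object a real analysis would build.

```r
# Hypothetical helper: builds a large temporary matrix inside the function
# and returns only a small summary. Once the function returns, the matrix
# is unreachable and can be reclaimed by the garbage collector.
column_means <- function(n_rows, n_cols) {
  big <- matrix(rnorm(n_rows * n_cols), n_rows, n_cols)
  colMeans(big)
}

n_runs <- 5

# Pre-allocate the result list once instead of growing it inside the loop
results <- vector(mode = "list", length = n_runs)

for (i in seq_len(n_runs)) {
  results[[i]] <- column_means(1e5, 20)
}

# Optional explicit collection; R also collects on its own when it needs memory
invisible(gc())

# Only the small summaries are kept, not the large temporaries
print(object.size(results), units = "Kb")
```

Because each large matrix lives only inside column_means, nothing outside the function keeps a reference to it, so no rm() call is needed in the loop.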