How to save a large Python object with pickle inside joblib parallel processes


I am trying to save a large object (~7 GB) to a pickle binary file inside each joblib parallel process, but joblib raises a MemoryError.

I have plenty of RAM (256 GB) and storage (4 TB), and I allocated 12 cores to joblib. I monitored memory during the run and it looked fine (more than half of total memory stayed free).

The structure of the code is simple, like:

import pickle
from joblib import Parallel, delayed

def do_something(arg1, arg2):
    ...  # build save_something, a large (~7 GB) object
    # dump with the highest available pickle protocol
    with open('somefile.p', 'wb') as f:
        pickle.dump(save_something, f, protocol=-1)
    return 1

JobList = ['a1', 'b1', 'c1', 'd1',
           'a2', 'b2', 'c2', 'd2',
           'a3', 'b3', 'c3', 'd3']
arg2 = 'sth'
Parallel(n_jobs=12)(delayed(do_something)(i, arg2) for i in JobList)
I expected it to finish my jobs normally, but I do not know how to allocate (or allow) more memory for joblib to use.
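As an aside, joblib ships its own dump/load pair, which is designed for objects containing large numpy arrays and can stand in for pickle.dump inside the worker. A minimal sketch of that variant; the per-job file name is my addition (so concurrent workers do not all overwrite a shared 'somefile.p'), and whether this avoids the MemoryError here is untested:

from joblib import Parallel, delayed, dump

def do_something(arg1, arg2):
    ...  # build save_something (~7 GB) as before
    # joblib.dump writes large numpy buffers to the file directly,
    # rather than serializing everything through one pickle stream
    dump(save_something, 'somefile_{}.joblib'.format(arg1))
    return 1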

++) Environment
OS: Ubuntu 18.04.2 (64-bit)
Python: Python 3.6.8 (GCC 7.3.0)


Comments:

This kind of task is very sensitive to: a) the OS, and b) the Python version and whether it is 32- or 64-bit. Please add that information.

@AlexYu I have added the information, but I am not 100% sure, since I have never used pickle to handle really large files before.

AFAIK pickle was never intended for files this large. Could you: a) post the full stack trace of the error, and b) try pickle.dump on such a file without joblib? Is that even possible?

Yes, pickle did dump objects larger than 4 GiB with the protocol=-1 option (and I tested pickle.dump without joblib; the program ran pickle.dump normally). Checking the full printout of the error means running the program again, which takes a day; I will do that and update. @AlexYu Luckily, I had not closed the terminal that raised the error, so I have updated the post.
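A minimal sketch of the standalone check the comments describe (dumping a >4 GiB object with protocol=-1, no joblib involved); the array size and file name here are placeholders:

import pickle
import numpy as np

# ~4.8 GiB of float64s: enough to cross the 4 GiB boundary that
# pickle protocols below 4 cannot handle in a single bytes object
big = np.zeros((650_000_000,))

with open('big_test.p', 'wb') as f:
    pickle.dump(big, f, protocol=-1)  # -1 selects the highest protocol

The full traceback added to the post: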
joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/yjw0510/anaconda3/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 418, in _process_worker
    r = call_item()
  File "/home/yjw0510/anaconda3/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 272, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/home/yjw0510/anaconda3/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 567, in __call__
    return self.func(*args, **kwargs)
  File "/home/yjw0510/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/home/yjw0510/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/home/yjw0510/anaconda3/lib/python3.6/site-packages/joblib/memory.py", line 568, in __call__
  File "/home/yjw0510/anaconda3/lib/python3.6/site-packages/joblib/memory.py", line 534, in _cached_call
    out, metadata = self.call(*args, **kwargs)
  File "/home/yjw0510/anaconda3/lib/python3.6/site-packages/joblib/memory.py", line 734, in call
    output = self.func(*args, **kwargs)
  File "02_trj_ConvertToPickle.py", line 65, in to_pickle
    configArray = np.zeros((nAtoms,9))
MemoryError
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "02_trj_ConvertToPickle.py", line 106, in <module>
    res = Parallel(n_jobs=numCPUcores,verbose=32)(delayed(to_pickle)(i, directoryBufferProcessing) for i in fileList)
  File "/home/yjw0510/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 934, in __call__
    self.retrieve()
  File "/home/yjw0510/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/yjw0510/anaconda3/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 521, in wrap_future_result
    return future.result(timeout=timeout)
  File "/home/yjw0510/anaconda3/lib/python3.6/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/home/yjw0510/anaconda3/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
MemoryError
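
Worth noting: in this trace the MemoryError is raised at the numpy allocation configArray = np.zeros((nAtoms, 9)) inside to_pickle, before pickle.dump is ever reached, which suggests the allocation itself, not pickle, is what fails. A hedged sketch of one possible workaround, backing the big array with a file through numpy's memmap support; the function name, dtype, and file name are placeholders, not the original code:

import numpy as np

def to_pickle_memmap(nAtoms, out_path='configArray.npy'):
    # allocate the (nAtoms, 9) array on disk instead of in RAM;
    # open_memmap creates a standard .npy file that np.load can reopen
    configArray = np.lib.format.open_memmap(
        out_path, mode='w+', dtype=np.float64, shape=(nAtoms, 9))
    ...  # fill configArray as the original to_pickle does
    configArray.flush()  # push dirty pages out to the file
    return out_path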