
Python multiprocessing.Pool.map() drops the attributes of a subclassed ndarray

Tags: python, numpy, subclass, python-multiprocessing

When using map() from multiprocessing.Pool() on a list of instances of a numpy.ndarray subclass, the new attributes of the custom class are dropped.

The following minimal example (based on a linked example) reproduces the problem:

from multiprocessing import Pool
import numpy as np


class MyArray(np.ndarray):

    def __new__(cls, input_array, info=None):
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        if obj is None: return
        self.info = getattr(obj, 'info', None)

def sum_worker(x):
    return sum(x), x.info

if __name__ == '__main__':
    arr_list = [MyArray(np.random.rand(3), info=f'foo_{i}') for i in range(10)]
    with Pool() as p:
        p.map(sum_worker, arr_list)
The attribute info is dropped:

AttributeError: 'MyArray' object has no attribute 'info'
With the built-in map(), by contrast, it works:

arr_list = [MyArray(np.random.rand(3), info=f'foo_{i}') for i in range(10)]
list(map(sum_worker, arr_list))
The purpose of the method __array_finalize__() is that the object keeps the attribute after slicing:

arr = MyArray([1,2,3], info='foo')
subarr = arr[:2]
print(subarr.info)

But with Pool.map() this method somehow does not take effect…

Because multiprocessing uses pickle to serialize data to and from the separate processes, which is essentially a copy, and the default ndarray pickling does not carry the custom attribute along.
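This can be demonstrated without any worker processes at all: a plain pickle round-trip of the question's MyArray class (reproduced here so the sketch is self-contained) already loses the attribute, because unpickling reconstructs the array through a path where __array_finalize__ receives None:

```python
import pickle
import numpy as np

class MyArray(np.ndarray):
    def __new__(cls, input_array, info=None):
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        # During unpickling this is called with obj=None,
        # so info is never set on the reconstructed array.
        if obj is None:
            return
        self.info = getattr(obj, 'info', None)

arr = MyArray([1, 2, 3], info='foo')
restored = pickle.loads(pickle.dumps(arr))
print(type(restored).__name__)    # MyArray: the class survives
print(hasattr(restored, 'info'))  # False: the attribute does not
```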

Adapting the accepted solution from that question, your example becomes:

from multiprocessing import Pool
import numpy as np

class MyArray(np.ndarray):

    def __new__(cls, input_array, info=None):
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        if obj is None: return
        self.info = getattr(obj, 'info', None)

    def __reduce__(self):
        pickled_state = super(MyArray, self).__reduce__()
        new_state = pickled_state[2] + (self.info,)
        return (pickled_state[0], pickled_state[1], new_state)

    def __setstate__(self, state):
        self.info = state[-1]
        super(MyArray, self).__setstate__(state[0:-1])

def sum_worker(x):
    return sum(x), x.info

if __name__ == '__main__':
    arr_list = [MyArray(np.random.rand(3), info=f'foo_{i}') for i in range(10)]
    with Pool() as p:
        p.map(sum_worker, arr_list)
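Since Pool.map relies on pickle under the hood, a plain pickle round-trip is enough to verify the adapted class. A minimal check, repeating the __reduce__/__setstate__ definition so the sketch is self-contained:

```python
import pickle
import numpy as np

class MyArray(np.ndarray):
    def __new__(cls, input_array, info=None):
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        self.info = getattr(obj, 'info', None)

    def __reduce__(self):
        # Append the extra attribute to numpy's pickled state tuple.
        pickled_state = super().__reduce__()
        new_state = pickled_state[2] + (self.info,)
        return (pickled_state[0], pickled_state[1], new_state)

    def __setstate__(self, state):
        # Pop the attribute back off before handing the rest to numpy.
        self.info = state[-1]
        super().__setstate__(state[0:-1])

arr = MyArray([1, 2, 3], info='foo')
restored = pickle.loads(pickle.dumps(arr))
print(restored.info)  # foo
```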
Note that the second answer suggests you could use pathos.multiprocessing with your original, unadapted code, since pathos uses dill instead of pickle. However, when I tested it, it did not work.