Python "Can't pickle <type '_csv.reader'>" error when using multiprocessing on Windows

I am writing a multiprocessing program to process a large .CSV file in parallel, on Windows.

I found a similar question. When running under Windows, I get an error that csv.reader is not picklable.

I suppose I could open the CSV file in the reader subprocess and just send it the file name from the parent process. However, I would like to pass an already-opened CSV file (as the code is meant to do), which has a particular state, i.e. really use a shared object.

Any idea how to do that under Windows, or what is missing there?

Here is the code (reposted for readability):

When running under Windows, I get the following error:

Traceback (most recent call last):
  File "C:\Users\ron.berman\Documents\Attribution\ubrShapley\test.py", line 130, in <module>
    main(sys.argv[1:])
  File "C:\Users\ron.berman\Documents\Attribution\ubrShapley\test.py", line 127, in main
    c = CSVWorker(opts.numprocs, args[0], args[1])
  File "C:\Users\ron.berman\Documents\Attribution\ubrShapley\test.py", line 44, in __init__
    self.pin.start()
  File "C:\Python27\lib\multiprocessing\process.py", line 130, in start
    self._popen = Popen(self)
  File "C:\Python27\lib\multiprocessing\forking.py", line 271, in __init__
    dump(process_obj, to_child, HIGHEST_PROTOCOL)
  File "C:\Python27\lib\multiprocessing\forking.py", line 193, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "C:\Python27\lib\pickle.py", line 224, in dump
    self.save(obj)
  File "C:\Python27\lib\pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python27\lib\pickle.py", line 419, in save_reduce
    save(state)
  File "C:\Python27\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python27\lib\pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "C:\Python27\lib\pickle.py", line 681, in _batch_setitems
    save(v)
  File "C:\Python27\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python27\lib\multiprocessing\forking.py", line 66, in dispatcher
    self.save_reduce(obj=obj, *rv)
  File "C:\Python27\lib\pickle.py", line 401, in save_reduce
    save(args)
  File "C:\Python27\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python27\lib\pickle.py", line 548, in save_tuple
    save(element)
  File "C:\Python27\lib\pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python27\lib\pickle.py", line 419, in save_reduce
    save(state)
  File "C:\Python27\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python27\lib\pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "C:\Python27\lib\pickle.py", line 681, in _batch_setitems
    save(v)
  File "C:\Python27\lib\pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Python27\lib\pickle.py", line 396, in save_reduce
    save(cls)
  File "C:\Python27\lib\pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Python27\lib\pickle.py", line 753, in save_global
    (obj, module, name))
pickle.PicklingError: Can't pickle <type '_csv.reader'>: it's not the same object as _csv.reader
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Python27\lib\multiprocessing\forking.py", line 374, in main
    self = load(from_parent)
  File "C:\Python27\lib\pickle.py", line 1378, in load
    return Unpickler(file).load()
  File "C:\Python27\lib\pickle.py", line 858, in load
    dispatch[key](self)
  File "C:\Python27\lib\pickle.py", line 880, in load_eof
    raise EOFError
EOFError

Since multiprocessing relies on serializing and deserializing objects when passing them as parameters between processes, and your code relies on passing an instance of CSVWorker around (the instance denoted as "self"), you get this error - because neither the csv reader nor open files can be pickled.

You mention that your CSV is big, so I don't think reading all the data into a list is a solution for you - you have to think of a way to pass the lines from the input CSV to each worker process one at a time, retrieve the processed lines back from each worker, and perform all the I/O in the main process.

It looks like multiprocessing.Pool would be a better way to write your application - check the multiprocessing documentation and try using a process pool with pool.map to process your CSV. It also takes care of preserving order, which will eliminate a lot of the complicated logic in your code.

The problem you are running into is caused by using methods of the CSVWorker class as the process targets; that class has members that can't be pickled; those open files are never going to work.

What you want to do is split that class into two classes: one that coordinates all of the worker subprocesses, and one that actually does the computational work. The worker processes take filenames as arguments and open the individual files as needed, or at least wait until their worker methods are invoked before opening the files. They can also take a multiprocessing.Queue as an argument or as an instance member; that is safe to pass.

To some extent, you are already doing this; your write_output_csv method is opening its file in the subprocess, but your parse_input_csv method is expecting to find an already opened and prepared file as an attribute of self. Do it consistently the other way and you should be in good shape.

I have been trying to use multiprocessing.Pool, and given a big .csv file (140 million lines or so), it seems to block while building a