TensorFlow on Spark: Can't pickle local object 'start.<locals>.<lambda>'
I have a standalone Spark cluster and am trying to run TensorFlowOnSpark on it with Python. So far I have only tried very simple examples, but I always run into the same problem: every worker fails with the same error message:
AttributeError: Can't pickle local object 'start.<locals>.<lambda>'
This seems to be a known issue, but no solution has been posted for it. Any hints would be appreciated, as I am out of ideas. The full executor log:
17/08/15 16:40:36 ERROR Executor: Exception in task 0.2 in stage 0.0 (TID 5)
org.apache.spark.api.python.PythonException: Traceback (most recent call last):
File "D:\Spark\python\lib\pyspark.zip\pyspark\worker.py", line 177, in main
File "D:\Spark\python\lib\pyspark.zip\pyspark\worker.py", line 172, in process
File "C:\Program Files\Anaconda3\lib\site-packages\pyspark\rdd.py", line 2423, in pipeline_func
return func(split, prev_func(split, iterator))
File "C:\Program Files\Anaconda3\lib\site-packages\pyspark\rdd.py", line 2423, in pipeline_func
return func(split, prev_func(split, iterator))
File "C:\Program Files\Anaconda3\lib\site-packages\pyspark\rdd.py", line 2423, in pipeline_func
return func(split, prev_func(split, iterator))
File "C:\Program Files\Anaconda3\lib\site-packages\pyspark\rdd.py", line 346, in func
return f(iterator)
File "C:\Program Files\Anaconda3\lib\site-packages\pyspark\rdd.py", line 794, in func
r = f(it)
File "C:\Program Files\Anaconda3\lib\site-packages\tensorflowonspark\TFSparkNode.py", line 290, in _mapfn
TFSparkNode.mgr = TFManager.start(authkey, ['control'], 'remote')
File "C:\Program Files\Anaconda3\lib\site-packages\tensorflowonspark\TFManager.py", line 41, in start
mgr.start()
File "C:\Program Files\Anaconda3\lib\multiprocessing\managers.py", line 513, in start
self._process.start()
File "C:\Program Files\Anaconda3\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Program Files\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Program Files\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:\Program Files\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'start.<locals>.<lambda>'
at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRDD.scala:193)
at org.apache.spark.api.python.PythonRunner$$anon$1.<init>(PythonRDD.scala:234)
at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:152)
at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Program Files\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Program Files\Anaconda3\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)