
Error from python worker: /usr/bin/python: No module named pyspark

Tags: python, hadoop, apache-spark, pyspark, biginsights

I am trying to run Pyspark on YARN, but I get the following error whenever I type any command on the console.

I am able to run the Scala shell in Spark in both local and yarn modes. Pyspark runs fine in local mode but does not work in yarn mode.

OS: RHEL 6.x

Hadoop distribution: IBM BigInsights 4.0

Spark version: 1.2.1

WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, worker): org.apache.spark.SparkException:
Error from python worker:
  /usr/bin/python: No module named pyspark
PYTHONPATH was:
  /mnt/sdj1/hadoop/yarn/local/filecache/13/spark-assembly.jar (my comment: this path does not exist on the Linux filesystem, but on the logical data nodes)
java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:163)
    at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:86)
    at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
    at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:102)
    at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
    at org.apache.spark.scheduler.Task.run(Task.scala:56)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

I have set SPARK_HOME and PYTHONPATH with export commands, as shown below:

export SPARK_HOME=/path/to/spark
export PYTHONPATH=/path/to/spark/python/:/path/to/spark/lib/spark-assembly.jar
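Note that exports made in the driver's shell do not automatically reach the executors that YARN launches on other nodes, which is consistent with the error coming from the cluster-side worker. As a hedged sketch (not the fix the answer below arrived at, and the paths are the same placeholders as above), Spark's spark.executorEnv.* and spark.yarn.appMasterEnv.* properties can forward PYTHONPATH to the cluster side, e.g. in conf/spark-defaults.conf:

```
# conf/spark-defaults.conf -- placeholder paths, adjust to your installation
# Forward PYTHONPATH to every executor process
spark.executorEnv.PYTHONPATH        /path/to/spark/python:/path/to/spark/lib/spark-assembly.jar
# Forward PYTHONPATH to the YARN application master as well
spark.yarn.appMasterEnv.PYTHONPATH  /path/to/spark/python:/path/to/spark/lib/spark-assembly.jar
```

The same properties can be passed per job with spark-submit's --conf flag instead of editing the defaults file.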
Can someone help me resolve this issue?

Answer: After some digging, I found that pyspark does have issues out of the box in BigInsights 4.0. We were advised to upgrade to BI 4.1.