Apache Spark: pyspark kernel on Jupyter generates "spark not found" error
I have some pyspark-kernel Jupyter notebooks that worked for months but recently stopped working. The pyspark kernel itself is running: it shows the blue message

Kernel Loaded

and we can see that the kernel is available. But I noticed this in the Jupyter log:

[IPKernelApp] WARNING | Unknown error in handling PYTHONSTARTUP file /shared/spark/python/pyspark/shell.py:

When trying to do some work in Spark we get:
---> 18 df = spark.read.parquet(path)
19 if count: p(tname + ": count="+str(df.count()))
20 df.createOrReplaceTempView(tname)
NameError: name 'spark' is not defined
There is no further information.
Note: the Scala Spark kernel (using toree) is able to read the same parquet files successfully (using the same code, in fact). So what is going on with the Jupyter pyspark kernel?

Got it! I had upgraded Spark, and the pyspark kernel did not know about it.
First: which kernels are installed?
$jupyter kernelspec list
Available kernels:
python2 /Users/sboesch/Library/Python/2.7/lib/python/site-packages/ipykernel/resources
ir /Users/sboesch/Library/Jupyter/kernels/ir
julia-1.0 /Users/sboesch/Library/Jupyter/kernels/julia-1.0
scala /Users/sboesch/Library/Jupyter/kernels/scala
scijava /Users/sboesch/Library/Jupyter/kernels/scijava
pyspark /usr/local/share/jupyter/kernels/pyspark
spark_scala /usr/local/share/jupyter/kernels/spark_scala
Let's check the pyspark kernel:
sudo vim /usr/local/share/jupyter/kernels/pyspark/kernel.json
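For reference, a pyspark kernelspec typically looks something like the following. This is only a sketch: the SPARK_HOME and PYTHONSTARTUP paths are taken from the log above, while the argv and display_name are illustrative and will differ per install.

```json
{
  "display_name": "PySpark",
  "language": "python",
  "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
  "env": {
    "SPARK_HOME": "/shared/spark",
    "PYTHONPATH": "/shared/spark/python/:/shared/spark/python/lib/py4j-0.10.4-src.zip",
    "PYTHONSTARTUP": "/shared/spark/python/pyspark/shell.py"
  }
}
```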
Of particular interest is the Spark entry on the PYTHONPATH:
PYTHONPATH="/shared/spark/python/:/shared/spark/python/lib/py4j-0.10.4-src.zip"
Is it available?
$ll "/shared/spark/python/:/shared/spark/python/lib/py4j-0.10.4-src.zip"
ls: /shared/spark/python/:/shared/spark/python/lib/py4j-0.10.4-src.zip: No such file or directory
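As an aside, the `ll` above quotes the whole PYTHONPATH value as a single filename, but the colon actually separates two independent entries, and each can be checked on its own. A minimal stdlib sketch (the helper name is mine; the stale value is the one from the kernelspec above):

```python
import os

def missing_entries(pythonpath):
    """Return the PYTHONPATH entries that do not exist on disk."""
    return [p for p in pythonpath.split(os.pathsep)
            if p and not os.path.exists(p)]

# The stale value from the kernel.json above
stale = "/shared/spark/python/:/shared/spark/python/lib/py4j-0.10.4-src.zip"
for p in missing_entries(stale):
    print("MISSING:", p)
```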
No, it is not - so let's update that path:
$ll /shared/spark/python/lib/py4j*
-rw-r--r--@ 1 sboesch wheel 42437 Jun 1 13:49 /shared/spark/python/lib/py4j-0.10.7-src.zip
PYTHONPATH="/shared/spark/python/:/shared/spark/python/lib/py4j-0.10.7-src.zip"
After this I restarted Jupyter, and the pyspark kernel is working.
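Since the py4j version string changes with each Spark upgrade, the zip can also be located programmatically instead of being hard-coded in kernel.json. A small stdlib sketch (the function names are mine; /shared/spark is the SPARK_HOME from this setup):

```python
import glob
import os

def py4j_zip(spark_home):
    """Return a py4j source zip under SPARK_HOME/python/lib
    (the lexicographically last match if there are several), or None."""
    pattern = os.path.join(spark_home, "python", "lib", "py4j-*-src.zip")
    matches = sorted(glob.glob(pattern))
    return matches[-1] if matches else None

def kernel_pythonpath(spark_home):
    """Build the PYTHONPATH value a pyspark kernel.json needs."""
    parts = [os.path.join(spark_home, "python") + os.sep]
    zip_path = py4j_zip(spark_home)
    if zip_path:
        parts.append(zip_path)
    return os.pathsep.join(parts)

print(kernel_pythonpath("/shared/spark"))
```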