Apache Spark: pyspark kernel on Jupyter produces "spark not found" error


I have some pyspark kernel jupyter notebooks that have been working for months, but they recently stopped working. The pyspark kernel itself is loading: it shows the blue message:

    Kernel Loaded
.. and we can see that the kernel is available:

But I noticed this in the jupyter logs:

[IPKernelApp] WARNING | Unknown error in handling PYTHONSTARTUP file /shared/spark/python/pyspark/shell.py:

When trying to do some work in spark, we get:

---> 18     df = spark.read.parquet(path)
     19     if count: p(tname + ": count="+str(df.count()))
     20     df.createOrReplaceTempView(tname)

NameError: name 'spark' is not defined
There is no further information.
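
For context, the spark object is normally created by that PYTHONSTARTUP file (pyspark/shell.py), so when it fails nothing defines it. Building the session by hand in a notebook cell is a quick sanity check; below is a minimal sketch, assuming pyspark is importable and using a hypothetical parquet path:

    # Minimal sketch: recreate the session that pyspark/shell.py would
    # normally provide as `spark`. If pyspark itself cannot be imported,
    # the ImportError points straight at a broken PYTHONPATH.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")              # assumption: local mode for the check
        .appName("manual-sanity-check")
        .getOrCreate()
    )

    df = spark.read.parquet("/tmp/example.parquet")   # hypothetical path
    df.createOrReplaceTempView("example")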

Note: the toree scala spark kernel is able to read the same parquet files successfully (using essentially the same code).


So what is going on with the jupyter pyspark kernel?

Figured it out! I had upgraded spark, and the pyspark kernel did not know about it.
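
One quick way to confirm the mismatch is to ask the upgraded installation which version it actually is; a small sketch, assuming /shared/spark is the SPARK_HOME behind the paths shown below:

    # Sketch: print the version banner of the installed Spark.
    # Assumes /shared/spark is SPARK_HOME, as in the paths used in this post.
    import subprocess

    out = subprocess.run(
        ["/shared/spark/bin/spark-submit", "--version"],
        capture_output=True, text=True,
    )
    # spark-submit usually writes its version banner to stderr
    print(out.stderr or out.stdout)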

First: which kernels are installed?

$jupyter kernelspec list

Available kernels:
  python2        /Users/sboesch/Library/Python/2.7/lib/python/site-packages/ipykernel/resources
  ir             /Users/sboesch/Library/Jupyter/kernels/ir
  julia-1.0      /Users/sboesch/Library/Jupyter/kernels/julia-1.0
  scala          /Users/sboesch/Library/Jupyter/kernels/scala
  scijava        /Users/sboesch/Library/Jupyter/kernels/scijava
  pyspark        /usr/local/share/jupyter/kernels/pyspark
  spark_scala    /usr/local/share/jupyter/kernels/spark_scala
Let's examine the pyspark kernel:

sudo vim  /usr/local/share/jupyter/kernels/pyspark/kernel.json
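
Rather than opening the file in an editor, the env block can also be dumped programmatically; a small sketch using the kernelspec path reported above:

    # Sketch: print the environment variables configured in the pyspark kernelspec.
    import json

    with open("/usr/local/share/jupyter/kernels/pyspark/kernel.json") as f:
        spec = json.load(f)

    for key, value in spec.get("env", {}).items():
        print(f"{key}={value}")
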
Of particular interest is the PYTHONPATH entry pointing at the spark python/py4j libraries:

PYTHONPATH="/shared/spark/python/:/shared/spark/python/lib/py4j-0.10.4-src.zip"
Is that path available?

$ll "/shared/spark/python/:/shared/spark/python/lib/py4j-0.10.4-src.zip"
ls: /shared/spark/python/:/shared/spark/python/lib/py4j-0.10.4-src.zip: No such file or directory
No, it is not - so let's update that path:

 $ll /shared/spark/python/lib/py4j*
-rw-r--r--@ 1 sboesch  wheel  42437 Jun  1 13:49 /shared/spark/python/lib/py4j-0.10.7-src.zip


PYTHONPATH="/shared/spark/python/:/shared/spark/python/lib/py4j-0.10.7-src.zip"
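
The same existence check can be scripted, which makes it easy to confirm that every entry of the updated PYTHONPATH is really on disk; a short sketch:

    # Sketch: verify that each entry of the updated PYTHONPATH exists.
    import os

    pythonpath = (
        "/shared/spark/python/:"
        "/shared/spark/python/lib/py4j-0.10.7-src.zip"
    )
    for entry in pythonpath.split(":"):
        status = "ok" if os.path.exists(entry) else "MISSING"
        print(f"{status:7s} {entry}")
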
After this I restarted jupyter, and the pyspark kernel is working again.
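
As a final check, a fresh notebook on the pyspark kernel should get the injected session back and be able to run a trivial job; a short sketch that relies on the spark object the kernel now provides:

    # Sketch: confirm the kernel-provided session works again.
    print(spark.version)            # should report the upgraded Spark version
    print(spark.range(5).count())   # tiny job; prints 5 if it runs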