Apache spark 带有Spark Magic的Jypyter中出现Livy Pypark Python会话错误-错误repl.PythonInterpreter:进程已因1而终止

Apache spark 带有Spark Magic的Jypyter中出现Livy Pypark Python会话错误-错误repl.PythonInterpreter:进程已因1而终止,apache-spark,pyspark,jupyter,livy,Apache Spark,Pyspark,Jupyter,Livy,我正在运行spark v2.0.0纱线簇。我让利维在火花大师旁边跑 我已经设置了一个jupyter Python3笔记本,并且已经安装了,尽管当我创建会话时,我从笔记本上收到了一条错误消息 Added endpoint http://spark-master:8998 Starting Spark application ID YARN Application ID Kind State Spark UI Driver log Current session? 0 No

我正在运行spark v2.0.0纱线簇。我让利维在火花大师旁边跑

我已经设置了一个jupyter Python3笔记本,并且已经安装了,尽管当我创建会话时,我从笔记本上收到了一条错误消息

Added endpoint http://spark-master:8998
Starting Spark application

ID  YARN Application ID Kind    State   Spark UI    Driver log  Current session?
0   None    pyspark idle            ✔
---------------------------------------------------------------------------
LivyUnexpectedStatusException             Traceback (most recent call last)
/opt/conda/lib/python3.5/site-packages/hdijupyterutils/ipywidgetfactory.py in submit_clicked(self, button)
     63 
     64     def submit_clicked(self, button):
---> 65         self.parent_widget.run()

/opt/conda/lib/python3.5/site-packages/sparkmagic/controllerwidget/createsessionwidget.py in run(self)
     56 
     57         try:
---> 58             self.spark_controller.add_session(alias, endpoint, skip, properties)
     59         except ValueError as e:
     60             self.ipython_display.send_error("""Could not add session with

/opt/conda/lib/python3.5/site-packages/sparkmagic/livyclientlib/sparkcontroller.py in add_session(self, name, endpoint, skip_if_exists, properties)
     79         session = self._livy_session(http_client, properties, self.ipython_display)
     80         self.session_manager.add_session(name, session)
---> 81         session.start()
     82 
     83     def get_session_id_for_client(self, name):

/opt/conda/lib/python3.5/site-packages/sparkmagic/livyclientlib/livysession.py in start(self)
    148             else:
    149                 command = Command("sqlContext")
--> 150                 (success, out) = command.execute(self)
    151                 if success:
    152                     self.ipython_display.writeln(u"SparkContext available as 'sc'.")

/opt/conda/lib/python3.5/site-packages/sparkmagic/livyclientlib/command.py in execute(self, session)
     29         statement_id = -1
     30         try:
---> 31             session.wait_for_idle()
     32             data = {u"code": self.code}
     33             response = session.http_client.post_statement(session.id, data)

/opt/conda/lib/python3.5/site-packages/sparkmagic/livyclientlib/livysession.py in wait_for_idle(self, seconds_to_wait)
    238                     .format(self.id, self.status)
    239                 self.logger.error(error)
--> 240                 raise LivyUnexpectedStatusException(u'{} See logs:\n{}'.format(error, self.get_logs()))
    241 
    242             if seconds_to_wait <= 0.0:

LivyUnexpectedStatusException: Session 0 unexpectedly reached final status 'error'. See logs:
并在livy日志中获取此输出

我无法确定确切的问题/解决方法是什么。如果我将会话设置为使用Scala语言而不是Python,我就能够创建成功的连接。尽管只有将会话语言设置为python时才会出现错误。如果有人知道在Jupyter连接livy repl pyspark会话的解决方案,请让我知道

更新 Livy仍然无法创建PySpark会话

curl -v -X POST --data '{"kind": "pyspark"}' -H "Content-Type: application/json" example.com/sessions
会话状态将直接从“开始”变为“失败”。在livy会话失败之前,资源管理器上的纱线日志提供以下权限

To adjust logging level use sc.setLogLevel(newLevel).
17/02/26 05:02:25 WARN rsc.RSCConf: Your hostname, yarn-slave1, resolves to a loopback address, but we couldn't find any external IP address!
17/02/26 05:02:25 WARN rsc.RSCConf: Set livy.rsc.rpc.server.address if you need to bind to another address.
17/02/26 05:02:32 ERROR repl.PythonInterpreter: Process has died with 1
17/02/26 05:02:33 WARN yarn.YarnAllocator: Container marked as failed: container_1488085279373_0001_01_000002 on host: yarn-slave1. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_1488085279373_0001_01_000002
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
    at org.apache.hadoop.util.Shell.run(Shell.java:479)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
17/02/26 05:02:33 WARN yarn.ApplicationMaster: Reporter thread fails 1 time(s) in a row.
java.lang.IllegalStateException: RpcEnv already stopped.
    at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:159)
    at org.apache.spark.rpc.netty.Dispatcher.postOneWayMessage(Dispatcher.scala:131)
    at org.apache.spark.rpc.netty.NettyRpcEnv.send(NettyRpcEnv.scala:185)
    at org.apache.spark.rpc.netty.NettyRpcEndpointRef.send(NettyRpcEnv.scala:508)
    at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$processCompletedContainers$1$$anonfun$apply$7.apply(YarnAllocator.scala:531)
    at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$processCompletedContainers$1$$anonfun$apply$7.apply(YarnAllocator.scala:512)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$processCompletedContainers$1.apply(YarnAllocator.scala:512)
    at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$processCompletedContainers$1.apply(YarnAllocator.scala:442)
    at scala.collection.Iterator$class.foreach(Iterator.scala:742)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.spark.deploy.yarn.YarnAllocator.processCompletedContainers(YarnAllocator.scala:442)
    at org.apache.spark.deploy.yarn.YarnAllocator.allocateResources(YarnAllocator.scala:242)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$1.run(ApplicationMaster.scala:372)
17/02/26 05:02:40 WARN yarn.YarnAllocator: Container marked as failed: container_1488085279373_0001_01_000005 on host: yarn-slave1. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_1488085279373_0001_01_000005
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
    at org.apache.hadoop.util.Shell.run(Shell.java:479)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
17/02/26 05:02:40 WARN yarn.ApplicationMaster: Reporter thread fails 1 time(s) in a row.
java.lang.IllegalStateException: RpcEnv already stopped.
    at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:159)
    at org.apache.spark.rpc.netty.Dispatcher.postOneWayMessage(Dispatcher.scala:131)
    at org.apache.spark.rpc.netty.NettyRpcEnv.send(NettyRpcEnv.scala:185)
    at org.apache.spark.rpc.netty.NettyRpcEndpointRef.send(NettyRpcEnv.scala:508)
    at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$processCompletedContainers$1$$anonfun$apply$7.apply(YarnAllocator.scala:531)
    at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$processCompletedContainers$1$$anonfun$apply$7.apply(YarnAllocator.scala:512)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$processCompletedContainers$1.apply(YarnAllocator.scala:512)
    at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$processCompletedContainers$1.apply(YarnAllocator.scala:442)
    at scala.collection.Iterator$class.foreach(Iterator.scala:742)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.spark.deploy.yarn.YarnAllocator.processCompletedContainers(YarnAllocator.scala:442)
    at org.apache.spark.deploy.yarn.YarnAllocator.allocateResources(YarnAllocator.scala:242)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$1.run(ApplicationMaster.scala:372)
17/02/26 05:02:47 WARN yarn.YarnAllocator: Container marked as failed: container_1488085279373_0001_01_000006 on host: yarn-slave1. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_1488085279373_0001_01_000006
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
    at org.apache.hadoop.util.Shell.run(Shell.java:479)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
17/02/26 05:02:47 WARN yarn.ApplicationMaster: Reporter thread fails 1 time(s) in a row.
java.lang.IllegalStateException: RpcEnv already stopped.
    at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:159)
    at org.apache.spark.rpc.netty.Dispatcher.postOneWayMessage(Dispatcher.scala:131)
    at org.apache.spark.rpc.netty.NettyRpcEnv.send(NettyRpcEnv.scala:185)
    at org.apache.spark.rpc.netty.NettyRpcEndpointRef.send(NettyRpcEnv.scala:508)
    at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$processCompletedContainers$1$$anonfun$apply$7.apply(YarnAllocator.scala:531)
    at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$processCompletedContainers$1$$anonfun$apply$7.apply(YarnAllocator.scala:512)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$processCompletedContainers$1.apply(YarnAllocator.scala:512)
    at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$processCompletedContainers$1.apply(YarnAllocator.scala:442)
    at scala.collection.Iterator$class.foreach(Iterator.scala:742)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.spark.deploy.yarn.YarnAllocator.processCompletedContainers(YarnAllocator.scala:442)
    at org.apache.spark.deploy.yarn.YarnAllocator.allocateResources(YarnAllocator.scala:242)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$1.run(ApplicationMaster.scala:372)
17/02/26 05:02:53 WARN yarn.YarnAllocator: Container marked as failed: container_1488085279373_0001_01_000007 on host: yarn-slave1. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_1488085279373_0001_01_000007
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
    at org.apache.hadoop.util.Shell.run(Shell.java:479)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
17/02/26 05:02:53 WARN yarn.ApplicationMaster: Reporter thread fails 1 time(s) in a row.
java.lang.IllegalStateException: RpcEnv already stopped.
    at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:159)
    at org.apache.spark.rpc.netty.Dispatcher.postOneWayMessage(Dispatcher.scala:131)
    at org.apache.spark.rpc.netty.NettyRpcEnv.send(NettyRpcEnv.scala:185)
    at org.apache.spark.rpc.netty.NettyRpcEndpointRef.send(NettyRpcEnv.scala:508)
    at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$processCompletedContainers$1$$anonfun$apply$7.apply(YarnAllocator.scala:531)
    at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$processCompletedContainers$1$$anonfun$apply$7.apply(YarnAllocator.scala:512)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$processCompletedContainers$1.apply(YarnAllocator.scala:512)
    at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$processCompletedContainers$1.apply(YarnAllocator.scala:442)
    at scala.collection.Iterator$class.foreach(Iterator.scala:742)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at org.apache.spark.deploy.yarn.YarnAllocator.processCompletedContainers(YarnAllocator.scala:442)
    at org.apache.spark.deploy.yarn.YarnAllocator.allocateResources(YarnAllocator.scala:242)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$1.run(ApplicationMaster.scala:372)
spark defaults.conf

spark.yarn.appMasterEnv.PYSPARK_PYTHON python2
livy.server.host = 0.0.0.0
livy.server.port = 8998
livy.spark.master = yarn
livy.spark.deployMode = cluster
核心站点.xml

<property>
  <name>hadoop.proxyuser.livy.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.livy.hosts</name>
  <value>*</value>
</property>

我能够复制这个问题


问题似乎在于spark 2.0.0和livy的pyspark版本不兼容。如果将spark更新为最新版本(当前为2.1.0),pyspark版本可以通信,spark会话创建时不会出现任何问题。

我能够重现此问题


问题似乎在于spark 2.0.0和livy的pyspark版本不兼容。如果您将spark更新到最新版本(当前为2.1.0),pyspark版本可以通信,spark会话的创建不会出现任何问题。

即使使用spark 2.1.1和livy,我也遇到过类似的问题。Livy会话状态从“开始”变为“错误”。原来我在用Java-7,而Livy和Spark需要Java-8。解决了我的问题。

即使使用spark 2.1.1和livy,我也遇到过类似的问题。Livy会话状态从“开始”变为“错误”。原来我在用Java-7,而Livy和Spark需要Java-8。解决了我的问题。

我也面临着类似的问题。原来凶手是livy版本的。当用ApacheLivy-0.6.0-Cubating版本替换ClouderaLivy时,问题得到了解决;我能够在livy上创建pyspark类会话。

我也面临着类似的问题。原来凶手是livy版本的。当用ApacheLivy-0.6.0-Cubating版本替换ClouderaLivy时,问题得到了解决;我还能够在livy上创建Pypark类会话。

似乎解决了这个问题。谢谢似乎解决了这个问题。谢谢
livy.server.host = 0.0.0.0
livy.server.port = 8998
livy.spark.master = yarn
livy.spark.deployMode = cluster