Pyspark py4j.Py4JException:获取新通信通道时出错

Pyspark py4j.Py4JException:获取新通信通道时出错,pyspark,py4j,Pyspark,Py4j,pyspark程序开始运行时没有问题。 如果让作业长时间运行,则会出现py4j错误 2020-09-11 06:55:00 ERROR JobScheduler:91 - Error running job streaming job 1599762320000 ms.2 py4j.Py4JException: Error while obtaining a new communication channel at py4j.CallbackClient.getConnectio

pyspark程序开始运行时没有问题。 如果让作业长时间运行,则会出现py4j错误

2020-09-11 06:55:00 ERROR JobScheduler:91 - Error running job streaming job 1599762320000 ms.2
py4j.Py4JException: Error while obtaining a new communication channel
        at py4j.CallbackClient.getConnectionLock(CallbackClient.java:257)
        at py4j.CallbackClient.sendCommand(CallbackClient.java:377)
        at py4j.CallbackClient.sendCommand(CallbackClient.java:356)
        at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:106)
        at com.sun.proxy.$Proxy17.call(Unknown Source)
        at org.apache.spark.streaming.api.python.TransformFunction.callPythonTransformFunction(PythonDStream.scala:92)
        at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:78)
        at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:179)
        at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:179)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
        at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
        at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
        at scala.util.Try$.apply(Try.scala:192)
        at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39)
        at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257)
        at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
        at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
        at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at java.net.Socket.connect(Socket.java:538)
        at java.net.Socket.<init>(Socket.java:434)
        at java.net.Socket.<init>(Socket.java:244)
        at javax.net.DefaultSocketFactory.createSocket(SocketFactory.java:277)
        at py4j.CallbackConnection.start(CallbackConnection.java:226)
        at py4j.CallbackClient.getConnection(CallbackClient.java:238)
        at py4j.CallbackClient.getConnectionLock(CallbackClient.java:250)
        ... 25 more

2020-09-11 06:55:00错误作业调度器:91-运行作业流作业1599762320000毫秒时出错
py4j.Py4JException:获取新通信通道时出错
位于py4j.CallbackClient.getConnectionLock(CallbackClient.java:257)
位于py4j.CallbackClient.sendCommand(CallbackClient.java:377)
位于py4j.CallbackClient.sendCommand(CallbackClient.java:356)
位于py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:106)
位于com.sun.proxy.$Proxy17.call(未知源)
位于org.apache.spark.streaming.api.python.TransformFunction.callPythonTransformFunction(PythonDStream.scala:92)
位于org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:78)
位于org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:179)
位于org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:179)
在org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51)
在org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
在org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
位于org.apache.spark.streaming.dstream.dstream.createRDDWithLocalProperties(dstream.scala:416)
在org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50)
在org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply上(ForEachDStream.scala:50)
在org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply上(ForEachDStream.scala:50)
在scala.util.Try$.apply(Try.scala:192)
位于org.apache.spark.streaming.scheduler.Job.run(Job.scala:39)
在org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257)
位于org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
位于org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
在scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)中
位于org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256)
位于java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
位于java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
运行(Thread.java:748)
原因:java.net.ConnectException:连接被拒绝(连接被拒绝)
位于java.net.PlainSocketImpl.socketConnect(本机方法)
位于java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
位于java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
位于java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
位于java.net.socksocketimpl.connect(socksocketimpl.java:392)
位于java.net.Socket.connect(Socket.java:589)
位于java.net.Socket.connect(Socket.java:538)
位于java.net.Socket。(Socket.java:434)
位于java.net.Socket(Socket.java:244)
位于javax.net.DefaultSocketFactory.createSocket(SocketFactory.java:277)
在py4j.CallbackConnection.start处(CallbackConnection.java:226)
位于py4j.CallbackClient.getConnection(CallbackClient.java:238)
位于py4j.CallbackClient.getConnectionLock(CallbackClient.java:250)
... 25多
我在stackoverflow中发现了三个类似的问题,但没有答案