Apache Spark: Spark 2.3 submit on Kubernetes error
I get the following errors when trying to run spark-submit on a k8s cluster.
Error 1: This looks like a warning. It does not interrupt the application running inside the executor pod, but the warning keeps recurring:
2018-03-09 11:15:21 WARN WatchConnectionManager:192 - Exec Failure
java.io.EOFException
at okio.RealBufferedSource.require(RealBufferedSource.java:60)
at okio.RealBufferedSource.readByte(RealBufferedSource.java:73)
at okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:113)
at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:97)
at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:262)
at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:201)
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141)
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Error 2: This is an intermittent error that prevents the executor pods from running:
org.apache.spark.SparkException: External scheduler cannot be instantiated
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2747)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:492)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2486)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:930)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:921)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)
at com.capitalone.quantum.spark.core.QuantumSession$.initialize(QuantumSession.scala:62)
at com.capitalone.quantum.spark.core.QuantumSession$.getSparkSession(QuantumSession.scala:80)
at com.capitalone.quantum.workflow.WorkflowApp$.getSession(WorkflowApp.scala:116)
at com.capitalone.quantum.workflow.WorkflowApp$.main(WorkflowApp.scala:90)
at com.capitalone.quantum.workflow.WorkflowApp.main(WorkflowApp.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Pod] with name: [myapp-ef79db3d9f4831bf85bda14145fdf113-driver-driver] in namespace: [default] failed.
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:62)
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:71)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:228)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:184)
at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterSchedulerBackend.<init>(KubernetesClusterSchedulerBackend.scala:70)
at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:120)
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2741)
... 11 more
Caused by: java.net.UnknownHostException: kubernetes.default.svc: Try again
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at okhttp3.Dns$1.lookup(Dns.java:39)
at okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress(RouteSelector.java:171)
at okhttp3.internal.connection.RouteSelector.nextProxy(RouteSelector.java:137)
at okhttp3.internal.connection.RouteSelector.next(RouteSelector.java:82)
at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:171)
at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)
at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185)
at okhttp3.RealCall.execute(RealCall.java:69)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:377)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:343)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:312)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:295)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:783)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:217)
... 15 more
2018-03-09 15:00:39 INFO AbstractConnector:318 - Stopped Spark@5f59185e{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2018-03-09 15:00:39 INFO SparkUI:54 - Stopped Spark web UI at http://myapp-ef79db3d9f4831bf85bda14145fdf113-driver-svc.default.svc:4040
2018-03-09 15:00:39 INFO MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2018-03-09 15:00:39 INFO MemoryStore:54 - MemoryStore cleared
2018-03-09 15:00:39 INFO BlockManager:54 - BlockManager stopped
2018-03-09 15:00:39 INFO BlockManagerMaster:54 - BlockManagerMaster stopped
2018-03-09 15:00:39 WARN MetricsSystem:66 - Stopping a MetricsSystem that is not running
2018-03-09 15:00:39 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2018-03-09 15:00:39 INFO SparkContext:54 - Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: External scheduler cannot be instantiated
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2747)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:492)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2486)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:930)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:921)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)
at com.capitalone.quantum.spark.core.QuantumSession$.initialize(QuantumSession.scala:62)
at com.capitalone.quantum.spark.core.QuantumSession$.getSparkSession(QuantumSession.scala:80)
at com.capitalone.quantum.workflow.WorkflowApp$.getSession(WorkflowApp.scala:116)
at com.capitalone.quantum.workflow.WorkflowApp$.main(WorkflowApp.scala:90)
at com.capitalone.quantum.workflow.WorkflowApp.main(WorkflowApp.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Pod] with name: [myapp-ef79db3d9f4831bf85bda14145fdf113-driver] in namespace: [default] failed.
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:62)
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:71)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:228)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:184)
at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterSchedulerBackend.<init>(KubernetesClusterSchedulerBackend.scala:70)
at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:120)
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2741)
... 11 more
Caused by: java.net.UnknownHostException: kubernetes.default.svc: Try again
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at okhttp3.Dns$1.lookup(Dns.java:39)
at okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress(RouteSelector.java:171)
at okhttp3.internal.connection.RouteSelector.nextProxy(RouteSelector.java:137)
at okhttp3.internal.connection.RouteSelector.next(RouteSelector.java:82)
at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:171)
at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)
at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185)
at okhttp3.RealCall.execute(RealCall.java:69)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:377)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:343)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:312)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:295)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:783)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:217)
... 15 more
2018-03-09 15:00:39 INFO ShutdownHookManager:54 - Shutdown hook called
2018-03-09 15:00:39 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-5bd85c96-d689-4c53-a0b3-1eadd32357cb
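The innermost cause in the trace above is `java.net.UnknownHostException: kubernetes.default.svc: Try again`, which suggests a temporary name-resolution failure inside the driver pod rather than a problem in Spark itself. To check whether the lookup really is intermittent, something like the following could be run from inside the pod. This is only a minimal sketch: the class name `DnsCheck`, the helper `resolveWithRetry`, and the retry count and delay are hypothetical choices for illustration, not part of Spark or the fabric8 client.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.security.Security;

public class DnsCheck {
    // Hypothetical helper: try to resolve `host` up to `attempts` times,
    // sleeping `delayMillis` between failed attempts.
    // Returns the resolved addresses, or an empty array if every attempt fails.
    static InetAddress[] resolveWithRetry(String host, int attempts, long delayMillis) {
        for (int i = 1; i <= attempts; i++) {
            try {
                return InetAddress.getAllByName(host);
            } catch (UnknownHostException e) {
                System.err.println("Attempt " + i + " failed: " + e.getMessage());
                if (i < attempts) {
                    try {
                        Thread.sleep(delayMillis);
                    } catch (InterruptedException ie) {
                        // Restore the interrupt flag and stop retrying.
                        Thread.currentThread().interrupt();
                        break;
                    }
                }
            }
        }
        return new InetAddress[0];
    }

    public static void main(String[] args) {
        // The JVM caches failed lookups for a short time by default;
        // disable that before the first lookup so each retry queries DNS again.
        Security.setProperty("networkaddress.cache.negative.ttl", "0");

        // Default to the host name from the stack trace.
        String host = args.length > 0 ? args[0] : "kubernetes.default.svc";
        InetAddress[] addrs = resolveWithRetry(host, 5, 1000L);
        if (addrs.length == 0) {
            System.out.println("Could not resolve " + host);
        } else {
            for (InetAddress a : addrs) {
                System.out.println(host + " -> " + a.getHostAddress());
            }
        }
    }
}
```

If the first attempt fails and a later one succeeds, that would be consistent with the intermittent behaviour described for Error 2 and would point at the cluster DNS service rather than at the application.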