Apache Spark: constant IndexOutOfBoundsException warnings

I am running Spark on a large EMR cluster (
master.type=r5.4xlarge, core.count=150, core.type=r5.4xlarge
). Fortunately, the job completes, but it constantly emits warnings like this:

20/04/30 14:30:58 INFO TaskSetManager: Finished task 4002.0 in stage 171.0 (TID 1024115) in 22083 ms on ip-10-1-56-55.cloud-internal.rovio.com (executor 195) (4261/10000)
20/04/30 14:30:58 WARN ServletHandler: 
javax.servlet.ServletException: java.lang.IndexOutOfBoundsException: 5226
    at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:489)
    at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
    at org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
    at org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
    at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:166)
    at org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
    at org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
    at org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
    at org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
    at org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
    at org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
    at org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
    at org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
    at org.spark_project.jetty.server.Server.handle(Server.java:539)
    at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:333)
    at org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
    at org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
    at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108)
    at org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
    at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
    at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
    at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
    at org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
    at org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IndexOutOfBoundsException: 5226
    at scala.collection.immutable.Vector.checkRangeConvert(Vector.scala:132)
    at scala.collection.immutable.Vector.apply(Vector.scala:122)
    at org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply$mcDJ$sp(AppStatusStore.scala:295)
    at org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply(AppStatusStore.scala:294)
    at org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply(AppStatusStore.scala:294)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofLong.foreach(ArrayOps.scala:246)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.mutable.ArrayOps$ofLong.map(ArrayOps.scala:246)
    at org.apache.spark.status.AppStatusStore.scanTasks$1(AppStatusStore.scala:294)
    at org.apache.spark.status.AppStatusStore.taskSummary(AppStatusStore.scala:327)
    at org.apache.spark.status.api.v1.StagesResource$$anonfun$taskSummary$1.apply(StagesResource.scala:90)
    at org.apache.spark.status.api.v1.StagesResource$$anonfun$taskSummary$1.apply(StagesResource.scala:80)
    at org.apache.spark.status.api.v1.BaseAppResource$$anonfun$withUI$1.apply(ApiRootResource.scala:140)
    at org.apache.spark.status.api.v1.BaseAppResource$$anonfun$withUI$1.apply(ApiRootResource.scala:135)
    at org.apache.spark.ui.SparkUI.withSparkUI(SparkUI.scala:106)
    at org.apache.spark.status.api.v1.BaseAppResource$class.withUI(ApiRootResource.scala:135)
    at org.apache.spark.status.api.v1.StagesResource.withUI(StagesResource.scala:30)
    at org.apache.spark.status.api.v1.StagesResource.taskSummary(StagesResource.scala:80)
    at sun.reflect.GeneratedMethodAccessor240.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
    at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:205)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
    at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
    at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
    at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
    at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
    ... 28 more
20/04/30 14:30:58 WARN HttpChannel: //ip-10-1-56-118.cloud-internal.rovio.com:4040/api/v1/applications/application_1588233114567_0003/stages/165/0/taskSummary?proxyapproved=true
javax.servlet.ServletException: java.lang.IndexOutOfBoundsException: 5226
    at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:489)
    at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
    at org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
    at org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
    at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:166)
    at org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
    at org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
    at org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
    at org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
    at org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
    at org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
    at org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
    at org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
    at org.spark_project.jetty.server.Server.handle(Server.java:539)
    at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:333)
    at org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
    at org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
    at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108)
    at org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
    at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
    at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
    at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
    at org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
    at org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IndexOutOfBoundsException: 5226
    at scala.collection.immutable.Vector.checkRangeConvert(Vector.scala:132)
    at scala.collection.immutable.Vector.apply(Vector.scala:122)
    at org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply$mcDJ$sp(AppStatusStore.scala:295)
    at org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply(AppStatusStore.scala:294)
    at org.apache.spark.status.AppStatusStore$$anonfun$scanTasks$1$1.apply(AppStatusStore.scala:294)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofLong.foreach(ArrayOps.scala:246)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.mutable.ArrayOps$ofLong.map(ArrayOps.scala:246)
    at org.apache.spark.status.AppStatusStore.scanTasks$1(AppStatusStore.scala:294)
    at org.apache.spark.status.AppStatusStore.taskSummary(AppStatusStore.scala:327)
    at org.apache.spark.status.api.v1.StagesResource$$anonfun$taskSummary$1.apply(StagesResource.scala:90)
    at org.apache.spark.status.api.v1.StagesResource$$anonfun$taskSummary$1.apply(StagesResource.scala:80)
    at org.apache.spark.status.api.v1.BaseAppResource$$anonfun$withUI$1.apply(ApiRootResource.scala:140)
    at org.apache.spark.status.api.v1.BaseAppResource$$anonfun$withUI$1.apply(ApiRootResource.scala:135)
    at org.apache.spark.ui.SparkUI.withSparkUI(SparkUI.scala:106)
    at org.apache.spark.status.api.v1.BaseAppResource$class.withUI(ApiRootResource.scala:135)
    at org.apache.spark.status.api.v1.StagesResource.withUI(StagesResource.scala:30)
    at org.apache.spark.status.api.v1.StagesResource.taskSummary(StagesResource.scala:80)
    at sun.reflect.GeneratedMethodAccessor240.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
    at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:205)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
    at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
    at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
    at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
    at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
    ... 28 more
20/04/30 14:30:58 INFO TaskSetManager: Starting task 5461.0 in stage 171.0 (TID 1025574, ip-10-1-56-113.cloud-internal.rovio.com, executor 192, partition 5461, PROCESS_LOCAL, 7767 bytes)
20/04/30 14:30:58 INFO TaskSetManager: Finished task 4476.0 in stage 171.0 (TID 1024589) in 15664 ms on ip-10-1-56-113.cloud-internal.rovio.com (executor 192) (4262/10000)
20/04/30 14:30:58 INFO TaskSetManager: Starting task 5462.0 in stage 171.0 (TID 1025575, ip-10-1-56-179.cloud-internal.rovio.com, executor 45, partition 5462, PROCESS_LOCAL, 7767 bytes)
20/04/30 14:30:58 INFO TaskSetManager: Finished task 3826.0 in s
  • Is there a known cause for these warnings?
  • Is there a way to debug further and find the root cause?
  • Can these warnings cause failures or instability?

Did you ever find out what causes this? I am running into the same problem.