Apache Spark: H2O Sparkling Water error when converting a large Spark DataFrame to an H2O Frame

Tags: apache-spark, h2o, sparklyr, sparkling-water

When I try to convert a Spark DataFrame to an H2O Frame, I get the error below. It appears to be related to the size of the DataFrame, because when I shrink it, the converter between Spark and H2O works fine.

Do I need to change any configuration to convert a large Spark DataFrame to H2O with Sparkling Water? In my configuration I already give the driver and the executors the maximum memory, so this should not be a memory issue.

The R code I am using is:

training <- as_h2o_frame(sc, final1, strict_version_check = FALSE)
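
For context, here is a minimal sketch of the surrounding setup that this call assumes. The master URL, the memory values, and the iris source data are placeholders for illustration, not details from the question:

library(sparklyr)
library(rsparkling)

# Placeholder memory settings; sparklyr passes driver memory via the shell option.
config <- spark_config()
config$`sparklyr.shell.driver-memory` <- "16G"
config$spark.executor.memory <- "16G"

sc <- spark_connect(master = "yarn-client", config = config)

# final1 stands in for the question's Spark DataFrame.
final1 <- sdf_copy_to(sc, iris, "final1", overwrite = TRUE)
training <- as_h2o_frame(sc, final1, strict_version_check = FALSE)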

To forward Jakub's comment so it is easier to find:


Your H2O cloud does not seem to be initialized properly. Please check the README at github.com/h2oai/rsparkling#spark-connection.
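
For what it's worth, a minimal sketch of the initialization step that README section describes, assuming an rsparkling version from this era, where h2o_context() attaches the H2O cloud to the Spark connection:

library(sparklyr)
library(rsparkling)

sc <- spark_connect(master = "local")  # placeholder master URL

# Should report a healthy H2O cloud; if it errors here, fix the cluster
# setup before calling as_h2o_frame().
h2o_context(sc, strict_version_check = FALSE)

With the H2O cloud up, the as_h2o_frame() conversion can be retried. For reference, the full error trace from the question: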
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 4 in stage 95.1 failed 4 times, most recent failure: Lost task 4.3 in stage 95.1 (TID 4050, 10.0.0.9): java.lang.ArrayIndexOutOfBoundsException: 65535
                at water.DKV.get(DKV.java:202)
                at water.DKV.get(DKV.java:175)
                at water.Key.get(Key.java:83)
                at water.fvec.Frame.createNewChunks(Frame.java:896)
                at water.fvec.FrameUtils$class.createNewChunks(FrameUtils.scala:43)
                at water.fvec.FrameUtils$.createNewChunks(FrameUtils.scala:70)
                at org.apache.spark.h2o.backends.internal.InternalWriteConverterCtx.createChunks(InternalWriteConverterCtx.scala:29)
                at org.apache.spark.h2o.converters.SparkDataFrameConverter$.org$apache$spark$h2o$converters$SparkDataFrameConverter$$perSQLPartition(SparkDataFrameConverter.scala:95)
                at org.apache.spark.h2o.converters.SparkDataFrameConverter$$anonfun$toH2OFrame$1$$anonfun$apply$2.apply(SparkDataFrameConverter.scala:74)
                at org.apache.spark.h2o.converters.SparkDataFrameConverter$$anonfun$toH2OFrame$1$$anonfun$apply$2.apply(SparkDataFrameConverter.scala:74)
                at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
                at org.apache.spark.scheduler.Task.run(Task.scala:86)
                at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
                at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
                at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
                at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1454)
                at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1442)
                at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1441)
                at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
                at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
                at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1441)
                at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
                at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
                at scala.Option.foreach(Option.scala:257)
                at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:811)
                at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1667)
                at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1622)
                at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1611)
                at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
                at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632)
                at org.apache.spark.SparkContext.runJob(SparkContext.scala:1873)
                at org.apache.spark.SparkContext.runJob(SparkContext.scala:1886)
                at org.apache.spark.SparkContext.runJob(SparkContext.scala:1906)
                at org.apache.spark.h2o.converters.WriteConverterCtxUtils$.convert(WriteConverterCtxUtils.scala:83)
                at org.apache.spark.h2o.converters.SparkDataFrameConverter$.toH2OFrame(SparkDataFrameConverter.scala:74)
                at org.apache.spark.h2o.H2OContext.asH2OFrame(H2OContext.scala:145)
                at org.apache.spark.h2o.H2OContext.asH2OFrame(H2OContext.scala:143)
                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                at java.lang.reflect.Method.invoke(Method.java:498)
                at sparklyr.Invoke$.invoke(invoke.scala:102)
                at sparklyr.StreamHandler$.handleMethodCall(stream.scala:89)
                at sparklyr.StreamHandler$.read(stream.scala:54)
                at sparklyr.BackendHandler.channelRead0(handler.scala:49)
                at sparklyr.BackendHandler.channelRead0(handler.scala:14)
                at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
                at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
                at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
                at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
                at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
                at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
                at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
                at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
                at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
                at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
                at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
                at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
                at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
                at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
                at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
                at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
                at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
                at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 65535
                at water.DKV.get(DKV.java:202)
                at water.DKV.get(DKV.java:175)
                at water.Key.get(Key.java:83)
                at water.fvec.Frame.createNewChunks(Frame.java:896)
                at water.fvec.FrameUtils$class.createNewChunks(FrameUtils.scala:43)
                at water.fvec.FrameUtils$.createNewChunks(FrameUtils.scala:70)
                at org.apache.spark.h2o.backends.internal.InternalWriteConverterCtx.createChunks(InternalWriteConverterCtx.scala:29)
                at org.apache.spark.h2o.converters.SparkDataFrameConverter$.org$apache$spark$h2o$converters$SparkDataFrameConverter$$perSQLPartition(SparkDataFrameConverter.scala:95)
                at org.apache.spark.h2o.converters.SparkDataFrameConverter$$anonfun$toH2OFrame$1$$anonfun$apply$2.apply(SparkDataFrameConverter.scala:74)
                at org.apache.spark.h2o.converters.SparkDataFrameConverter$$anonfun$toH2OFrame$1$$anonfun$apply$2.apply(SparkDataFrameConverter.scala:74)
                at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
                at org.apache.spark.scheduler.Task.run(Task.scala:86)
                at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
                at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
                ... 1 more