Java Spark collect()网络故障

Java Spark collect()网络故障,java,apache-spark,netty,Java,Apache Spark,Netty,我刚从Spark开始。我正试图使用它为重复数据消除应用程序实现分布式处理。我现在正在处理的部分应该得到一个RDD列表,其中包含对,这是记录的列。这个过程应该是高度可并行的,但目前我正在处理local。当调试map函数中的所有内容时,当collect尝试执行时,尽管所有内容都中断了:(我不知道为什么,它甚至没有在集群上运行)。 这是代码的一部分,如果您需要了解更多信息,请告诉我: JavaRDD<Pair<String[],String[]>> rddData = java

我刚从Spark开始。我正试图使用它为重复数据消除应用程序实现分布式处理。我现在正在处理的部分应该得到一个RDD列表,其中包含
,这是记录的列。这个过程应该是高度可并行的,但目前我正在处理
local
。当调试map函数中的所有内容时,当collect尝试执行时,尽管所有内容都中断了:(我不知道为什么,它甚至没有在集群上运行)。 这是代码的一部分,如果您需要了解更多信息,请告诉我:

JavaRDD<Pair<String[],String[]>> rddData = javaSparkContext.parallelize(data.stream()
                .map(p->Pair.of(p.getLeft().fields(),p.getRight().fields()))
                .collect(Collectors.toList()),2);

        final int featureVecSize = featureCombinationToFeatureIndex.size();
        List<double[]> distributedArchFeatures = rddData.map(recordPair ->{
            String[] valuesA = recordPair.getLeft();
            String[] valuesB = recordPair.getRight();
            double[] scores = new double[featureVecSize];
            for(Map.Entry<Triple<Integer, ComparisonFeature, ComparisonModifier>, Integer> t : featureCombinationToFeatureIndex.entrySet()) {
                int index = t.getKey().getLeft();
                ComparisonFeature feature = t.getKey().getMiddle();
                ComparisonModifier modifier = t.getKey().getRight();
                double score = modifier.calculateScore(feature, valuesA[index], valuesB[index]);
                scores[t.getValue()] = score;
            }
            return scores;
        }).collect();

这是由于netty依赖冲突造成的。请检查项目的依赖关系树,并使用spark需要的依赖关系树

是的,它是netty版本,但这是对不同模块的依赖关系,因此我还不知道这是否是真正的原因。
[analyzerbeans-pool1-thread-27] INFO com.hi.identify7.execution.simple.SimpleMatchingExecutionContext - Scoring all matches...
[analyzerbeans-pool1-thread-27] INFO org.apache.spark.SparkContext - Starting job: collect at SimpleMatchingExecutionContext.java:177
[dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Got job 0 (collect at SimpleMatchingExecutionContext.java:177) with 2 output partitions
[dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Final stage: ResultStage 0 (collect at SimpleMatchingExecutionContext.java:177)
[dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Parents of final stage: List()
[dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Missing parents: List()
[dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SimpleMatchingExecutionContext.java:165), which has no missing parents
[dag-scheduler-event-loop] INFO org.apache.spark.storage.memory.MemoryStore - Block broadcast_0 stored as values in memory (estimated size 4.9 KB, free 2.5 GB)
[dag-scheduler-event-loop] INFO org.apache.spark.storage.memory.MemoryStore - Block broadcast_0_piece0 stored as bytes in memory (estimated size 2.7 KB, free 2.5 GB)
[dispatcher-event-loop-0] INFO org.apache.spark.storage.BlockManagerInfo - Added broadcast_0_piece0 in memory on 10.209.1.88:52600 (size: 2.7 KB, free: 2.5 GB)
[dag-scheduler-event-loop] INFO org.apache.spark.SparkContext - Created broadcast 0 from broadcast at DAGScheduler.scala:1006
[dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SimpleMatchingExecutionContext.java:165) (first 15 tasks are for partitions Vector(0, 1))
[dag-scheduler-event-loop] INFO org.apache.spark.scheduler.TaskSchedulerImpl - Adding task set 0.0 with 2 tasks
[dispatcher-event-loop-1] WARN org.apache.spark.scheduler.TaskSetManager - Stage 0 contains a task of very large size (11664 KB). The maximum recommended task size is 100 KB.
[dispatcher-event-loop-1] INFO org.apache.spark.scheduler.TaskSetManager - Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 11944196 bytes)
[Executor task launch worker for task 0] INFO org.apache.spark.executor.Executor - Running task 0.0 in stage 0.0 (TID 0)
[Executor task launch worker for task 0] INFO org.apache.spark.storage.memory.MemoryStore - Block taskresult_0 stored as bytes in memory (estimated size 185.3 MB, free 2.3 GB)
[dispatcher-event-loop-1] INFO org.apache.spark.storage.BlockManagerInfo - Added taskresult_0 in memory on 10.209.1.88:52600 (size: 185.3 MB, free: 2.3 GB)
[Executor task launch worker for task 0] INFO org.apache.spark.executor.Executor - Finished task 0.0 in stage 0.0 (TID 0). 194298473 bytes result sent via BlockManager)
[dispatcher-event-loop-2] INFO org.apache.spark.scheduler.TaskSetManager - Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 11932301 bytes)
[Executor task launch worker for task 1] INFO org.apache.spark.executor.Executor - Running task 1.0 in stage 0.0 (TID 1)
[task-result-getter-0] INFO org.apache.spark.network.client.TransportClientFactory - Successfully created connection to /10.209.1.88:52600 after 59 ms (0 ms spent in bootstraps)
[shuffle-client-4-1] ERROR org.apache.spark.network.client.TransportClient - Failed to send RPC 7173534709356817937 to /10.209.1.88:52600: java.lang.AbstractMethodError: org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
java.lang.AbstractMethodError: org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
    at io.netty.util.ReferenceCountUtil.touch(ReferenceCountUtil.java:73)
    at io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:810)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
    at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:111)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:816)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
    at io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:302)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
    at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:38)
    at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1089)
    at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1136)
    at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1078)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:163)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:462)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    at java.lang.Thread.run(Thread.java:748)
[shuffle-client-4-1] ERROR org.apache.spark.network.shuffle.OneForOneBlockFetcher - Failed while starting block fetches
java.io.IOException: Failed to send RPC 7173534709356817937 to /10.209.1.88:52600: java.lang.AbstractMethodError: org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
    at org.apache.spark.network.client.TransportClient.lambda$sendRpc$2(TransportClient.java:237)
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
    at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
    at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
    at io.netty.util.internal.PromiseNotificationUtil.tryFailure(PromiseNotificationUtil.java:64)
    at io.netty.channel.AbstractChannelHandlerContext.notifyOutboundHandlerException(AbstractChannelHandlerContext.java:837)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:740)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:816)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
    at io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:302)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
    at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:38)
    at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1089)
    at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1136)
    at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1078)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:163)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:462)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.AbstractMethodError: org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
    at io.netty.util.ReferenceCountUtil.touch(ReferenceCountUtil.java:73)
    at io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:810)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
    at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:111)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
    ... 17 more
[shuffle-client-4-1] INFO org.apache.spark.network.shuffle.RetryingBlockFetcher - Retrying fetch (1/3) for 1 outstanding blocks after 5000 ms
[Block Fetch Retry-6-1] INFO org.apache.spark.network.client.TransportClientFactory - Found inactive connection to /10.209.1.88:52600, creating a new one.
[Block Fetch Retry-6-1] INFO org.apache.spark.network.client.TransportClientFactory - Successfully created connection to /10.209.1.88:52600 after 1 ms (0 ms spent in bootstraps)
[shuffle-client-4-1] ERROR org.apache.spark.network.client.TransportClient - Failed to send RPC 6680513752030028512 to /10.209.1.88:52600: java.lang.AbstractMethodError: org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
java.lang.AbstractMethodError: org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
    at io.netty.util.ReferenceCountUtil.touch(ReferenceCountUtil.java:73)
    at io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:810)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
    at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:111)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:816)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
    at io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:302)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
    at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:38)
    at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1089)
    at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1136)
    at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1078)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:163)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:462)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    at java.lang.Thread.run(Thread.java:748)
[shuffle-client-4-1] ERROR org.apache.spark.network.shuffle.OneForOneBlockFetcher - Failed while starting block fetches
java.io.IOException: Failed to send RPC 6680513752030028512 to /10.209.1.88:52600: java.lang.AbstractMethodError: org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
    at org.apache.spark.network.client.TransportClient.lambda$sendRpc$2(TransportClient.java:237)
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
    at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
    at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
    at io.netty.util.internal.PromiseNotificationUtil.tryFailure(PromiseNotificationUtil.java:64)
    at io.netty.channel.AbstractChannelHandlerContext.notifyOutboundHandlerException(AbstractChannelHandlerContext.java:837)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:740)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:816)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
    at io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:302)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
    at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:38)
    at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1089)
    at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1136)
    at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1078)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:163)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:462)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.AbstractMethodError: org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
    at io.netty.util.ReferenceCountUtil.touch(ReferenceCountUtil.java:73)
    at io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:810)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
    at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:111)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)