Apache Spark - Kafka - Java: parallel processing of messages
Tags: java, apache-spark, apache-kafka, spark-streaming


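I retrieve messages from Kafka, filter them, and use the same messages for multiple use cases. Use cases 1 to 4 are independent of each other and could be computed in parallel. However, when I look at the logs, I see the computations running sequentially. How can I make them run in parallel? Any suggestions would be helpful.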

Try creating a separate Kafka topic for each of the 4 use cases. Then try creating 4 different Kafka DStreams.

I moved all the code into a for loop that iterates over the number of partitions in the Kafka topic, and I saw an improvement.

JavaPairReceiverInputDStream<String, byte[]> messages = KafkaUtils.createStream(...);
JavaPairDStream<String, byte[]> filteredMessages = filterValidMessages(messages);

JavaDStream<String> useCase1 = calculateUseCase1(filteredMessages);
JavaDStream<String> useCase2 = calculateUseCase2(filteredMessages);
JavaDStream<String> useCase3 = calculateUseCase3(filteredMessages);
JavaDStream<String> useCase4 = calculateUseCase4(filteredMessages);

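Thank you. However, it is the same message, and each use case consumes it differently. I will double-check your solution, because different messages go to different partitions based on the message key. In this case I have the same message: the first use case strips out the country, the second strips out the country type and aggregates, and so on.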
// One receiver stream per Kafka partition; each iteration builds its own pipeline.
for (int i = 0; i < numOfPartitions; i++) {
    JavaPairReceiverInputDStream<String, byte[]> messages =
        KafkaUtils.createStream(...);
    JavaPairDStream<String, byte[]> filteredMessages =
        filterValidMessages(messages);

    JavaDStream<String> useCase1 = calculateUseCase1(filteredMessages);
    JavaDStream<String> useCase2 = calculateUseCase2(filteredMessages);
    JavaDStream<String> useCase3 = calculateUseCase3(filteredMessages);
    JavaDStream<String> useCase4 = calculateUseCase4(filteredMessages);
}
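
To make the goal of the question concrete, here is a minimal plain-JDK sketch of the shape being asked for: one filtered batch fanned out to several independent use cases that run at the same time. It is only an analogy and not Spark API. The class ParallelUseCases, the helpers filterValid and runAll, and the sample messages are all hypothetical. Inside a Spark Streaming job the scheduling is done by Spark, so this does not replace either suggestion above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class ParallelUseCases {

    // Hypothetical stand-in for filterValidMessages: drops null or empty messages.
    static List<String> filterValid(List<String> messages) {
        List<String> valid = new ArrayList<>();
        for (String m : messages) {
            if (m != null && !m.isEmpty()) {
                valid.add(m);
            }
        }
        return valid;
    }

    // Filters once, then runs every use case on the same filtered batch,
    // each as its own task on a thread pool. Assumes at least one use case.
    static List<String> runAll(List<String> messages,
                               List<Function<List<String>, String>> useCases)
            throws Exception {
        List<String> filtered = filterValid(messages);
        ExecutorService pool = Executors.newFixedThreadPool(useCases.size());
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (Function<List<String>, String> useCase : useCases) {
                tasks.add(() -> useCase.apply(filtered));
            }
            List<String> results = new ArrayList<>();
            // invokeAll waits for all tasks and returns futures in task order.
            for (Future<String> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample batch; the empty message is filtered out.
        List<String> batch = Arrays.asList("us:retail", "", "de:wholesale");
        List<Function<List<String>, String>> useCases = Arrays.asList(
            msgs -> "count=" + msgs.size(),
            msgs -> "first=" + msgs.get(0));
        System.out.println(runAll(batch, useCases)); // prints [count=2, first=us:retail]
    }
}
```

The filtering happens once and its result is shared read-only by every task, which mirrors the situation in the question where all four use cases start from the same filteredMessages.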