
What does this Spark Streaming output mean?

Tags: java, apache-spark, spark-streaming

While running a Kafka project integrated with Spark, I am not getting the output I expect.

I cannot understand what the output below indicates. I am running the job from Eclipse.

I cannot see the data sent by the producer anywhere.

    17/03/07 17:06:44 INFO JobScheduler: Starting job streaming job 1488886604000 ms.0 from job set of time 1488886604000 ms
    17/03/07 17:06:44 INFO SparkContext: Starting job: count at CustomerKafkaConsumerThread.java:83
    17/03/07 17:06:44 INFO DAGScheduler: Got job 1 (count at CustomerKafkaConsumerThread.java:83) with 1 output partitions
    17/03/07 17:06:44 INFO DAGScheduler: Final stage: ResultStage 1 (count at CustomerKafkaConsumerThread.java:83)
    17/03/07 17:06:44 INFO DAGScheduler: Parents of final stage: List()
    17/03/07 17:06:44 INFO DAGScheduler: Missing parents: List()
    17/03/07 17:06:44 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[3] at map at CustomerKafkaConsumerThread.java:75), which has no missing parents
    17/03/07 17:06:44 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.2 KB, free 961.9 MB)
    17/03/07 17:06:44 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1940.0 B, free 961.9 MB)
    17/03/07 17:06:44 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:51356 (size: 1940.0 B, free: 961.9 MB)
    17/03/07 17:06:44 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
    17/03/07 17:06:44 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[3] at map at CustomerKafkaConsumerThread.java:75)
    17/03/07 17:06:44 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
    17/03/07 17:06:44 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, localhost, partition 0,ANY, 1995 bytes)
    17/03/07 17:06:44 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
    17/03/07 17:06:44 INFO KafkaRDD: Beginning offset 0 is the same as ending offset skipping iot 0
    17/03/07 17:06:44 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 953 bytes result sent to driver
    17/03/07 17:06:44 INFO DAGScheduler: ResultStage 1 (count at CustomerKafkaConsumerThread.java:83) finished in 0.018 s
    17/03/07 17:06:44 INFO DAGScheduler: Job 1 finished: count at CustomerKafkaConsumerThread.java:83, took 0.040656 s
    17/03/07 17:06:44 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 16 ms on localhost (1/1)
    17/03/07 17:06:44 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
    17/03/07 17:06:44 INFO JobScheduler: Finished job streaming job 1488886604000 ms.0 from job set of time 1488886604000 ms
    17/03/07 17:06:44 INFO JobScheduler: Total delay: 0.848 s for time 1488886604000 ms (execution: 0.097 s)
    17/03/07 17:06:44 INFO ReceivedBlockTracker: Deleting batches ArrayBuffer()
    17/03/07 17:06:44 INFO InputInfoTracker: remove old batch metadata: 
    17/03/07 17:06:44 INFO MapPartitionsRDD: Removing RDD 1 from persistence list
    17/03/07 17:06:44 INFO KafkaRDD: Removing RDD 0 from persistence list
    17/03/07 17:06:44 INFO ReceivedBlockTracker: Deleting batches ArrayBuffer()
    17/03/07 17:06:44 INFO InputInfoTracker: remove old batch metadata: 
    17/03/07 17:06:44 INFO BlockManager: Removing RDD 1
    17/03/07 17:06:44 INFO BlockManager: Removing RDD 0
    17/03/07 17:06:46 INFO JobScheduler: Added jobs for time 1488886606000 ms
Here is my Spark code:

import java.io.FileInputStream;

import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.regex.Pattern;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

import kafka.serializer.StringDecoder;
import properties.PropertyCache;
import scala.Tuple2;

public class CustomerKafkaConsumerThread implements Serializable  {
    String broker;
    String jars[]={"C:\\iot-kafka-producer-1.0.0.jar"};
    private static final Pattern SPACE = Pattern.compile(" ");

    public void sparkKafkaConsumer(String topics, String broker) throws InterruptedException {
        System.out.println("INSIDE SPARK KAFKACONSUMER METHOD..........");

        this.broker = broker;
        SparkConf conf = new SparkConf().setAppName("CustomerKafkaConsumerThread")
        .set("spark.local.ip", "10.41.81.17")
        .setMaster("local[*]").setJars(jars);
        /* .setJars(new String[]{
                "C:/Users/pusarla/workspace/spark/iot-kafka-producer/target/iot-kafka-producer-1.0.0.jar"
        }); */

        JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(2000));

        Map<String, String> kafkaParams = new HashMap<String, String>();
        kafkaParams.put("metadata.broker.list", broker);

        Set<String> topicSet = Collections.singleton(topics);


        System.out.println("Creating direct kafka stream with brokers and topics..........");
        // Create direct kafka stream with brokers and topics
        JavaPairInputDStream<String, String> messages = KafkaUtils.createDirectStream(jssc, String.class, String.class,
                StringDecoder.class, StringDecoder.class, kafkaParams, topicSet);


        JavaDStream<String> lines = messages.map(new Function<Tuple2<String, String>, String>() {
            public String call(Tuple2<String, String> tuple2) {
                return tuple2._2();
            }
        });

        lines.foreachRDD(rdd -> {

            if (rdd.count() > 0) {
                List<String> strArray = rdd.collect();
                Iterator<String> topicData=strArray.iterator();
                while(topicData.hasNext()){

                    System.out.println("PRINTING PTINTING >>>>>>>>>>>>" +topicData.next());
                }


            }
        });

        jssc.start();
        jssc.awaitTermination();

    }
}
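
The log line "KafkaRDD: Beginning offset 0 is the same as ending offset" means the offset range computed for that partition in this batch is empty (0 to 0), so rdd.count() returns 0 and the foreachRDD body prints nothing; in other words, no records had been read from the topic by that point. A minimal sketch, assuming the records might have been produced before the streaming job started: adding auto.offset.reset to the kafkaParams map tells the 0.8-style direct stream to begin at the earliest available offset instead of the latest (the default), so pre-existing records are included in the first batch.

    // Sketch of the kafkaParams map from sparkKafkaConsumer above, with one extra entry.
    Map<String, String> kafkaParams = new HashMap<String, String>();
    kafkaParams.put("metadata.broker.list", broker);   // broker is the method argument above
    kafkaParams.put("auto.offset.reset", "smallest");  // start from the earliest offset, not the latest

If every batch still reports an empty offset range while the producer is running, the more likely mismatch is the topic name or broker address used on the producer side.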

Please post your Spark code here; without it we cannot help you.
Hi @T.Gawęda, I have added the Spark code.
Post data to the Kafka topic and test again.
@OmkarPuttagunta I am sending data to the Kafka topic, but I cannot see it in the Spark consumer program. The only output I get in the Eclipse console is what I posted above.
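
A minimal standalone producer sketch that can be used to confirm messages actually reach the topic the streaming job reads from; the topic name "iot" is only assumed from the "skipping iot 0" log line above, and the broker address is a placeholder:

    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class TestProducer {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");               // placeholder broker address
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());

            KafkaProducer<String, String> producer = new KafkaProducer<String, String>(props);
            for (int i = 0; i < 10; i++) {
                // topic "iot" is assumed here from the KafkaRDD log line above
                producer.send(new ProducerRecord<String, String>("iot", "key-" + i, "message-" + i)).get();
            }
            producer.close();
        }
    }

If these test messages show up in the streaming console but the original producer's messages do not, the problem is on the producing side (topic name, broker address, or a missing flush/close) rather than in the Spark Streaming code.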