
Scala: How to process Avro-formatted messages from Kafka?


The kafka-console-consumer command below works and outputs the expected JSON data. I am trying to achieve the same programmatically with Spark Streaming:

kafka-console-consumer.sh --zookeeper host.xxxx.com:2181,host.xxxx.com:2181 --topic mytopic --formatter CustomAvroMessageFormatter --property "formatter-schema-file= schema.txt" > /var/tmp/myfile.json&
I am able to read messages from the above topic programmatically with Spark Streaming, using the Scala code below, which runs fine:

import kafka.serializer.StringDecoder
import org.apache.spark.SparkContext
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object ConsumeTest {

  def main(args: Array[String]) {
    val sc = new SparkContext("local[*]", "ConsumeKafkaMsg")
    sc.setLogLevel("ERROR")
    val ssc = new StreamingContext(sc, Seconds(1))

    // Kafka broker to read from
    val kafkaParams = Map("metadata.broker.list" -> "brokername:9092")
    val topics = Set("mytopic")

    // Direct stream of (key, value) pairs; keep only the message value
    val lines = KafkaUtils.createDirectStream[
      String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics).map(_._2)

    lines.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
However, the program above prints the messages in binary form, like this:


��Cߣ�ߕ'윺~�_,��M˶/��Ѯ!

You have two options, and both require fairly involved coding, which is fine, isn't it?

Write your own custom Kafka Decoder and use it where the example uses StringDecoder.
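A minimal sketch of such a decoder, assuming the Avro schema lives in a local file named `schema.txt` (the same file the console formatter uses; the path and class name here are assumptions, not part of the original answer):

```scala
import java.io.File

import kafka.serializer.Decoder
import kafka.utils.VerifiableProperties
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericDatumReader, GenericRecord}
import org.apache.avro.io.DecoderFactory

// Hypothetical decoder that turns raw Avro bytes into GenericRecord.
// The VerifiableProperties constructor parameter is required by the
// Kafka 0.8 Decoder contract, even if unused.
class AvroDecoder(props: VerifiableProperties = null) extends Decoder[GenericRecord] {
  private val schema = new Schema.Parser().parse(new File("schema.txt"))
  private val reader = new GenericDatumReader[GenericRecord](schema)

  override def fromBytes(bytes: Array[Byte]): GenericRecord = {
    val decoder = DecoderFactory.get().binaryDecoder(bytes, null)
    reader.read(null, decoder)
  }
}
```

With this in place, the stream from the question would be created with `createDirectStream[String, GenericRecord, StringDecoder, AvroDecoder]`, and calling `toString` on each `GenericRecord` yields a JSON rendering of the message.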

After the dataset has been loaded for a batch, transform it using the foreach operator, or apply the conversion as part of the pipeline using a map transformation.
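The map-based variant could look like the sketch below: read the raw bytes with Kafka's built-in DefaultDecoder, then parse them with Avro's GenericDatumReader inside a map step. The `schema.txt` path is an assumption; the broker and topic settings are the ones from the question.

```scala
import java.io.File

import kafka.serializer.DefaultDecoder
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericDatumReader, GenericRecord}
import org.apache.avro.io.DecoderFactory
import org.apache.spark.streaming.kafka.KafkaUtils

// Pull the value as raw bytes instead of a (mis-decoded) String
val raw = KafkaUtils.createDirectStream[
  Array[Byte], Array[Byte], DefaultDecoder, DefaultDecoder](ssc, kafkaParams, topics)

val records = raw.map { case (_, bytes) =>
  // Parse the schema inside the task so it is not captured in the closure;
  // schema.txt must be readable on every executor.
  val schema = new Schema.Parser().parse(new File("schema.txt"))
  val reader = new GenericDatumReader[GenericRecord](schema)
  reader.read(null, DecoderFactory.get().binaryDecoder(bytes, null))
}

// GenericRecord.toString renders the record as JSON
records.map(_.toString).print()
```

Re-parsing the schema per message is wasteful; in practice you would broadcast the schema string once and build the reader per partition, but the shape of the pipeline is the same.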

You could also consider using a library for this.