
Scala encoder problem - Spark Structured Streaming - only works in the REPL

scala, apache-spark, apache-kafka, spark-structured-streaming

I have a working flow that consumes and deserializes Kafka Avro messages using the Schema Registry. It works fine in the REPL, but when I try to compile it I get:

Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._  Support for serializing other types will be added in future releases.
[error]       .map(x => {
I'm not sure whether I need to modify my objects, but why would I need to if it works in the REPL?

import java.util.Properties
import io.confluent.kafka.schemaregistry.client.rest.RestService
import io.confluent.kafka.serializers.{AbstractKafkaAvroSerDeConfig, KafkaAvroDecoder, KafkaAvroDeserializerConfig}
import org.apache.avro.Schema

// schemaRegistryURL and subjectValueNameAgentRead are defined elsewhere in the original code
object AgentDeserializerWrapper {
  val props = new Properties()
  props.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, schemaRegistryURL)
  props.put(KafkaAvroDeserializerConfig.SPECIFIC_AVRO_READER_CONFIG, "true")
  val vProps = new kafka.utils.VerifiableProperties(props)
  val deser = new KafkaAvroDecoder(vProps)
  // Fetch the latest Avro schema for the subject from the Schema Registry
  val avro_schema = new RestService(schemaRegistryURL).getLatestVersion(subjectValueNameAgentRead)
  val messageSchema = new Schema.Parser().parse(avro_schema.getSchema)
}

    case class DeserializedFromKafkaRecord( value: String)

    import spark.implicits._

    val agentStringDF = spark
      .readStream
      .format("kafka")
      .option("subscribe", "agent")
      .options(kafkaParams)
      .load()
      .map(x => {
        DeserializedFromKafkaRecord(
          AgentDeserializerWrapper.deser
            .fromBytes(x.getAs[Array[Byte]]("value"), AgentDeserializerWrapper.messageSchema)
            .asInstanceOf[GenericData.Record]
            .toString
        )
      })

Adding .as[DeserializedFromKafkaRecord], in order to statically type the Dataset:

val agentStringDF = spark
      .readStream
      .format("kafka")
      .option("subscribe", "agent")
      .options(kafkaParams)
      .load()
      .as[DeserializedFromKafkaRecord]
      .map(x => {
        DeserializedFromKafkaRecord(
          AgentDeserializerWrapper.deser
            .fromBytes(x.getAs[Array[Byte]]("value"), AgentDeserializerWrapper.messageSchema)
            .asInstanceOf[GenericData.Record]
            .toString
        )
      })

I was also able to get it to work by moving case class DeserializedFromKafkaRecord(value: String) outside of the main object. I'm not entirely sure why that helps. It must be a scoping issue.
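
For reference, here is a minimal sketch of that working arrangement, with the case class declared at the top level of the file so that spark.implicits._ can derive an Encoder for it at compile time. The app name, bootstrap servers, and the inline string decoding are placeholders standing in for the question's kafkaParams and AgentDeserializerWrapper, not the exact original values.

import org.apache.spark.sql.SparkSession

// Top-level case class: the compiler can produce a TypeTag for it, so
// spark.implicits._ can derive an Encoder[DeserializedFromKafkaRecord]
case class DeserializedFromKafkaRecord(value: String)

object AgentStreamApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("agent-stream") // placeholder app name
      .getOrCreate()

    import spark.implicits._

    val agentStringDF = spark
      .readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // placeholder for kafkaParams
      .option("subscribe", "agent")
      .load()
      .map { row =>
        // Placeholder for the Avro decoding done via AgentDeserializerWrapper in the question
        DeserializedFromKafkaRecord(new String(row.getAs[Array[Byte]]("value"), "UTF-8"))
      }

    agentStringDF.writeStream
      .format("console")
      .start()
      .awaitTermination()
  }
}

The key point is only that DeserializedFromKafkaRecord is not nested inside the object that holds main; the REPL wraps user code differently, which may be why the nested definition compiled there but not in a regular build.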