Scala: writeStream() prints null values in every batch even though correct JSON data is sent to Kafka

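I am trying to parse JSON with a schema and print the values to the console, but writeStream() prints null in every column even though I am supplying correct data.

This is the data I send to the Kafka topic: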

{"stock":"SEE","buy":12,"sell":15,"profit":3,quantity:27,"loss":0,"gender":"M"}
{"stock":"SEE","buy":12,"sell":15,"profit":3,quantity:27,"loss":0,"gender":"M"}
{"stock":"SEE","buy":12,"sell":15,"profit":3,quantity:27,"loss":0,"gender":"M"}
Below is my Scala code:

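// Imports the snippet relies on; sparkSession is an existing SparkSession.
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.streaming.Trigger
import org.apache.spark.sql.types.{IntegerType, StringType, StructType}
import sparkSession.implicits._ // enables the $"json" column syntax

val readStreamDFInd = sparkSession.readStream.format("kafka").option("kafka.bootstrap.servers", "localhost:9092")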
  .option("subscribe", "IndiaStocks")
  .option("startingOffsets", "earliest")
  .load()

//readStreamDFInd.printSchema()
val readStreamDFUS = sparkSession.readStream.format("kafka").option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "USStocks")
  .option("startingOffsets", "earliest")
  .load()

val schema = new StructType()
  .add("stock", StringType)
  .add("buy", IntegerType)
  .add("sell", IntegerType)
  .add("profit", IntegerType)
  .add("quantity", IntegerType)
  .add("loss", IntegerType)
  .add("gender", StringType)

val stocksIndia = readStreamDFInd.selectExpr("CAST(value as STRING) as json").select(from_json($"json", schema).as("data")).select("data.*")
val stocksUSA = readStreamDFUS.selectExpr("CAST(value as STRING) as json").select(from_json($"json", schema).as("data")).select("data.*")
stocksIndia.printSchema()
stocksUSA.writeStream
  .format("console")
  .outputMode("append").trigger(Trigger.ProcessingTime("5 seconds"))
  .start()
  .awaitTermination()

The code itself works fine.

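According to the documentation of the from_json function, a null value is produced when the string cannot be parsed.

=> You are missing the quotes around the quantity field in your JSON string.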
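The behaviour described above can be reproduced without Kafka. The following is a minimal batch sketch, assuming a local Spark session; the object name FromJsonNullDemo, the helper parse, and the two-column schema are hypothetical and only serve the illustration. It runs from_json over one record with the unquoted quantity key and one with the key quoted. The quoted record parses normally, while the unquoted one yields null for quantity (and, depending on the Spark version, for the other columns as well).

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{IntegerType, StringType, StructType}

// Hypothetical demo object; not part of the original question's code.
object FromJsonNullDemo {
  // Reduced schema: two of the seven fields are enough to show the effect.
  val schema: StructType = new StructType()
    .add("stock", StringType)
    .add("quantity", IntegerType)

  // Parses each input string with from_json and keeps the raw string
  // next to the parsed columns so failures are easy to spot.
  def parse(spark: SparkSession, lines: Seq[String]): DataFrame = {
    import spark.implicits._
    lines.toDF("json")
      .select($"json", from_json($"json", schema).as("data"))
      .select("json", "data.*")
  }

  def main(args: Array[String]): Unit = {
    // Local session for the demo only; the app name is arbitrary.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("from-json-null-demo")
      .getOrCreate()

    val input = Seq(
      """{"stock":"SEE","quantity":27}""", // key quoted: valid JSON
      """{"stock":"SEE",quantity:27}"""    // key unquoted, as in the question
    )

    parse(spark, input).show(false)
    spark.stop()
  }
}
```

Keeping the raw json column beside the parsed columns is a design choice for debugging: a row whose parsed columns are null while its raw string is not empty points at malformed input rather than at missing data.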

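The problem is in your Kafka data: the quantity key must be enclosed in quotes. See the corrected records below:

{"stock":"SEE","buy":12,"sell":15,"profit":3,"quantity":27,"loss":0,"gender":"M"}
{"stock":"SEE","buy":12,"sell":15,"profit":3,"quantity":27,"loss":0,"gender":"M"}
{"stock":"SEE","buy":12,"sell":15,"profit":3,"quantity":27,"loss":0,"gender":"M"}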
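If the producer cannot be changed, one possible workaround (not part of the answers above) is to relax the parser instead. The sketch below assumes the three-argument from_json overload, which accepts the same options as Spark's JSON data source, and uses the allowUnquotedFieldNames option; the object name LenientFromJson, the helper parse, and the reduced schema are hypothetical. Fixing the data at the source remains the cleaner solution, since unquoted keys are not valid JSON.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{IntegerType, StringType, StructType}

// Hypothetical demo object; not part of the original question's code.
object LenientFromJson {
  // Reduced schema, for illustration only.
  val schema: StructType = new StructType()
    .add("stock", StringType)
    .add("quantity", IntegerType)

  // JSON data source option that lets the parser accept unquoted keys.
  val options: Map[String, String] = Map("allowUnquotedFieldNames" -> "true")

  // Parses each input string with the relaxed options.
  def parse(spark: SparkSession, lines: Seq[String]): DataFrame = {
    import spark.implicits._
    lines.toDF("json")
      .select(from_json($"json", schema, options).as("data"))
      .select("data.*")
  }

  def main(args: Array[String]): Unit = {
    // Local session for the demo only; the app name is arbitrary.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("lenient-from-json")
      .getOrCreate()

    // Record with the unquoted quantity key, as in the question.
    parse(spark, Seq("""{"stock":"SEE",quantity:27}""")).show(false)
    spark.stop()
  }
}
```

In the streaming job from the question, the same options map would be passed as the third argument of the existing from_json call.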