Scala: How to persist the output of the window() function to JDBC using a Spark SQL DataFrame?


When the following code snippet is executed:

...
stream
      .map(_.value())
      .flatMap(MyParser.parse(_))
      .foreachRDD(rdd => {
        val spark = SparkSession.builder.config(rdd.sparkContext.getConf).getOrCreate()
        import spark.implicits._

        val dataFrame = rdd.toDF();
        val countsDf = dataFrame.groupBy($"action", window($"time", "1 hour")).count()
        val query = countsDf.write.mode("append").jdbc(url, "stats_table", prop)
      })
....
this error occurs:

java.lang.IllegalArgumentException: Can't get JDBC type for struct


How can I save the output of the
org.apache.spark.sql.functions.window() function to a MySQL database?
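One common workaround (a sketch, not taken from the original post): the JDBC writer has no mapping for the struct<start,end> column that window() produces, so you can project the struct's fields into plain timestamp columns and drop the struct before writing. The column names window_start and window_end below are illustrative; url, prop, and dataFrame are assumed to exist as in the question.

```scala
import org.apache.spark.sql.functions.{col, window}

// Sketch: flatten the window struct into columns JDBC can map to SQL types.
val countsDf = dataFrame
  .groupBy(col("action"), window(col("time"), "1 hour"))
  .count()
  // window() yields struct<start:timestamp,end:timestamp>; JDBC cannot store
  // a struct, so pull the two fields out and drop the struct column.
  .withColumn("window_start", col("window.start"))
  .withColumn("window_end", col("window.end"))
  .drop("window")

countsDf.write.mode("append").jdbc(url, "stats_table", prop)
```

Both new columns are plain TIMESTAMPs, so the append to MySQL should succeed without the struct-type error.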

I ran into the same problem when using Spark SQL (Structured Streaming):

val query3 = dataFrame
  .groupBy(org.apache.spark.sql.functions.window($"timeStamp", "10 minutes"), $"data")
  .count()
  .writeStream
  .outputMode(OutputMode.Complete())
  .options(prop)
  .option("checkpointLocation", "file:///tmp/spark-checkpoint1")
  .option("table", "temp")
  .format("com.here.olympus.jdbc.sink.OlympusDBSinkProvider")
  .start
I solved this by adding a user-defined function.

In my case converting the window to a string works, but you can change the function to whatever you need; you could even have two functions that return start and end separately.
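A sketch of such a UDF (the name toString matches the query below; the original answer does not show its body, so this mkString-based version is an assumption). Define it as a local val, e.g. inside a method or in the spark-shell, so it simply shadows Any.toString rather than conflicting with it:

```scala
import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
import org.apache.spark.sql.functions.udf

// Assumption: join the window's start and end timestamps into one string,
// which JDBC can store as a VARCHAR column. Adjust the separator/format as needed.
val toString = udf { (window: GenericRowWithSchema) => window.mkString("-") }
```

A variant with two UDFs returning the start and end values separately, as the text suggests, would let you keep them as distinct columns instead of a single concatenated string.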

My query changed to:

val query3 = dataFrame
  .groupBy(org.apache.spark.sql.functions.window($"timeStamp", "10 minutes"), $"data")
  .count()
  .withColumn("window", toString($"window"))
  .writeStream
  .outputMode(OutputMode.Complete())
  .options(prop)
  .option("checkpointLocation", "file:///tmp/spark-checkpoint1")
  .option("table", "temp")
  .format("com.here.olympus.jdbc.sink.OlympusDBSinkProvider")
  .start