Apache Spark: Spark Cassandra sqlContext and a Unix epoch timestamp column


I have a Cassandra table with a Unix epoch timestamp column (values such as 1599613045). I want to use Spark's sqlContext to select rows from this table between a from-date and a to-date based on this Unix epoch timestamp column. My plan is to convert the from-date and to-date inputs to epoch timestamps and compare them against the column with >= and <=.

Let's assume Cassandra is running on localhost:9042:

keyspace --> mykeyspace

table --> mytable

column name --> timestamp

Spark Scala code:
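The code below assumes the spark-cassandra-connector is already on the classpath. One common way to do that (the exact version shown is an assumption; match it to your Spark and Scala builds) is to launch with --packages com.datastax.spark:spark-cassandra-connector_2.12:3.4.1.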

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// create SparkSession
val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Read the table from Cassandra; the spark-cassandra-connector must be on the classpath
spark.conf.set("spark.cassandra.connection.host", "localhost")
spark.conf.set("spark.cassandra.connection.port", "9042")

var cassandraDF = spark.read.format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "mykeyspace", "table" -> "mytable"))
  .load()

// select the timestamp column
cassandraDF = cassandraDF.select('timestamp)
cassandraDF.show(false)

// Let's assume the following output:

+----------+
| timestamp|
+----------+
|1576089000|
|1575916200|
|1590258600|
|1591900200|
+----------+

// Convert the epoch timestamps to Spark's default date format, yyyy-MM-dd
val outDF = cassandraDF.withColumn("date", to_date(from_unixtime('timestamp)))
outDF.show(false)

+----------+----------+
| timestamp|      date|
+----------+----------+
|1576089000|2019-12-12|
|1575916200|2019-12-10|
|1590258600|2020-05-24|
|1591900200|2020-06-12|
+----------+----------+
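On Spark 3.1 and later, timestamp_seconds is an alternative to from_unixtime (a minor variant; the resulting date column is the same):

// Spark 3.1+: timestamp_seconds turns epoch seconds into a TimestampType,
// which to_date then truncates to a DateType
val outDF2 = cassandraDF.withColumn("date", to_date(timestamp_seconds('timestamp)))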

// You can proceed with the next steps from here, e.g. the date-range filter below
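For the original from-date/to-date requirement, here is a minimal sketch (the date values and the yyyy-MM-dd format are assumptions for illustration) that converts the inputs to epoch seconds with unix_timestamp and compares them against the raw column with >= and <=:

// Hypothetical from/to inputs in yyyy-MM-dd format
val fromDate = "2019-12-01"
val toDate   = "2020-05-31"

// Convert the bounds to epoch seconds (interpreted in the session time zone)
val fromEpoch = unix_timestamp(lit(fromDate), "yyyy-MM-dd")
val toEpoch   = unix_timestamp(lit(toDate), "yyyy-MM-dd")

// Keep only rows whose epoch timestamp falls inside the range
val filteredDF = outDF.filter('timestamp >= fromEpoch && 'timestamp <= toEpoch)
filteredDF.show(false)

With the sample data above, this keeps the rows dated 2019-12-10 through 2020-05-24; the exact boundary behavior depends on the session time zone, since both bounds are interpreted in it.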