Apache Spark: Spark Cassandra sqlContext and a unix epoch timestamp column
I have a Cassandra table with a unix epoch timestamp column (values like 1599613045). I want to use Spark's sqlContext to select rows from this table between a from-date and a to-date based on this epoch column. My plan is to convert the from-date and to-date inputs to epoch timestamps and compare them (>= and <=). Assume Cassandra is running on localhost:9042; keyspace --> mykeyspace; table --> mytable; columnName --> timestamp. Spark Scala code:
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
// create SparkSession
val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._
// Read the table from Cassandra; the spark-cassandra-connector must be on the classpath
spark.conf.set("spark.cassandra.connection.host", "localhost")
spark.conf.set("spark.cassandra.connection.port", "9042")
var cassandraDF = spark.read.format("org.apache.spark.sql.cassandra")
.options(Map("keyspace" -> "mykeyspace", "table" -> "mytable")).load()
// keep only the timestamp column
cassandraDF = cassandraDF.select('timestamp)
cassandraDF.show(false)
// let's consider following as the output
+----------+
| timestamp|
+----------+
|1576089000|
|1575916200|
|1590258600|
|1591900200|
+----------+
// Convert the epoch seconds to Spark's default date format (yyyy-MM-dd)
val outDF = cassandraDF.withColumn("date", to_date(from_unixtime('timestamp)))
outDF.show(false)
+----------+----------+
| timestamp| date|
+----------+----------+
|1576089000|2019-12-12|
|1575916200|2019-12-10|
|1590258600|2020-05-24|
|1591900200|2020-06-12|
+----------+----------+
// You can proceed with next steps from here
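The next step the question asks for, selecting rows between a from-date and a to-date, can be sketched by converting the two input dates to epoch seconds and comparing them against the raw timestamp column, exactly as planned above. This is a sketch, not the connector's prescribed method: the `toEpochSeconds` helper and the example dates are mine, and it assumes the stored timestamps are in UTC and that `cassandraDF` is the DataFrame loaded earlier.

```scala
import java.time.LocalDate
import java.time.ZoneOffset

// Convert a yyyy-MM-dd date string to epoch seconds at midnight UTC.
// Assumption: the stored epoch values are UTC; change the zone if not.
def toEpochSeconds(date: String): Long =
  LocalDate.parse(date).atStartOfDay(ZoneOffset.UTC).toEpochSecond

val fromEpoch = toEpochSeconds("2019-12-01") // 1575158400
val toEpoch   = toEpochSeconds("2020-06-01") // 1590969600

// Filter directly on the raw epoch column: >= for the inclusive lower
// bound, < for the exclusive upper bound
val rangeDF = cassandraDF.filter('timestamp >= fromEpoch && 'timestamp < toEpoch)
rangeDF.show(false)
```

Whether this filter is pushed down to Cassandra or evaluated in Spark after loading depends on the table's key layout; either way the result is the same rows.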