What is 'no viable alternative at input' for Spark SQL?

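I have a DF with a startTimeUnix column (of type Number in Mongo) that contains epoch timestamps. I want to query the DF on this column, but I want to pass an EST datetime. I went through several hoops to test the following in spark-shell: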

val df = Seq(("1", "1523937600000"), ("2", "1523941200000"),("3","1524024000000")).toDF("id", "unix")

df.filter($"unix" > java.time.ZonedDateTime.parse("04/17/2018 01:00:00", java.time.format.DateTimeFormatter.ofPattern ("MM/dd/yyyy HH:mm:ss").withZone ( java.time.ZoneId.of("America/New_York"))).toEpochSecond()*1000).collect()

Output:

= Array([3,1524024000000])

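Since the java.time functions work, I am passing the same expression to spark-submit, where the filter query used while retrieving the data from Mongo looks like this: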

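startTimeUnix < (java.time.ZonedDateTime.parse(${LT}, java.time.format.DateTimeFormatter.ofPattern('MM/dd/yyyyHHmmss').withZone(java.time.ZoneId.of('America/New_York'))).toEpochSecond()*1000) AND startTimeUnix > (java.time.ZonedDateTime.parse(${GT}, java.time.format.DateTimeFormatter.ofPattern('MM/dd/yyyyHHmmss').withZone(java.time.ZoneId.of('America/New_York'))).toEpochSecond()*1000)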

However, I keep getting the following error:

Caused by: org.apache.spark.sql.catalyst.parser.ParseException:
no viable alternative at input '(java.time.ZonedDateTime.parse(04/18/2018000000, java.time.format.DateTimeFormatter.ofPattern('MM/dd/yyyyHHmmss').withZone('(line 1, pos 138)

== SQL ==
startTimeUnix < (java.time.ZonedDateTime.parse(04/18/2018000000, java.time.format.DateTimeFormatter.ofPattern('MM/dd/yyyyHHmmss').withZone(java.time.ZoneId.of('America/New_York'))).toEpochSecond()*1000).toString() AND startTimeUnix > (java.time.ZonedDateTime.parse(04/17/2018000000, java.time.format.DateTimeFormatter.ofPattern('MM/dd/yyyyHHmmss').withZone(java.time.ZoneId.of('America/New_York'))).toEpochSecond()*1000).toString()

    at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:217)
    at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:114)
    at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
    at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parseExpression(ParseDriver.scala:43)
    at org.apache.spark.sql.Dataset.filter(Dataset.scala:1315)

I read somewhere that this error means a data type mismatch. I tried applying toString to the output of the date conversion, with no luck.

You can use the Spark DataFrame functions:

scala> val df = Seq(("1", "1523937600000"), ("2", "1523941200000"),("3","1524024000000")).toDF("id", "unix")
df: org.apache.spark.sql.DataFrame = [id: string, unix: string]

scala> df.filter($"unix" > unix_timestamp()*1000).collect()
res5: Array[org.apache.spark.sql.Row] = Array([3,1524024000000])

scala> df.withColumn("unixinEST"
                        ,from_utc_timestamp(
                            from_unixtime(unix_timestamp()),
                             "EST"))
         .show()
+---+-------------+-------------------+
| id|         unix|          unixinEST|
+---+-------------+-------------------+
|  1|1523937600000|2018-04-18 06:13:19|
|  2|1523941200000|2018-04-18 06:13:19|
|  3|1524024000000|2018-04-18 06:13:19|
+---+-------------+-------------------+
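To check by hand which wall-clock time in New York a stored value corresponds to, the same conversion can be done outside Spark with plain java.time. This is only a sketch: the class name EpochToNewYork and its format method are made up for illustration, and it uses the America/New_York zone from the question rather than the "EST" id used in the snippet above.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Spark: renders epoch milliseconds
// as New York wall-clock time.
final class EpochToNewYork {
    private static final ZoneId NEW_YORK = ZoneId.of("America/New_York");
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("MM/dd/yyyy HH:mm:ss");

    public static String format(long epochMillis) {
        // Instant is zone-free; atZone applies the New York rules (including DST).
        return Instant.ofEpochMilli(epochMillis).atZone(NEW_YORK).format(FMT);
    }

    public static void main(String[] args) {
        // The three sample values from the DataFrame in the question.
        long[] samples = {1523937600000L, 1523941200000L, 1524024000000L};
        for (long millis : samples) {
            System.out.println(millis + " -> " + format(millis));
        }
    }
}
```

For the sample rows this gives 04/17/2018 00:00:00, 04/17/2018 01:00:00 and 04/18/2018 00:00:00, which makes it easier to see which rows a given datetime bound should keep.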

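I read that unix_timestamp() converts a date column value to Unix time. In my case, the DF already contains the date in Unix format, and it needs to be compared with the input values (EST datetimes) that I am passing in as $LT and $GT.

Your requirement was not clear from the question, but I updated the answer with what I understood. Let me know if that helps.

You can use your own Unix timestamp instead of generating it with the unix_timestamp() function as I did.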
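The stack trace in the question goes through Dataset.filter and parseExpression, so the filter string is parsed as a Spark SQL expression, and Java method calls written inside that string are never evaluated. One way to follow the last comment is therefore to run the java.time conversion in the driver code first and put only the resulting numbers into the filter. The sketch below assumes the bounds arrive as strings in the MM/dd/yyyy HH:mm:ss format used in the spark-shell test; the names EstFilterBuilder, toEpochMillis and buildFilter are hypothetical.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: turns New York datetimes into a plain numeric
// filter expression that the Spark SQL parser can handle.
final class EstFilterBuilder {
    // Same pattern and zone as the spark-shell test in the question.
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("MM/dd/yyyy HH:mm:ss")
                    .withZone(ZoneId.of("America/New_York"));

    // Evaluated in the JVM, before anything is handed to Spark.
    public static long toEpochMillis(String newYorkDateTime) {
        return ZonedDateTime.parse(newYorkDateTime, FMT).toInstant().toEpochMilli();
    }

    // Only column names, operators and numeric literals end up in the string.
    public static String buildFilter(String column, String gt, String lt) {
        return column + " > " + toEpochMillis(gt)
                + " AND " + column + " < " + toEpochMillis(lt);
    }

    public static void main(String[] args) {
        System.out.println(
                buildFilter("startTimeUnix", "04/17/2018 01:00:00", "04/18/2018 00:00:00"));
    }
}
```

The string returned by buildFilter contains no Java calls, so it can be passed to df.filter(...) in place of the original expression.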