Hadoop Hive: Exception when using LAG with a window function

hadoop hive

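I am trying to compute the time difference between two rows, and I applied the solution from another question. But I get an exception: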

> org.apache.hive.service.cli.HiveSQLException: Error while compiling
> statement: FAILED: SemanticException Failed to breakup Windowing
> invocations into Groups. At least 1 group must only depend on input
> columns. Also check for circular dependencies. Underlying error:
> Expecting left window frame boundary for function
> LAG((tok_table_or_col time), 1, 0) Window
> Spec=[PartitioningSpec=[partitionColumns=[(tok_table_or_col
> client_id)]orderColumns=[(tok_table_or_col time) ASC
> NULLS_FIRST]]window(type=ROWS, start=1 PRECEDING, end=currentRow)] as
> LAG_window_0 to be unbounded. Found : 1

HiveQL:

SELECT id, loc, LAG(time, 1, 0) OVER (PARTITION BY id, loc ORDER BY time ROWS 1 PRECEDING) - time AS response_time FROM mytable

How can I solve this? What is the problem?

Edit:

Sample data:

id  loc time
0   1   1414250523591
0   1   1414250523655
1   2   1414250523655
1   2   1414250523661
1   3   1414250523661
1   3   1414250523662

What I want is the time difference between the rows that share the same id and loc (there are always exactly 2 such rows).
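To make the desired result concrete, here is a minimal sketch that relies on the assumption stated above, namely that every (id, loc) group contains exactly two rows. Under that assumption the difference is simply the larger time minus the smaller one, so no window function is needed. The column is quoted with backticks in case `time` is treated as a reserved keyword in a given Hive version.

```sql
-- Sketch only: valid when each (id, loc) group has exactly two rows.
-- With two rows per group, MAX - MIN equals the difference between them.
SELECT id,
       loc,
       MAX(`time`) - MIN(`time`) AS response_time
FROM mytable
GROUP BY id, loc;
```

For the sample data this corresponds to 1414250523655 - 1414250523591 = 64 for (0, 1), 1414250523661 - 1414250523655 = 6 for (1, 2), and 1414250523662 - 1414250523661 = 1 for (1, 3).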

Edit 2: I should also mention that I am new to the hadoop/hive ecosystem.

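As the error says, LAG expects the window frame to be unbounded, so I removed the ROWS clause. Now the query at least does something, but the result is still wrong. So I wanted to check what the LAG value actually is: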

SELECT id, loc, LAG(time, 1) OVER (PARTITION BY id, loc ORDER BY time) AS lag_col FROM mytable

I get this as output:

id  loc lag_col
1   2   null
1   2   -1
1   3   null
1   3   -1

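The nulls make sense, since I removed the default value, but why -1? Could the large values in the time column cause some kind of overflow? The column is defined as bigint, so the values should fit without a problem, but perhaps they are converted to int somewhere during the query?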
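For reference, a minimal sketch of the difference query with the ROWS clause dropped, as the error message asks. It subtracts the previous row's time from the current one, the reverse of the original expression, so the result is positive. It also prints the raw `time` next to the LAG value, which may help check whether the column is read correctly. The sketch does not explain the -1 values above. The backticks around `time` are a precaution in case it is a reserved keyword in the Hive version in use.

```sql
-- LAG is given no explicit frame (no ROWS clause); it looks one row back
-- within each (id, loc) partition, ordered by time.
SELECT id,
       loc,
       `time`,
       LAG(`time`, 1) OVER (PARTITION BY id, loc ORDER BY `time`) AS prev_time,
       `time` - LAG(`time`, 1) OVER (PARTITION BY id, loc ORDER BY `time`) AS response_time
FROM mytable;
```

The first row of each partition has no predecessor, so prev_time and response_time are NULL there. If the values are read as bigint, the second row of each pair in the sample data should show 64, 6, and 1 respectively.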

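It has syntax errors at loc and LAG.

Please show us sample data and the expected result. Thanks.

That was actually a mistake introduced when obfuscating the query.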