Oracle: find the sum of the current month's and previous month's values


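I have a source table that holds employee account details for each month, with the date stored as a string (yyyyMMdd). I am trying to find, for each account, the sum for the current month and the sum for the previous month.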

Source data:

+-----------+-------------+-----------+----------+
|  date     | account     | division  |  amount  |
+-----------+-------------+-----------+----------+
| 20190331  | 123         | AB0       | 100      |
+-----------+-------------+-----------+----------+
| 20190331  | 123         | AB1       | 110      |
+-----------+-------------+-----------+----------+
| 20190331  | 123         | AB2       | 120      |
+-----------+-------------+-----------+----------+
| 20190228  | 123         | AB4       | 100      |
+-----------+-------------+-----------+----------+
| 20190228  | 123         | AB1       | 100      |
+-----------+-------------+-----------+----------+
| 20190228  | 123         | AB2       | 100      |
+-----------+-------------+-----------+----------+
| 20190131  | 123         | AB0       | 100      |
+-----------+-------------+-----------+----------+
I ran the query below in Impala, but it returns the same result for the current and the previous month:

select distinct * from (
SELECT 
sum(amount) over (partition BY account, a.date) AS asset_current,
sum(amount) over (partition BY account, from_unixtime(unix_timestamp(to_date(LAST_DAY(ADD_MONTHS(to_timestamp(data_as_of_date,'yyyyMMdd'),-1))),'yyyy-MM-dd'),'yyyyMMdd')) AS asset_previous,
     account,
     date
FROM employee_assets a
)x ;
Expected output:

+-----------+-------------+--------------------+----------------------+
|  date     | account     | current_month_sum  |  previous_month_sum  |
+-----------+-------------+--------------------+----------------------+
| 20190331  | 123         | 330                | 300                  |
+-----------+-------------+--------------------+----------------------+
| 20190228  | 123         | 300                | 100                  |
+-----------+-------------+--------------------+----------------------+
| 20190131  | 123         | 100                | 0                    |
+-----------+-------------+--------------------+----------------------+

I have used the query below, but if the previous month's data is not available, it returns the last available month's asset_

SELECT
    x.*,
    LAG(current_month_sum, 1, 0) OVER(PARTITION BY account ORDER BY adate) previous_month_sum  
FROM (
    SELECT adate, account, SUM(amount) current_month_sum  
    FROM employee_assets
    GROUP BY adate, account
) x
ORDER BY adate DESC
For example: we have no input data for account 123 on 20181231, so asset_prev for January should be 0, but the query returns 500 (the amount for November 2018).

Input data:

+-----------+-------------+-----------+----------+
|  date     | account     | division  |  amount  |
+-----------+-------------+-----------+----------+
| 20190331  | 123         | AB0       | 100      |
+-----------+-------------+-----------+----------+
| 20190331  | 123         | AB1       | 110      |
+-----------+-------------+-----------+----------+
| 20190331  | 123         | AB2       | 120      |
+-----------+-------------+-----------+----------+
| 20190228  | 123         | AB4       | 100      |
+-----------+-------------+-----------+----------+
| 20190228  | 123         | AB1       | 100      |
+-----------+-------------+-----------+----------+
| 20190228  | 123         | AB2       | 100      |
+-----------+-------------+-----------+----------+
| 20190131  | 123         | AB0       | 100      |
+-----------+-------------+-----------+----------+
| 20181130  | 123         | ABX       | 500      |
+-----------+-------------+-----------+----------+
The query returns:

+-----------+-------------+--------------------+----------------------+
|  date     | account     | current_month_sum  |  previous_month_sum  |
+-----------+-------------+--------------------+----------------------+
| 20190331  | 123         | 330                | 300                  |
+-----------+-------------+--------------------+----------------------+
| 20190228  | 123         | 300                | 100                  |
+-----------+-------------+--------------------+----------------------+
| 20190131  | 123         | 100                | 500                  |
+-----------+-------------+--------------------+----------------------+
| 20181130  | 123         | 500                | 0                    |
+-----------+-------------+--------------------+----------------------+
Expected output:

+-----------+-------------+--------------------+----------------------+
|  date     | account     | current_month_sum  |  previous_month_sum  |
+-----------+-------------+--------------------+----------------------+
| 20190331  | 123         | 330                | 300                  |
+-----------+-------------+--------------------+----------------------+
| 20190228  | 123         | 300                | 100                  |
+-----------+-------------+--------------------+----------------------+
| 20190131  | 123         | 100                | 0                    |
+-----------+-------------+--------------------+----------------------+
| 20181130  | 123         | 500                | 0                    |
+-----------+-------------+--------------------+----------------------+

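You can aggregate in an inner query and, in the outer query, use LAG() over the account partition to get the previous month's value. The three-argument form of LAG() lets you specify a default value: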

SELECT
    x.*,
    LAG(current_month_sum, 1, 0) OVER(PARTITION BY account ORDER BY adate) previous_month_sum  
FROM (
    SELECT adate, account, SUM(amount) current_month_sum  
    FROM employee_assets
    GROUP BY adate, account
) x
ORDER BY adate DESC
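
To make the effect of the third argument concrete, here is a small sketch against the same aggregated subquery. The aliases prev_or_null and prev_or_zero are illustrative names, not part of the original answer. With the offset alone, LAG() yields NULL for the first row of each account partition; the three-argument form substitutes the given default instead.

```sql
-- Sketch only: compares LAG() without and with a default value.
-- prev_or_null / prev_or_zero are hypothetical aliases for illustration.
SELECT
    x.adate,
    x.account,
    x.current_month_sum,
    -- no default: the earliest row per account gets NULL
    LAG(x.current_month_sum, 1) OVER (PARTITION BY x.account ORDER BY x.adate) AS prev_or_null,
    -- default 0: the earliest row per account gets 0
    LAG(x.current_month_sum, 1, 0) OVER (PARTITION BY x.account ORDER BY x.adate) AS prev_or_zero
FROM (
    SELECT adate, account, SUM(amount) AS current_month_sum
    FROM employee_assets
    GROUP BY adate, account
) x
ORDER BY x.adate DESC
```

In both forms LAG() looks at the previous row in the partition, not the previous calendar month, so a missing month is silently skipped.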

Note: date is not a good choice for a column name, since it may conflict with a reserved word. I renamed that column to adate in the query.

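Thanks a lot for the solution. It returned the correct result for the actual data.

Welcome @asarf! Glad to help.

The query works if the account is available in every month. I have a scenario where account '123' is missing on 20190531 but present on 20190630 and 20190430. For month '20190630' it returns the '20190430' data as the previous month's value. In this case it should return 0, because account 123 does not exist in May. Please help update this query.

What is the reason for from_unixtime(unix_timestamp(to_date(LAST_DAY(ADD_MONTHS(to_timestamp(data_as_of_date,'yyyyMMdd'),-1))),'yyyy-MM-dd'),'yyyyMMdd'), and why mess with dates and unix timestamps? What is the data type of data_as_of_date?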
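
One possible way to handle the missing-month case raised in the comments, offered as a sketch rather than a tested solution: turn each date into a running month number and accept the LAG() value only when the previous row is exactly one month earlier. It assumes adate is a yyyyMMdd string and that each account has at most one adate per month (month-end snapshots, as in the sample data). month_idx is a hypothetical helper column introduced here. SUBSTR and CAST(... AS INT) are used because they should be available in Oracle, Impala, and Hive, but the query has not been verified on each engine.

```sql
-- Sketch only: return 0 when the preceding row is not the immediately previous month.
-- Assumes adate is a 'yyyyMMdd' string with at most one date per account per month.
SELECT
    x.adate,
    x.account,
    x.current_month_sum,
    CASE
        -- previous row is exactly one calendar month earlier: use its sum
        WHEN x.month_idx
             - LAG(x.month_idx) OVER (PARTITION BY x.account ORDER BY x.month_idx) = 1
        THEN LAG(x.current_month_sum) OVER (PARTITION BY x.account ORDER BY x.month_idx)
        -- no previous row (NULL comparison) or a gap of more than one month
        ELSE 0
    END AS previous_month_sum
FROM (
    SELECT
        adate,
        account,
        -- month_idx (hypothetical helper): year * 12 + month, so consecutive months differ by 1
        CAST(SUBSTR(adate, 1, 4) AS INT) * 12 + CAST(SUBSTR(adate, 5, 2) AS INT) AS month_idx,
        SUM(amount) AS current_month_sum
    FROM employee_assets
    GROUP BY adate, account
) x
ORDER BY x.adate DESC
```

With the input data from the question, 20190131 and 20181130 are two months apart, so the January row should get 0 instead of 500. The same check would apply to the 20190630 row when 20190531 is missing.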