Oracle: find the sum of current-month and previous-month values
Tags: oracle, hive, impala
I have a source table containing employee account details for each month, with the date stored as a string (yyyyMMdd). For each account, I am trying to find the sum of the current month's values and the sum of the previous month's values.
Source data:
+-----------+-------------+-----------+----------+
| date | account | division | amount |
+-----------+-------------+-----------+----------+
| 20190331 | 123 | AB0 | 100 |
+-----------+-------------+-----------+----------+
| 20190331 | 123 | AB1 | 110 |
+-----------+-------------+-----------+----------+
| 20190331 | 123 | AB2 | 120 |
+-----------+-------------+-----------+----------+
| 20190228 | 123 | AB4 | 100 |
+-----------+-------------+-----------+----------+
| 20190228 | 123 | AB1 | 100 |
+-----------+-------------+-----------+----------+
| 20190228 | 123 | AB2 | 100 |
+-----------+-------------+-----------+----------+
| 20190131 | 123 | AB0 | 100 |
+-----------+-------------+-----------+----------+
I ran the query below in Impala, but it returns the same result for both the current and the previous month:
select distinct * from (
SELECT
sum(amount) over (partition BY account, a.date) AS asset_current,
sum(amount) over (partition BY account, from_unixtime(unix_timestamp(to_date(LAST_DAY(ADD_MONTHS(to_timestamp(data_as_of_date,'yyyyMMdd'),-1))),'yyyy-MM-dd'),'yyyyMMdd')) AS asset_previous,
account,
date
FROM employee_assets a
)x ;
Expected output:
+-----------+-------------+--------------------+----------------------+
| date | account | current_month_sum | previous_month_sum |
+-----------+-------------+--------------------+----------------------+
| 20190331 | 123 | 330 | 300 |
+-----------+-------------+--------------------+----------------------+
| 20190228 | 123 | 300 | 100 |
+-----------+-------------+--------------------+----------------------+
| 20190131 | 123 | 100 | 0 |
+-----------+-------------+--------------------+----------------------+
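I have used the query below, but when there is no data for the previous month, it returns the amount from the most recent earlier month that has data instead of 0: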
SELECT
x.*,
LAG(current_month_sum, 1, 0) OVER(PARTITION BY account ORDER BY adate) previous_month_sum
FROM (
SELECT adate, account, SUM(amount) current_month_sum
FROM employee_assets
GROUP BY adate, account
) x
ORDER BY adate DESC
For example, there is no input data for account 123 on 20181231, so asset_prev for January should be 0, but the query returns 500 (the November 2018 amount).
Input data:
+-----------+-------------+-----------+----------+
| date | account | division | amount |
+-----------+-------------+-----------+----------+
| 20190331 | 123 | AB0 | 100 |
+-----------+-------------+-----------+----------+
| 20190331 | 123 | AB1 | 110 |
+-----------+-------------+-----------+----------+
| 20190331 | 123 | AB2 | 120 |
+-----------+-------------+-----------+----------+
| 20190228 | 123 | AB4 | 100 |
+-----------+-------------+-----------+----------+
| 20190228 | 123 | AB1 | 100 |
+-----------+-------------+-----------+----------+
| 20190228 | 123 | AB2 | 100 |
+-----------+-------------+-----------+----------+
| 20190131 | 123 | AB0 | 100 |
+-----------+-------------+-----------+----------+
| 20181130 | 123 | ABX | 500 |
+-----------+-------------+-----------+----------+
The query returns:
+-----------+-------------+--------------------+----------------------+
| date | account | current_month_sum | previous_month_sum |
+-----------+-------------+--------------------+----------------------+
| 20190331 | 123 | 330 | 300 |
+-----------+-------------+--------------------+----------------------+
| 20190228 | 123 | 300 | 100 |
+-----------+-------------+--------------------+----------------------+
| 20190131 | 123 | 100 | 500 |
+-----------+-------------+--------------------+----------------------+
| 20181130 | 123 | 500 | 0 |
+-----------+-------------+--------------------+----------------------+
Expected output:
+-----------+-------------+--------------------+----------------------+
| date | account | current_month_sum | previous_month_sum |
+-----------+-------------+--------------------+----------------------+
| 20190331 | 123 | 330 | 300 |
+-----------+-------------+--------------------+----------------------+
| 20190228 | 123 | 300 | 100 |
+-----------+-------------+--------------------+----------------------+
| 20190131 | 123 | 100 | 0 |
+-----------+-------------+--------------------+----------------------+
| 20181130 | 123 | 500 | 0 |
+-----------+-------------+--------------------+----------------------+
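You can aggregate in an inner query and then use LAG() in the outer query, partitioned by account, to fetch the previous month's value. The three-argument form of LAG() lets you specify a default value: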
SELECT
x.*,
LAG(current_month_sum, 1, 0) OVER(PARTITION BY account ORDER BY adate) previous_month_sum
FROM (
SELECT adate, account, SUM(amount) current_month_sum
FROM employee_assets
GROUP BY adate, account
) x
ORDER BY adate DESC
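Note: date is not a good choice for a column name, because it may conflict with a reserved word. I renamed the column to adate in the query.

Comments:

Thank you very much for the solution. It returned the correct result on the actual data.

Welcome @asarf! Glad to help.

The query works when the account is present in every month. I have a scenario where account "123" is missing on 20190531 but present on 20190630 and 20190430. For the "20190630" month, the query returns the "20190430" data as the previous month's value. In this case it should return "0", because account 123 does not exist in May. Please help update this query.

What is the reason for from_unixtime(unix_timestamp(to_date(LAST_DAY(ADD_MONTHS(to_timestamp(data_as_of_date,'yyyyMMdd'),-1))),'yyyy-MM-dd'),'yyyyMMdd')? Why mess around with dates and unix timestamps? What is the data type of data_as_of_date?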
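For the gap scenario raised in the comments, one possible approach (a sketch, not part of the original answer) is to stop relying on LAG(), which looks at the previous row rather than the previous calendar month. Instead, the monthly totals can be left-joined to themselves on a month index computed as year * 12 + month, so a missing month simply finds no match and falls back to 0. The example below runs the SQL through Python's sqlite3 module only to keep it self-contained. The names monthly, month_idx, cur, prev and monthly_sums are illustrative. It assumes one snapshot date per account per month, and that the SUBSTR/CAST expressions carry over to Impala unchanged, which has not been verified here.

```python
import sqlite3

# Sample rows from the question: (adate, account, division, amount).
ROWS = [
    ("20190331", 123, "AB0", 100),
    ("20190331", 123, "AB1", 110),
    ("20190331", 123, "AB2", 120),
    ("20190228", 123, "AB4", 100),
    ("20190228", 123, "AB1", 100),
    ("20190228", 123, "AB2", 100),
    ("20190131", 123, "AB0", 100),
    ("20181130", 123, "ABX", 500),
]

# month_idx = year * 12 + month, so consecutive calendar months differ by
# exactly 1, including across a year boundary (Dec 2018 -> Jan 2019).
QUERY = """
WITH monthly AS (
    SELECT adate,
           account,
           SUM(amount) AS current_month_sum,
           CAST(SUBSTR(adate, 1, 4) AS INTEGER) * 12
             + CAST(SUBSTR(adate, 5, 2) AS INTEGER) AS month_idx
    FROM employee_assets
    GROUP BY adate, account
)
SELECT cur.adate,
       cur.account,
       cur.current_month_sum,
       COALESCE(prev.current_month_sum, 0) AS previous_month_sum
FROM monthly cur
LEFT JOIN monthly prev
       ON prev.account = cur.account
      AND prev.month_idx = cur.month_idx - 1
ORDER BY cur.adate DESC
"""


def monthly_sums(rows):
    """Load rows into an in-memory table and return the query result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee_assets "
        "(adate TEXT, account INTEGER, division TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO employee_assets VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in monthly_sums(ROWS):
        print(row)
```

With the question's sample data, the rows for 20190131 and 20181130 both get previous_month_sum = 0, because neither December 2018 nor October 2018 has any data for account 123. The join needs the month index on both sides, so the aggregation is done once in a common table expression and reused.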