SQL Redshift - window function - get previous-hour statistics for each row
I am trying to write a Redshift query over product orders. The table contains columns such as store_id, order_number, order_datetime, products_ordered, and order_time. The query I am trying to write selects from this table and, for each row, includes some statistics based on that store's orders in the preceding hour. Currently I can do something like this:
SELECT store_id, order_number, order_datetime, products_ordered, order_time,
(SELECT COUNT(*) FROM mtable WHERE store_id=o.store_id AND order_time BETWEEN (o.order_time - interval '1 hour') AND o.order_time) as prev_num_orders,
(SELECT AVG(products_ordered) FROM mtable WHERE store_id=o.store_id AND order_time BETWEEN (o.order_time - interval '1 hour') AND o.order_time) as prev_avg_orders
FROM mtable o;
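This query performs very poorly. One of the main reasons is probably that I have to look up the previous hour's orders twice to get the two different statistics. Is there a way to optimize this? I think there should be a window function for it, but I'm not sure.

I don't have data to test the performance, but this is what I did when I ran into a similar problem on Redshift: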
with cte as
(
SELECT store_id, order_number, order_datetime, products_ordered, order_time,
LAG (products_ordered,1) OVER (PARTITION BY store_id ORDER BY order_time) AS prev_products_ordered
from mtable
)
select store_id, order_number, order_datetime, products_ordered, order_time,
count(*) as prev_num_orders, avg(prev_products_ordered) as prev_avg_orders
from cte
group by store_id, order_number, order_datetime, products_ordered, order_time
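I can't think of a valid window frame for this case, because the range of values is the only common factor. Since Redshift performs very well on large data sets, I suggest the following solution: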
SELECT o.store_id,
o.order_number,
o.order_datetime,
o.products_ordered,
o.order_time,
COUNT(prev_orders.store_id) AS prev_num_orders,
AVG(prev_orders.products_ordered) AS prev_avg_orders
FROM mtable o
-- self-join: match each order with the same store's orders from the preceding hour
LEFT JOIN mtable prev_orders ON prev_orders.store_id = o.store_id
AND prev_orders.order_time BETWEEN (o.order_time - interval '1 hour') AND o.order_time
--AND o.order_number != prev_orders.order_number
-- columns are qualified with o. because both sides of the join expose the same names
GROUP BY o.store_id,
o.order_number,
o.order_datetime,
o.products_ordered,
o.order_time;
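Note that the prev_num_orders and prev_avg_orders statistics will also include the current order. To remove the current order from the statistics, uncomment the order_number comparison line in the SQL statement.

Sample data and desired results would be very useful.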
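
For reference, here is a sketch of the self-join variant with the current order excluded, i.e. with the order_number comparison active. It assumes that order_number uniquely identifies an order within a store, which the question does not state. The output columns are qualified with the o alias, since both sides of the self-join expose the same column names.

```sql
-- Sketch only: assumes order_number is unique per store in mtable.
SELECT o.store_id,
       o.order_number,
       o.order_datetime,
       o.products_ordered,
       o.order_time,
       -- COUNT over a column from the joined side ignores the NULLs produced
       -- by the LEFT JOIN, so orders with no match get 0 here
       COUNT(prev_orders.store_id)       AS prev_num_orders,
       -- AVG over no matching rows yields NULL
       AVG(prev_orders.products_ordered) AS prev_avg_orders
FROM mtable o
LEFT JOIN mtable prev_orders
       ON prev_orders.store_id = o.store_id
      AND prev_orders.order_time BETWEEN (o.order_time - interval '1 hour') AND o.order_time
      -- exclude the current order from its own statistics
      AND prev_orders.order_number != o.order_number
GROUP BY o.store_id,
         o.order_number,
         o.order_datetime,
         o.products_ordered,
         o.order_time;
```

Because the join is a LEFT JOIN, an order with no other order from the same store in the preceding hour still appears in the result, with prev_num_orders = 0 and prev_avg_orders = NULL. Since BETWEEN is inclusive on both ends, other orders from the same store with exactly the same order_time are counted as well.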