SQL: difference between consecutive rows
The table has 3 columns: order id, member id, order date. I need to pull the distribution of orders broken down by the number of days between 2 consecutive orders, per member id. What I have is:
SELECT
a1.member_id,
count(distinct a1.order_id) as num_orders,
a1.order_date,
DATEDIFF(DAY, a1.order_date, a2.order_date) as days_since_last_order
from orders as a1
inner join orders as a2
on a2.member_id = a1.member_id+1;
This doesn't quite get me there, because the output I need is:
You can use lag to get the date of the same customer's previous order:
select o.*,
datediff(
order_date,
lag(order_date) over(partition by member_id order by order_date, order_id)
) days_diff
from orders o
When two rows share the same date, the one with the smaller order id is considered first. Also note that I fixed the datediff syntax: in Hive, the function takes just two dates and no unit.
I just don't understand the logic you want for computing num_orders.
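Building on the lag query, the distribution asked for in the question could be sketched roughly as below. This is only a sketch: it assumes "distribution" means the number of orders for each gap in days, so num_orders here is an assumed definition rather than the asker's, and the alias gaps is a hypothetical name for the derived table.

```sql
-- Sketch: count orders per gap (in days) since the same member's previous order.
-- Assumes the orders table from the question: order_id, member_id, order_date.
SELECT
    days_diff,
    COUNT(*) AS num_orders          -- assumed meaning: number of orders with this gap
FROM (
    SELECT
        o.*,
        DATEDIFF(
            order_date,
            LAG(order_date) OVER (PARTITION BY member_id ORDER BY order_date, order_id)
        ) AS days_diff
    FROM orders o
) gaps                              -- hypothetical alias for the derived table
WHERE days_diff IS NOT NULL         -- a member's first order has no previous order
GROUP BY days_diff
ORDER BY days_diff;
```

A member's first order has no predecessor, so lag returns NULL for it and the outer WHERE drops that row before counting.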
Maybe something like this:
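SELECT
    a1.member_id,
    COUNT(DISTINCT a1.order_id) AS num_orders,
    a1.order_date,
    -- two-argument DATEDIFF (end date, start date), as in Hive
    DATEDIFF(a1.order_date, a2.order_date) AS days_since_last_order
FROM orders AS a1
INNER JOIN orders AS a2
    ON a2.member_id = a1.member_id
   AND a2.order_date < a1.order_date   -- a2 is an earlier order of the same member
WHERE NOT EXISTS (
    -- no order of the same member falls strictly between a2 and a1
    SELECT 1
    FROM orders AS intermediate_order
    WHERE intermediate_order.member_id = a1.member_id
      AND intermediate_order.order_date < a1.order_date
      AND intermediate_order.order_date > a2.order_date
)
GROUP BY a1.member_id, a1.order_date, a2.order_date;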
What is the logic for computing num_orders? I don't understand it.

Sorry, what is that logic? For example, for customer 22222, how do you get 2 for orders 1212 and 1215, and 1 for the following two orders?

It's an item-type order id. A person can order two of the same thing, say two coffees. The order id for coffee is the same, e.g. 138, but num_orders is 2. That part isn't mandatory, though; what I want to compute is the days-since-last-order column for each member.

Thanks for sharing. I wasn't familiar with the lag function, so I tried a self-join. Your query actually helped, and it works. I think I'll compute num_orders on top of this query/table.