Select the average of records between two dates in SQL (Netezza)
I have two tables. The first, Activations, has two columns: Line_ID and Activation_Date. The second, Speed, has the columns Line_ID, From_Date, To_Date, and Record (the Speed column in the sample further down).
Sample of the first table:
|Line_ID| Activation_Date|
|-------+----------------|
|123456 | 1-Jan |
|345678 | 2-Jan |
|987654 | 3-Jan |
...
Sample of the second table (gaps and islands):
|Line_ID|From_Date| To_Date |Speed|
|-------+---------+---------+-----|
|123456 |1-Jan |4-Jan |70 |
|123456 |4-Jan |7-Jan |51 |
|123456 |7-Jan |10-Jan |48 |
|123456 |10-Jan |15-Jan |40 |
|123456 |15-Jan |17-Jan |70 |
|123456 |17-Jan |19-Jan |54 |
|123456 |19-Jan |21-Jan |94 |
|123456 |21-Jan |28-Jan |91 |
|123456 |28-Jan |31-Jan |35 |
...
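I need to join the Activations table with the Speed (records) table so that four columns are added to the Activations data, but I am running into problems:
1st: the average speed recorded in the first 7 days starting from the activation date.
2nd: the average speed recorded in the second 7 days from the activation date.
3rd: the average speed recorded in the third 7 days from the activation date.
4th: the average speed recorded in the fourth 7 days from the activation date.
The result should look like this: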
|Line_ID| Activation_Date|AVG_SPEED_Week1|AVG_SPEED_Week2|AVG_SPEED_Week3|AVG_SPEED_Week4|
|-------+----------------+---------------+---------------+---------------+---------------|
|123456 | 1-Jan |60.5 |44 |72.6 |91 |
...
Explanation of the result:
AVG_SPEED_Week1: average of Speed in the 1st 7 days, starting Records.From_Date: 1-Jan, Records.To_Date: 7-Jan
AVG_SPEED_Week2: average of Speed in the 2nd 7 days, starting Records.From_Date: 8-Jan, Records.To_Date: 14-Jan
AVG_SPEED_Week3: average of Speed in the 3rd 7 days, starting Records.From_Date: 15-Jan, Records.To_Date: 21-Jan
AVG_SPEED_Week4: average of Speed in the 4th 7 days, starting Records.From_Date: 22-Jan, Records.To_Date: 28-Jan
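One possible reading of the expected output, offered as an assumption rather than the asker's stated rule: a record counts toward a week when its From_Date..To_Date range strictly overlaps that week's 7-day window, and each matching record counts once regardless of its length. The sketch below checks that reading against the sample data. It uses Python's sqlite3 module instead of Netezza, assumes the year 2021 (the sample gives none), and the helper `week_avg` is a hypothetical name introduced for illustration.

```python
import sqlite3

# Sample data from the question; the year 2021 is an assumption.
ACTIVATIONS = [(123456, "2021-01-01"), (345678, "2021-01-02"), (987654, "2021-01-03")]
SPEED = [
    (123456, "2021-01-01", "2021-01-04", 70),
    (123456, "2021-01-04", "2021-01-07", 51),
    (123456, "2021-01-07", "2021-01-10", 48),
    (123456, "2021-01-10", "2021-01-15", 40),
    (123456, "2021-01-15", "2021-01-17", 70),
    (123456, "2021-01-17", "2021-01-19", 54),
    (123456, "2021-01-19", "2021-01-21", 94),
    (123456, "2021-01-21", "2021-01-28", 91),
    (123456, "2021-01-28", "2021-01-31", 35),
]


def week_avg(k):
    """Build the SQL expression for the average speed in week k (0-based)."""
    start = f"date(a.activation_date, '+{7 * k} days')"
    end = f"date(a.activation_date, '+{7 * k + 6} days')"
    # Strict overlap: a record that only touches a window boundary is excluded.
    return f"AVG(CASE WHEN s.from_date < {end} AND s.to_date > {start} THEN s.speed END)"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activations (line_id INTEGER, activation_date TEXT)")
conn.execute("CREATE TABLE speed (line_id INTEGER, from_date TEXT, to_date TEXT, speed REAL)")
conn.executemany("INSERT INTO activations VALUES (?, ?)", ACTIVATIONS)
conn.executemany("INSERT INTO speed VALUES (?, ?, ?, ?)", SPEED)

columns = ", ".join(week_avg(k) for k in range(4))
query = f"""
SELECT a.line_id, a.activation_date, {columns}
FROM activations a
LEFT JOIN speed s ON s.line_id = a.line_id
GROUP BY a.line_id, a.activation_date
ORDER BY a.line_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For line 123456 this gives 60.5, 44, roughly 72.67, and 91, which agrees with the expected row above (72.6 there appears to be truncated). The other two lines have no Speed rows in the sample, so their averages come back as NULL.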
I can't test this, but how about:
SELECT a.Line_ID
      ,a.Activation_Date
      -- A record counts toward a week when its From_Date..To_Date range
      -- overlaps that week's 7-day window (date + integer adds days).
      ,AVG(CASE WHEN s.From_Date < a.Activation_Date + 6  AND s.To_Date > a.Activation_Date      THEN s.Speed END) AS AVG_SPEED_Week1
      ,AVG(CASE WHEN s.From_Date < a.Activation_Date + 13 AND s.To_Date > a.Activation_Date + 7  THEN s.Speed END) AS AVG_SPEED_Week2
      ,AVG(CASE WHEN s.From_Date < a.Activation_Date + 20 AND s.To_Date > a.Activation_Date + 14 THEN s.Speed END) AS AVG_SPEED_Week3
      ,AVG(CASE WHEN s.From_Date < a.Activation_Date + 27 AND s.To_Date > a.Activation_Date + 21 THEN s.Speed END) AS AVG_SPEED_Week4
FROM Activations a
JOIN Speed s
  ON a.Line_ID = s.Line_ID
GROUP BY a.Line_ID, a.Activation_Date
I'm assuming you don't need to compute the average speed dynamically for an arbitrary number of weeks, and that 4 weeks is enough.
It definitely needs testing.
I would expand the data and aggregate:
-- Expand each Speed record into one row per day it covers, then average per week.
with speed_days as (
      select s.*, s.from_date + n.idx * interval '1 day' as dte
      from speed s join
           _V_VECTOR_IDX n  -- supplies the integers used as day offsets
           on s.from_date + n.idx * interval '1 day' <= s.to_date  -- use < if to_date should be exclusive
     )
select a.line_id, a.activation_date,
       avg(case when d.dte between a.activation_date and a.activation_date + interval '6 day' then d.speed end) as avg_speed_week1,
       avg(case when d.dte between a.activation_date + interval '7 day' and a.activation_date + interval '13 day' then d.speed end) as avg_speed_week2,
       avg(case when d.dte between a.activation_date + interval '14 day' and a.activation_date + interval '20 day' then d.speed end) as avg_speed_week3,
       avg(case when d.dte between a.activation_date + interval '21 day' and a.activation_date + interval '27 day' then d.speed end) as avg_speed_week4
from activations a left join
     speed_days d
     on a.line_id = d.line_id
group by a.line_id, a.activation_date;
This assumes the time periods are shorter than about 1000 days.
Comments:
I don't understand the second table. Is the duration always one day?
@GordonLinoff I have edited the sample of the second table to be gaps and islands.
I applied the code mentioned above, but it did not give me the correct results. If a Line_ID's activation date is 2020-08-19, what are the start and end dates of the first average? And why is the DTE column 2 days after From_Date?
@AhmedAbdelkader . . . My understanding is that this should do what you want, but you may need to add or subtract a day to get the exact results.
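For readers without a Netezza instance, the expand-and-aggregate idea can be tried with Python's sqlite3 module. This is only a sketch under stated assumptions: the recursive CTE `n` is a stand-in for `_V_VECTOR_IDX`, the year 2021 is assumed because the sample has none, and To_Date is treated as exclusive (each record ends where the next begins in the sample).

```python
import sqlite3

# Sample rows for line 123456; the year 2021 is an assumption.
SPEED = [
    (123456, "2021-01-01", "2021-01-04", 70),
    (123456, "2021-01-04", "2021-01-07", 51),
    (123456, "2021-01-07", "2021-01-10", 48),
    (123456, "2021-01-10", "2021-01-15", 40),
    (123456, "2021-01-15", "2021-01-17", 70),
    (123456, "2021-01-17", "2021-01-19", 54),
    (123456, "2021-01-19", "2021-01-21", 94),
    (123456, "2021-01-21", "2021-01-28", 91),
    (123456, "2021-01-28", "2021-01-31", 35),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activations (line_id INTEGER, activation_date TEXT)")
conn.execute("CREATE TABLE speed (line_id INTEGER, from_date TEXT, to_date TEXT, speed REAL)")
conn.execute("INSERT INTO activations VALUES (123456, '2021-01-01')")
conn.executemany("INSERT INTO speed VALUES (?, ?, ?, ?)", SPEED)

query = """
WITH RECURSIVE n(idx) AS (          -- integers 0..1023, standing in for _V_VECTOR_IDX
    SELECT 0 UNION ALL SELECT idx + 1 FROM n WHERE idx < 1023
),
speed_days AS (                     -- one row per day covered by each record
    SELECT s.line_id, s.speed,
           date(s.from_date, '+' || n.idx || ' days') AS dte
    FROM speed s
    JOIN n ON date(s.from_date, '+' || n.idx || ' days') < s.to_date
)
SELECT a.line_id,
       AVG(CASE WHEN d.dte BETWEEN a.activation_date
                AND date(a.activation_date, '+6 days') THEN d.speed END),
       AVG(CASE WHEN d.dte BETWEEN date(a.activation_date, '+7 days')
                AND date(a.activation_date, '+13 days') THEN d.speed END),
       AVG(CASE WHEN d.dte BETWEEN date(a.activation_date, '+14 days')
                AND date(a.activation_date, '+20 days') THEN d.speed END),
       AVG(CASE WHEN d.dte BETWEEN date(a.activation_date, '+21 days')
                AND date(a.activation_date, '+27 days') THEN d.speed END)
FROM activations a
LEFT JOIN speed_days d ON d.line_id = a.line_id
GROUP BY a.line_id, a.activation_date
"""
daily = conn.execute(query).fetchone()
print(daily)
```

Under these assumptions each record is weighted by the number of days it covers, so the weekly figures come out at roughly 58.7, 42.3, 75.3, and 83.0 instead of the 60.5, 44, 72.6, and 91 shown in the question. That difference, together with the choice of inclusive or exclusive To_Date, may be part of why the results looked off to the asker.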