SQL: computing an average after taking the max
I have two tables. Table1:
| ID1 | ID2 | ID3 | ID4 |
|-----+-----+-----+-----|
| 200 | 125 | 300 | 201 |
| 206 | 128 | 650 | 261 |
| 230 | 543 | 989 | 403 |
and Table2:
| ID1 | ID2 | ID3 | ID4 | Date | Cost |
|-----+-----+-----+-----+--------+------|
| 200 | 125 | 300 | 201 | 1/1/19 | 0.32 |
| 200 | 125 | 300 | 201 | 1/1/19 | 0.33 |
| 200 | 125 | 300 | 201 | 1/1/19 | 0.34 |
| 200 | 125 | 300 | 201 | 1/2/13 | 0.00 |
| 200 | 125 | 300 | 201 | 9/5/05 | 0.01 |
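I am trying to join Table1 to Table2 while filtering the output so that, for each group of IDs, only the row with the greatest Date is shown, together with the average Cost for that maximum date. This is my current code: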
SELECT t1.ID1, t1.ID2, t1.ID3, t1.ID4, maxDate, avgCost
FROM Table1 t1
JOIN ( SELECT ID1, ID2, ID3, ID4, MAX(Date) as maxDate, AVG(Cost) as avgCost
FROM Table2 t2
GROUP BY ID1, ID2, ID3, ID4 ) t2
ON t2.ID1 = t1.ID1
AND t2.ID2 = t1.ID2
AND t2.ID3 = t1.ID3
AND t2.ID4 = t1.ID4
Given the sample data above, my result looks like this:
| ID1 | ID2 | ID3 | ID4 | MaxDate | AvgCost |
|-----+-----+-----+-----+-----------+---------|
| 200 | 125 | 300 | 201 | 1/1/19 | 0.20 |
when it should look like this:
| ID1 | ID2 | ID3 | ID4 | MaxDate | AvgCost |
|-----+-----+-----+-----+-----------+---------|
| 200 | 125 | 300 | 201 | 1/1/19 | 0.33 |
The average cost includes values whose Date is not the maximum. I assume this is because AVG(Cost) is evaluated before Table2 is filtered by MAX(Date). Here is what I have tried:
SELECT t1.ID1, t1.ID2, t1.ID3, t1.ID4, maxDate, avgCost
FROM Table1 t1
JOIN ( SELECT ID1, ID2, ID3, ID4, MAX(Date) as maxDate, AVG(Cost) as avgCost
FROM Table2 t2
GROUP BY ID1, ID2, ID3, ID4 ) t2
ON t2.ID1 = t1.ID1
AND t2.ID2 = t1.ID2
AND t2.ID3 = t1.ID3
AND t2.ID4 = t1.ID4
WHERE maxDate = (SELECT MAX(Date) from Table2);
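and

The first returns no rows, and the second raises an error, ORA-01427: single-row subquery returns more than one row. The other things I tried are essentially variations of the above, and I still do not get the expected result. I don't know how to make the AVG function run only over the rows where Date is at its maximum.

Try this. It may not be the best solution, but you can try it and tune it for better performance: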
WITH MAXDATE AS (
SELECT MAX(DATE) AS MAXDATE,
ID1,
ID2,
ID3,
ID4
FROM TABLE2
GROUP BY ID1,ID2,ID3,ID4
)
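-- Join each Table1 key to its Table2 rows, keeping only the rows on that key's latest date.
-- Note: DATE is a reserved word in Oracle, so the real column name would need quoting or renaming.
SELECT T1.ID1, T1.ID2, T1.ID3, T1.ID4, MAX(T2.DATE), AVG(T2.COST)
FROM TABLE1 T1 JOIN TABLE2 T2
ON T1.ID1 = T2.ID1
AND T1.ID2 = T2.ID2
AND T1.ID3 = T2.ID3
AND T1.ID4 = T2.ID4 JOIN MAXDATE T3
ON T1.ID1 = T3.ID1
AND T1.ID2 = T3.ID2
AND T1.ID3 = T3.ID3
AND T1.ID4 = T3.ID4
AND T2.DATE = T3.MAXDATE
GROUP BY T1.ID1,
T1.ID2,
T1.ID3,
T1.ID4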
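To check the idea behind this answer against the question's sample data, here is a minimal sketch that runs an equivalent query in an in-memory SQLite database from Python. It is not Oracle: the column name dt (instead of Date), the ISO date strings, and the month/day/year reading of the sample dates are all assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id1 INT, id2 INT, id3 INT, id4 INT);
    -- dt is a stand-in for the question's Date column (assumed name).
    CREATE TABLE table2 (id1 INT, id2 INT, id3 INT, id4 INT, dt TEXT, cost REAL);
    INSERT INTO table1 VALUES
        (200,125,300,201), (206,128,650,261), (230,543,989,403);
    -- Dates rewritten as ISO strings so that MAX() and '=' compare correctly.
    INSERT INTO table2 VALUES
        (200,125,300,201,'2019-01-01',0.32),
        (200,125,300,201,'2019-01-01',0.33),
        (200,125,300,201,'2019-01-01',0.34),
        (200,125,300,201,'2013-01-02',0.00),
        (200,125,300,201,'2005-09-05',0.01);
""")

query = """
    WITH maxdate AS (
        SELECT id1, id2, id3, id4, MAX(dt) AS max_dt
        FROM table2
        GROUP BY id1, id2, id3, id4
    )
    SELECT t1.id1, t1.id2, t1.id3, t1.id4, t3.max_dt, AVG(t2.cost) AS avg_cost
    FROM table1 t1
    JOIN table2 t2
      ON t1.id1 = t2.id1 AND t1.id2 = t2.id2
     AND t1.id3 = t2.id3 AND t1.id4 = t2.id4
    JOIN maxdate t3
      ON t1.id1 = t3.id1 AND t1.id2 = t3.id2
     AND t1.id3 = t3.id3 AND t1.id4 = t3.id4
     AND t2.dt = t3.max_dt          -- keep only rows on the latest date
    GROUP BY t1.id1, t1.id2, t1.id3, t1.id4, t3.max_dt
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # one row per Table1 key that has matching Table2 data
```

Only the three rows dated 2019-01-01 reach AVG, so the average is taken over 0.32, 0.33 and 0.34 rather than over all five rows.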
You seem to want:
select id1, id2, id3, id4, date, avg(cost)
from (select t2.*,
dense_rank() over (partition by id1, id2, id3, id4 order by date desc) as seqnum
from table2 t2
) t2
where seqnum = 1
group by id1, id2, id3, id4, date;
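dense_rank() enumerates the values in reverse order by date, with ties counted in a single rank. So the most recent date gets a value of 1. The where seqnum = 1 then selects only the rows with the most recent date.

table1 is needed only if you want to filter the results. You might as well do: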
select id1, id2, id3, id4, date, avg(cost)
from (select t2.*,
dense_rank() over (partition by id1, id2, id3, id4 order by date desc) as seqnum
from table2 t2
where (id1, id2, id3, id4) in (select id1, id2, id3, id4 from table1)
) t2
where seqnum = 1
group by id1, id2, id3, id4, date;
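To make the effect of seqnum = 1 visible, the following sketch runs the same dense_rank() pattern in an in-memory SQLite database from Python. It assumes a SQLite build with window function support (3.25 or later), uses dt in place of the Date column, and stores the sample dates as ISO strings read as month/day/year; none of that comes from the original Oracle setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table2 (id1 INT, id2 INT, id3 INT, id4 INT, dt TEXT, cost REAL)"
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?, ?, ?, ?, ?)",
    [
        (200, 125, 300, 201, "2019-01-01", 0.32),
        (200, 125, 300, 201, "2019-01-01", 0.33),
        (200, 125, 300, 201, "2019-01-01", 0.34),
        (200, 125, 300, 201, "2013-01-02", 0.00),
        (200, 125, 300, 201, "2005-09-05", 0.01),
    ],
)

# Step 1: show the rank each row receives within its ID group.
ranked = conn.execute("""
    SELECT dt, cost,
           DENSE_RANK() OVER (PARTITION BY id1, id2, id3, id4
                              ORDER BY dt DESC) AS seqnum
    FROM table2
    ORDER BY seqnum, cost
""").fetchall()
for r in ranked:
    print(r)  # the three 2019-01-01 rows share seqnum 1, older dates get 2 and 3

# Step 2: keep only seqnum = 1 and average what is left.
latest = conn.execute("""
    SELECT id1, id2, id3, id4, dt, AVG(cost)
    FROM (SELECT t2.*,
                 DENSE_RANK() OVER (PARTITION BY id1, id2, id3, id4
                                    ORDER BY dt DESC) AS seqnum
          FROM table2 t2) t2
    WHERE seqnum = 1
    GROUP BY id1, id2, id3, id4, dt
""").fetchall()
print(latest)
```

Because ties share a rank, every row on the latest date survives the filter, which is why dense_rank() (or rank()) is used here rather than row_number().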
Your definition of t2 is as follows:
( SELECT ID1, ID2, ID3, ID4, MAX(Date) as maxDate, AVG(Cost) as avgCost
FROM Table2 t2
GROUP BY ID1, ID2, ID3, ID4 ) t2
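Instead, to compute the average only over the most recent date, you should use a different aggregate function, the LAST function, as follows: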
( SELECT ID1, ID2, ID3, ID4, MAX(Date) as maxDate,
AVG(Cost) KEEP (DENSE_RANK LAST ORDER BY Date) as avgCost
FROM Table2 t2
GROUP BY ID1, ID2, ID3, ID4 ) t2
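As an illustration of the difference between the two aggregates, here is a plain-Python sketch of what each one computes per group on the question's sample rows. It only emulates the semantics and does not run Oracle; the variable names and the month/day/year reading of the sample dates are assumptions.

```python
from collections import defaultdict
from datetime import date

# (ID1, ID2, ID3, ID4), Date, Cost -- the Table2 sample rows.
rows = [
    ((200, 125, 300, 201), date(2019, 1, 1), 0.32),
    ((200, 125, 300, 201), date(2019, 1, 1), 0.33),
    ((200, 125, 300, 201), date(2019, 1, 1), 0.34),
    ((200, 125, 300, 201), date(2013, 1, 2), 0.00),
    ((200, 125, 300, 201), date(2005, 9, 5), 0.01),
]

# GROUP BY ID1, ID2, ID3, ID4
groups = defaultdict(list)
for key, dt, cost in rows:
    groups[key].append((dt, cost))

results = {}
for key, items in groups.items():
    # AVG(Cost): every row in the group contributes.
    plain_avg = sum(cost for _, cost in items) / len(items)
    # AVG(Cost) KEEP (DENSE_RANK LAST ORDER BY Date):
    # only rows whose Date ties for the latest date contribute.
    max_dt = max(dt for dt, _ in items)
    last_costs = [cost for dt, cost in items if dt == max_dt]
    keep_last_avg = sum(last_costs) / len(last_costs)
    results[key] = (max_dt, plain_avg, keep_last_avg)

for key, (max_dt, plain_avg, keep_last_avg) in results.items():
    print(key, max_dt, round(plain_avg, 2), round(keep_last_avg, 2))
```

The plain average comes out at 0.20 (all five rows), matching the unwanted result in the question, while the latest-date average is 0.33.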
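In your data sample there is no t1.ID4 value. Why?
@scaisEdge Sorry.
Based on the sample data in Table2 you cannot get two rows. Are you sure the data is the same and that there are no hidden characters causing 2 aggregated rows to be returned?
@scaisEdge OK, I simplified the comparison. The fourth ID is actually a string, and the comparison is SUBSTR(t1.ID4,1,5) = SUBSTR(t2.ID4,1,5). It is true that the Table2 row with a cost of 0.00 has 6 characters in ID4, while the rest have only 5. But shouldn't SUBSTR(t2.ID4,1,5) reconcile that?
You should get 0.2 as the AvgCost result.
Yes, that solved it. I looked up some information on dense_rank() and partitioning, and it seems to make sense. Can you explain what seqnum = 1 is doing?
-1 because it uses an analytic function that is not needed. The correct solution is to use the aggregate LAST function; the query becomes more efficient.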