SQL OVER() clause - when and why is it useful?

USE AdventureWorks2008R2;
GO
SELECT SalesOrderID, ProductID, OrderQty
    ,SUM(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Total'
    ,AVG(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Avg'
    ,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Count'
    ,MIN(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Min'
    ,MAX(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Max'
FROM Sales.SalesOrderDetail
WHERE SalesOrderID IN(43659,43664);

I read about that clause and I don't understand why I need it. What does the OVER function do? What does PARTITION BY do? Why can't I write the query with GROUP BY SalesOrderID?

The OVER clause is powerful in that you can have aggregates over different ranges ("windowing"), whether you use a GROUP BY or not.

Example: get the count per SalesOrderID and the count of all rows

SELECT
    SalesOrderID, ProductID, OrderQty
    ,COUNT(OrderQty) AS 'Count'
    ,COUNT(*) OVER () AS 'CountAll'
FROM Sales.SalesOrderDetail
WHERE
     SalesOrderID IN(43659,43664)
GROUP BY
     SalesOrderID, ProductID, OrderQty

Get different COUNTs, with no GROUP BY

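SELECT
    SalesOrderID, ProductID, OrderQty
    ,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'CountQtyPerOrder'
    ,COUNT(OrderQty) OVER(PARTITION BY ProductID) AS 'CountQtyPerProduct'
    ,COUNT(*) OVER () AS 'CountAllAgain'
FROM Sales.SalesOrderDetail
WHERE
     SalesOrderID IN(43659,43664)

The OVER clause, when combined with PARTITION BY, states that the preceding function call must be done analytically, by evaluating the rows returned by the query. Think of it as an inline GROUP BY statement.

OVER(PARTITION BY SalesOrderID) states that the SUM, AVG, etc. function returns its value over a subset of the records returned by the query, and that subset is partitioned by the foreign key SalesOrderID.

So we will SUM every OrderQty record for each unique SalesOrderID, and that column will be named 'Total'.

This is a much more efficient means than using multiple inline views to find the same information. You can put this query within an inline view and then filter on Total: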
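
SELECT ...
FROM (your query) inlineview
WHERE Total < 200

If you only wanted to GROUP BY SalesOrderID, you would not be able to include the ProductID and OrderQty columns in the SELECT clause.

The PARTITION BY clause lets you break up your aggregate functions. One obvious and useful example is generating line numbers for the order lines of an order: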
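
SELECT
    O.order_id,
    O.order_date,
    -- SQL Server requires ORDER BY for ROW_NUMBER(); (SELECT NULL) leaves the order unspecified
    ROW_NUMBER() OVER(PARTITION BY O.order_id ORDER BY (SELECT NULL)) AS line_item_no,
    OL.product_id
FROM
    Orders O
INNER JOIN Order_Lines OL ON OL.order_id = O.order_id

(My syntax might be off slightly.)

You would then get back something like: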
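
order_id    order_date    line_item_no    product_id
--------    ----------    ------------    ----------
    1       2011-05-02         1              5
    1       2011-05-02         2              4
    1       2011-05-02         3              7
    2       2011-05-12         1              8
    2       2011-05-12         2              1

You can use GROUP BY SalesOrderID. The difference is that with GROUP BY you can only have aggregated values for the columns that are not included in GROUP BY.

In contrast, using windowed aggregate functions instead of GROUP BY, you can retrieve both aggregated and non-aggregated values. That is, although you are not doing that in your example query, you could retrieve both the individual OrderQty values and their sums, counts, averages etc. over groups of the same SalesOrderIDs.

Here is a practical example of why windowed aggregates are great. Suppose you need to calculate what percentage of a total every value is. Without windowed aggregates you would first have to derive a list of aggregated values and then join it back to the original rowset, like this: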

SELECT
  orig.[Partition],
  orig.Value,
  orig.Value * 100.0 / agg.TotalValue AS ValuePercent
FROM OriginalRowset orig
  INNER JOIN (
    SELECT
      [Partition],
      SUM(Value) AS TotalValue
    FROM OriginalRowset
    GROUP BY [Partition]
  ) agg ON orig.[Partition] = agg.[Partition]

Now look how you can do the same with a windowed aggregate:

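SELECT
  [Partition],
  Value,
  Value * 100.0 / SUM(Value) OVER (PARTITION BY [Partition]) AS ValuePercent
FROM OriginalRowset orig

Much easier and cleaner, isn't it?

Let me explain with an example, and you will be able to see how it works.

Assume you have the following table DIM_EQUIPMENT: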

VIN         MAKE    MODEL   YEAR    COLOR
-----------------------------------------
1234ASDF    Ford    Taurus  2008    White
1234JKLM    Chevy   Truck   2005    Green
5678ASDF    Ford    Mustang 2008    Yellow

Run the SQL below:

SELECT VIN,
  MAKE,
  MODEL,
  YEAR,
  COLOR,
  COUNT(*) OVER (PARTITION BY YEAR) AS COUNT2
FROM DIM_EQUIPMENT

The result would be as follows:
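
VIN         MAKE    MODEL   YEAR    COLOR     COUNT2
----------------------------------------------------
1234JKLM    Chevy   Truck   2005    Green     1
5678ASDF    Ford    Mustang 2008    Yellow    2
1234ASDF    Ford    Taurus  2008    White     2

See what happened? You are able to count by YEAR without a GROUP BY and still match the count with each row.

Another interesting way to get the same result is to use a WITH clause. WITH works as an inline view and can simplify queries, especially complex ones, which is not the case here, since I am only trying to show the usage.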
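
The WITH-clause variant mentioned above is not shown in the text. A minimal sketch of what such a query could look like follows; it is not the answer author's query. The first CTE only supplies the three sample rows so the block runs on its own, and the CTE name YEAR_COUNTS is a hypothetical name chosen for illustration.

```sql
-- Sample rows stand in for the real DIM_EQUIPMENT table (assumption, for a self-contained run).
WITH DIM_EQUIPMENT AS (
    SELECT *
    FROM (VALUES
        ('1234ASDF', 'Ford',  'Taurus',  2008, 'White'),
        ('1234JKLM', 'Chevy', 'Truck',   2005, 'Green'),
        ('5678ASDF', 'Ford',  'Mustang', 2008, 'Yellow')
    ) AS v (VIN, MAKE, MODEL, YEAR, COLOR)
),
-- Hypothetical helper CTE: one row per YEAR with its row count.
YEAR_COUNTS AS (
    SELECT YEAR, COUNT(*) AS COUNT2
    FROM DIM_EQUIPMENT
    GROUP BY YEAR
)
-- Join the per-year counts back onto the detail rows.
SELECT e.VIN, e.MAKE, e.MODEL, e.YEAR, e.COLOR, c.COUNT2
FROM DIM_EQUIPMENT e
JOIN YEAR_COUNTS c ON c.YEAR = e.YEAR
ORDER BY e.YEAR;
```

This should give the same COUNT2 values as the OVER (PARTITION BY YEAR) version (1 for 2005, 2 for both 2008 rows), at the cost of an extra aggregation step and a join.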
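
  • Also called the Query Partition Clause
  • Similar to the GROUP BY clause

    • Breaks the data up into chunks (or partitions)
    • Separated by partition bounds
    • The function performs within partitions
    • Re-initialised when crossing a partition boundary

Syntax:
function (...) OVER (PARTITION BY col1, col3, ...)

  • Functions

    • Familiar functions such as COUNT(), SUM(), MIN(), MAX(), etc.
    • New functions as well (e.g. ROW_NUMBER(), RATIO_TO_REPORT(), etc.)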
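
The points above (the function works within a partition and is re-initialised at each partition boundary) can be sketched with a tiny query. The grp and val columns and their values are made up for illustration and do not come from the thread.

```sql
-- Hypothetical sample data: (grp, val) pairs supplied inline.
SELECT grp,
       val,
       ROW_NUMBER() OVER (PARTITION BY grp ORDER BY val) AS row_in_grp, -- restarts at 1 for each grp
       SUM(val)     OVER (PARTITION BY grp)              AS grp_total,  -- one total per grp, repeated on every row
       COUNT(*)     OVER ()                              AS all_rows    -- no PARTITION BY: the whole set is one partition
FROM (VALUES ('A', 10), ('A', 20), ('B', 5), ('B', 7), ('B', 9)) AS t (grp, val)
ORDER BY grp, val;
```

For group A the numbering runs 1, 2 with grp_total 30; for group B it starts again at 1 and runs to 3 with grp_total 21, while all_rows is 5 on every row.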

More info with an example:

prkey   whatsthat               cash   (moving sum)
890    "abb                "   32  32
43     "abbz               "   2   34
4      "bttu               "   1   35
45     "gasstuff           "   2   37
545    "gasz               "   5   42
80009  "hoo                "   9   51
2321   "ibm                "   1   52
998    "krk                "   2   54
42     "kx-5010            "   2   56
32     "lto                "   4   60
543    "mp                 "   5   65
465    "multipower         "   2   67
455    "O.N.               "   1   68
7887   "prem               "   7   75
434    "puma               "   3   78
23     "retractble         "   3   81
242    "Trujillo's stuff   "   4   85

This is the result of the query. The table used as the source is the same, except that it has no last column. That column is a moving sum of the third one.

Query:

(the table is public.iuk)

That is only a bit above dbase (1986) level; I don't know why it took more than 25 years to get there.
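
The query itself did not survive in the text above. As a sketch only, and not the author's original, a running total of this shape can be produced by ordering the window on the second column. The column names come from the table header; ordering on whatsthat is an assumption based on how the rows are sorted; and the first four rows are supplied inline so the block runs without the public.iuk table.

```sql
-- First four rows of the table above, supplied inline in place of public.iuk (assumption).
SELECT prkey,
       whatsthat,
       cash,
       SUM(cash) OVER (ORDER BY whatsthat) AS moving_sum  -- running total in whatsthat order
FROM (VALUES
        (890, 'abb',      32),
        (43,  'abbz',      2),
        (4,   'bttu',      1),
        (45,  'gasstuff',  2)
     ) AS iuk (prkey, whatsthat, cash)
ORDER BY whatsthat;
```

With these rows the last column comes out as 32, 34, 35, 37, matching the first four lines of the result shown above.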

In simple words: the OVER clause can be used to select non-aggregated values along with aggregated ones.

PARTITION BY, ORDER BY inside, and ROWS or RANGE are parts of the OVER() clause.

PARTITION BY is used to partition the data before the window aggregate functions are applied; if there is no PARTITION BY, the entire result set is treated as a single partition.

The OVER clause can be used with ranking functions (RANK, ROW_NUMBER, DENSE_RANK, ...), aggregate functions (AVG, MAX, MIN, SUM, etc.) and analytic functions (FIRST_VALUE, LAST_VALUE and a few others).

Let's look at the basic syntax of the OVER clause:

OVER (
       [ <PARTITION BY clause> ]
       [ <ORDER BY clause> ]
       [ <ROW or RANGE clause> ]
      )

sql version:  2012

So let me run through different scenarios and see how the data is affected, going from the most difficult syntax to the simplest. The employees table:

Id          Name                                               Gender     Salary
----------- -------------------------------------------------- ---------- -----------
1           Mark                                               Male       5000
2           John                                               Male       4500
3           Pavan                                              Male       5000
4           Pam                                                Female     5500
5           Sara                                               Female     4000
6           Aradhya                                            Female     3500
7           Tom                                                Male       5500
8           Mary                                               Female     5000
9           Ben                                                Male       6500
10          Jodi                                               Female     7000
11          Tom                                                Male       5500
12          Ron                                                Male       5000

In the first query below, just observe the sum_sal column. It uses ORDER BY salary together with "RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW". There is no PARTITION BY, so the entire data set is treated as one partition, ordered by salary. The important part is UNBOUNDED PRECEDING AND CURRENT ROW: for each row, the sum runs from the first row up to the current row. Looking at the row with salary 5000 and name "Pavan", ideally sum_sal should be 17000, and for salary 5000 and name "Mark" it should be 22000. But because RANGE is used, rows with equal salary are treated as one logical group: the operation is performed over the whole group and every row in it gets the same value. That is why every row with salary 5000 shows sum_sal = 32000.

Select *,SUM(salary) Over(order by salary RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as sum_sal from employees

Id          Name                                               Gender     Salary      sum_sal
----------- -------------------------------------------------- ---------- ----------- -----------
6           Aradhya                                            Female     3500        3500
5           Sara                                               Female     4000        7500
2           John                                               Male       4500        12000
3           Pavan                                              Male       5000        32000
1           Mark                                               Male       5000        32000
8           Mary                                               Female     5000        32000
12          Ron                                                Male       5000        32000
11          Tom                                                Male       5500        48500
7           Tom                                                Male       5500        48500
4           Pam                                                Female     5500        48500
9           Ben                                                Male       6500        55000
10          Jodi                                               Female     7000        62000
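
With ROWS instead of RANGE, the frame ends at the current physical row, so rows with equal salaries no longer share one value:

Select *,SUM(salary) Over(order by salary ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as sum_sal from employees

Id          Name                                               Gender     Salary      sum_sal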
----------- -------------------------------------------------- ---------- ----------- -----------
6           Aradhya                                            Female     3500        3500
5           Sara                                               Female     4000        7500
2           John                                               Male       4500        12000
3           Pavan                                              Male       5000        17000
1           Mark                                               Male       5000        22000
8           Mary                                               Female     5000        27000
12          Ron                                                Male       5000        32000
11          Tom                                                Male       5500        37500
7           Tom                                                Male       5500        43000
4           Pam                                                Female     5500        48500
9           Ben                                                Male       6500        55000
10          Jodi                                               Female     7000        62000
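
The RANGE and ROWS results above can also be compared side by side in one query. This is a small sketch using four of the employees rows, supplied inline so it runs on its own; the column aliases sum_range and sum_rows are made up for illustration.

```sql
-- Four sample rows from the employees table, supplied inline (assumption, for a self-contained run).
SELECT name,
       salary,
       -- RANGE: rows with the same salary are peers and share one running total
       SUM(salary) OVER (ORDER BY salary
                         RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS sum_range,
       -- ROWS: the frame stops at the current physical row
       SUM(salary) OVER (ORDER BY salary
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)  AS sum_rows
FROM (VALUES ('Aradhya', 3500), ('Pavan', 5000), ('Mark', 5000), ('Pam', 5500)) AS e (name, salary)
ORDER BY salary;
```

Both rows with salary 5000 get sum_range = 13500, while sum_rows gives one of them 8500 and the other 13500. Which of the two tied rows comes first under ROWS is not defined by ORDER BY salary alone, so add a tie-breaking column to the ORDER BY if a stable result matters.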

The same query with ORDER BY but no ROWS or RANGE clause:

Select *,SUM(salary) Over(order by salary) as sum_sal from employees

Id          Name                                               Gender     Salary      sum_sal
----------- -------------------------------------------------- ---------- ----------- -----------
6           Aradhya                                            Female     3500        3500
5           Sara                                               Female     4000        7500
2           John                                               Male       4500        12000
3           Pavan                                              Male       5000        32000
1           Mark                                               Male       5000        32000
8           Mary                                               Female     5000        32000
12          Ron                                                Male       5000        32000
11          Tom                                                Male       5500        48500
7           Tom                                                Male       5500        48500
4           Pam                                                Female     5500        48500
9           Ben                                                Male       6500        55000
10          Jodi                                               Female     7000        62000
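
When ORDER BY is given without an explicit frame, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so the query above gives the same result as:

Select *, SUM(salary) Over(order by salary RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as sum_sal from employees

If the frame extends to UNBOUNDED FOLLOWING, every row sees the whole partition and gets the grand total:

Select *,sum(salary) Over(order by salary ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as sum_sal from employees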

Id          Name                                               Gender     Salary      sum_sal
----------- -------------------------------------------------- ---------- ----------- -----------
1           Mark                                               Male       5000        62000
2           John                                               Male       4500        62000
3           Pavan                                              Male       5000        62000
4           Pam                                                Female     5500        62000
5           Sara                                               Female     4000        62000
6           Aradhya                                            Female     3500        62000
7           Tom                                                Male       5500        62000
8           Mary                                               Female     5000        62000
9           Ben                                                Male       6500        62000
10          Jodi                                               Female     7000        62000
11          Tom                                                Male       5500        62000
12          Ron                                                Male       5000        62000

Finally, with an empty OVER() the whole result set is a single window:

Select *,Sum(salary) Over() as sum_sal from employees

Id          Name                                               Gender     Salary      sum_sal
----------- -------------------------------------------------- ---------- ----------- -----------
1           Mark                                               Male       5000        62000
2           John                                               Male       4500        62000
3           Pavan                                              Male       5000        62000
4           Pam                                                Female     5500        62000
5           Sara                                               Female     4000        62000
6           Aradhya                                            Female     3500        62000
7           Tom                                                Male       5500        62000
8           Mary                                               Female     5000        62000
9           Ben                                                Male       6500        62000
10          Jodi                                               Female     7000        62000
11          Tom                                                Male       5500        62000
12          Ron                                                Male       5000        62000
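
None of the scenarios above uses PARTITION BY. As a sketch of how it combines with ORDER BY and a frame, the query below keeps a separate total and running total per gender. It uses five of the employees rows, supplied inline; the aliases gender_total and running_in_gender are made up for illustration.

```sql
-- Five sample rows from the employees table, supplied inline (assumption, for a self-contained run).
SELECT name,
       gender,
       salary,
       -- total per gender, repeated on every row of that gender
       SUM(salary) OVER (PARTITION BY gender) AS gender_total,
       -- running total that restarts for each gender
       SUM(salary) OVER (PARTITION BY gender
                         ORDER BY salary
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_in_gender
FROM (VALUES
        ('Mark', 'Male',   5000),
        ('John', 'Male',   4500),
        ('Ben',  'Male',   6500),
        ('Pam',  'Female', 5500),
        ('Sara', 'Female', 4000)
     ) AS e (name, gender, salary)
ORDER BY gender, salary;
```

The Female rows get gender_total 9500 with running totals 4000 and 9500; the Male rows get gender_total 16000 with running totals 4500, 9500 and 16000.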