SQL group by subgroup

I'm trying to find the largest subgroup within each group. For example, suppose my table looks like this:

+------------+------------+----------+-----+
| city       | car_colour | car_type | qty |
+------------+------------+----------+-----+
| manchester | Red        | Sports   |   7 |
| manchester | Red        | 4x4      |   9 |
| manchester | Blue       | 4x4      |   8 |
| london     | Red        | Sports   |   2 |
| london     | Blue       | 4x4      |   3 |
| leeds      | Red        | Sports   |   5 |
| leeds      | Blue       | Sports   |   6 |
| leeds      | Blue       | 4X4      |   1 |
+------------+------------+----------+-----+

I'm trying to find a pure SQL solution that shows, for each city, which car colour has the highest total quantity.

I can do:

select city, car_colour, sum(qty)
from table1
group by city, car_colour

to get:

+------------+------+----+
| manchester | red  | 16 |
| manchester | blue |  8 |
| london     | red  |  2 |
| london     | blue |  3 |
| leeds      | red  |  5 |
| leeds      | blue |  7 |
+------------+------+----+

However, can I use a subquery to take the maximum of that result, so that it returns the top colour for each city? The result would then show:

+------------+------+
| manchester | red  |
| london     | blue |
| leeds      | blue |
+------------+------+

I could do these calculations in a Python script, but I would prefer a pure SQL solution.

I hope this makes sense. Thanks for your help :)

Tommy
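For anyone who wants to run the queries in this thread against the sample data, here is a minimal setup sketch. The table name table1 is taken from the first answer; the column types and lengths are assumptions, and the multi-row VALUES syntax is not accepted by every database.

```sql
-- Sample table from the question. Types and lengths are assumed, not from the post.
CREATE TABLE table1 (
    city       VARCHAR(50),
    car_colour VARCHAR(20),
    car_type   VARCHAR(20),
    qty        INT
);

-- Rows copied from the question's table.
INSERT INTO table1 (city, car_colour, car_type, qty) VALUES
    ('manchester', 'Red',  'Sports', 7),
    ('manchester', 'Red',  '4x4',    9),
    ('manchester', 'Blue', '4x4',    8),
    ('london',     'Red',  'Sports', 2),
    ('london',     'Blue', '4x4',    3),
    ('leeds',      'Red',  'Sports', 5),
    ('leeds',      'Blue', 'Sports', 6),
    ('leeds',      'Blue', '4X4',    1);
```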

This works, but it can probably be improved depending on the specific database you are using:

select t.city, t.car_colour, a.qty
from table1 t
join (
  select city, max(qty) qty 
  from (
    select city, car_colour, sum(qty) qty 
    from table1 
    group by city, car_colour
  ) x group by city
) a on t.city = a.city 
group by t.city, t.car_colour, a.qty
having sum(t.qty) = a.qty
order by t.city desc;
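The same idea (sum per city and colour, take the maximum per city, then keep the colours whose total equals that maximum) may be easier to read when written with common table expressions. This is only a sketch: it assumes the database supports WITH, and the names colour_totals, city_max, total_qty and max_qty are made up for illustration.

```sql
-- Step 1: total quantity for each (city, colour) pair.
WITH colour_totals AS (
    SELECT city, car_colour, SUM(qty) AS total_qty
    FROM table1
    GROUP BY city, car_colour
),
-- Step 2: the largest of those totals within each city.
city_max AS (
    SELECT city, MAX(total_qty) AS max_qty
    FROM colour_totals
    GROUP BY city
)
-- Step 3: keep only the colours whose total matches the city's maximum.
SELECT ct.city, ct.car_colour, ct.total_qty
FROM colour_totals ct
JOIN city_max cm
  ON cm.city = ct.city
 AND cm.max_qty = ct.total_qty
ORDER BY ct.city;
```

With the sample data from the question this should return leeds/Blue/7, london/Blue/3 and manchester/Red/16. If two colours tie for the maximum in a city, both rows are returned.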

If you are using MS SQL:

DECLARE @t TABLE
    (
      city NVARCHAR(MAX) ,
      color NVARCHAR(MAX) ,
      qty INT
    )

INSERT  INTO @t
VALUES  ( 'manchester', 'Red', 7 ),
        ( 'manchester', 'Red', 9 ),
        ( 'manchester', 'Blue', 8 ),
        ( 'london', 'Red', 2 ),
        ( 'london', 'Blue', 3 ),
        ( 'leeds', 'Red', 5 ),
        ( 'leeds', 'Blue', 6 ),
        ( 'leeds', 'Blue', 1 )


SELECT  city , color
FROM    ( SELECT    city ,
                    color ,
                    SUM(qty) AS q ,
                    ROW_NUMBER() OVER ( PARTITION BY city ORDER BY SUM(qty) DESC ) AS rn
          FROM      @t
          GROUP BY  city , color
        ) t
WHERE   rn = 1

Output:

city        color
leeds       Blue
london      Blue
manchester  Red
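One thing to be aware of: ROW_NUMBER() keeps exactly one row per city, so if two colours tie for the highest total, which one is returned is arbitrary. If ties should be reported, a variant with RANK() is one option. The sketch below runs against a regular table named table1 with a car_colour column, as in the question, rather than the table variable; the aliases total_qty, rnk and ranked are illustrative only.

```sql
SELECT city, car_colour, total_qty
FROM (
    SELECT city,
           car_colour,
           SUM(qty) AS total_qty,
           -- RANK() gives tied totals the same rank, so every tied colour gets rnk = 1.
           RANK() OVER (PARTITION BY city ORDER BY SUM(qty) DESC) AS rnk
    FROM table1
    GROUP BY city, car_colour
) ranked
WHERE rnk = 1
ORDER BY city;
```

This needs window-function support, which SQL Server, PostgreSQL, MySQL 8 and recent SQLite versions provide.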

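Comments:

Maybe I'm doing something wrong, but because this solution still groups by city and car colour, I don't get one record per city; I still get one record for each car colour in each city. I want to find the maximum of the subquery for each city and see which colour it belongs to. The problem is that if I include q.car_color in the SELECT statement, it has to appear in the GROUP BY or in an aggregate. Is there a way to have max(q.qty) take precedence over whatever aggregate function I use on q.car_color?

I'm really sorry, I didn't notice that. Let me review it again and edit my answer.

@StanislovasKalašnikovas, I know, yes, you posted a wrong answer. Please don't advertise your wrong answer under my answer.

@TommyGaboreau Sorry, the typo has been corrected and I have also tested the query. Please check again.

@StanislovasKalašnikovas Do you see how it works like a charm!!??? It is much better than your wrong solution; you edited the OP's sample data and gave an answer.

Which RDBMS are you using?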
