SQL query to find the second highest value from each group

Tags: sql, postgresql, postgresql-9.3

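I have three tables:

  • project: project_id, project_name
  • milestone: milestone_id, milestone_name
  • project_milestone: id, project_id, milestone_id, completed_date

  • I want to get the second highest completed_date, and its milestone_id, from project_milestone grouped by project_id. In other words, for each project I want the milestone_id of the row with the second highest completed_date. What is the correct query for this?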

    I think you can use the project_milestone table with row_number():
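
A minimal sketch of that idea, assuming the table and column names from the question. The inline VALUES rows are made-up sample data that stand in for the real project_milestone table so the statement runs on its own, and plain dates are used instead of timestamps for brevity.

```sql
-- Hypothetical sample rows; the CTE shadows the real table name for this demo only.
WITH project_milestone (id, project_id, milestone_id, completed_date) AS (
    VALUES (1, 1, 1, DATE '2000-01-01'),
           (2, 1, 2, DATE '2000-01-02'),
           (3, 1, 5, DATE '2000-01-03'),
           (4, 2, 3, DATE '2000-02-01'),
           (5, 2, 4, NULL)
)
SELECT project_id, milestone_id, completed_date
FROM (
    SELECT pm.*,
           -- Number each project's rows from latest to earliest date.
           row_number() OVER (PARTITION BY project_id
                              ORDER BY completed_date DESC) AS seqnum
    FROM project_milestone pm
    WHERE pm.completed_date IS NOT NULL  -- NULLs would sort first under DESC
) ranked
WHERE seqnum = 2;
-- Project 1 yields milestone 2 (2000-01-02).
-- Project 2 has only one dated milestone, so it produces no row.
```

Because the outer filter keeps only seqnum = 2, projects with fewer than two dated milestones drop out of the result.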

    If you need to include every project, even those with fewer than two milestones, you can use a left join:

    select p.project_id, pm.milestone_id, pm.completed_date
    from project p left join
         (select pm.*,
                 row_number() over (partition by project_id order by completed_date desc) as seqnum
          from project_milestone pm
          where pm.completed_date is not null
         ) pm
         on p.project_id = pm.project_id and pm.seqnum = 2;
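
One thing to watch with row_number() is ties: if two milestones share the latest completed_date, the row numbered 2 carries the same date as the row numbered 1. If the second highest distinct date is what you want, dense_rank() is one possible substitute. The sketch below uses made-up sample rows in a CTE that stands in for project_milestone.

```sql
-- Hypothetical sample rows with a tie on the latest date.
WITH project_milestone (id, project_id, milestone_id, completed_date) AS (
    VALUES (1, 1, 1, DATE '2000-01-02'),
           (2, 1, 2, DATE '2000-01-04'),
           (3, 1, 5, DATE '2000-01-04'),
           (4, 1, 6, DATE '2000-01-03')
)
SELECT project_id, milestone_id, completed_date
FROM (
    SELECT pm.*,
           -- Tied dates share a rank, and the next distinct date gets rank + 1.
           dense_rank() OVER (PARTITION BY project_id
                              ORDER BY completed_date DESC) AS rnk
    FROM project_milestone pm
    WHERE pm.completed_date IS NOT NULL
) ranked
WHERE rnk = 2;
-- Returns milestone 6 (2000-01-03), the second highest distinct date.
```

Note that this variant can return more than one row per project when several milestones share the second highest date.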
    
    Using LATERAL (PostgreSQL 9.3+) may perform better than the window function version:

    SELECT * FROM project;
     project_id | project_name 
    ------------+--------------
              1 | Project A
              2 | Project B
    
    SELECT * FROM project_milestone;
     id | project_id | milestone_id |     completed_date     
    ----+------------+--------------+------------------------
      1 |          1 |            1 | 2000-01-01 00:00:00+01
      2 |          1 |            2 | 2000-01-02 00:00:00+01
      3 |          1 |            5 | 2000-01-03 00:00:00+01
      4 |          1 |            6 | 2000-01-04 00:00:00+01
      5 |          2 |            3 | 2000-02-01 00:00:00+01
      6 |          2 |            4 | 2000-02-02 00:00:00+01
      7 |          2 |            7 | 2000-02-03 00:00:00+01
      8 |          2 |            8 | 2000-02-04 00:00:00+01
    
    
    SELECT *
    FROM project p
    CROSS JOIN LATERAL (
        SELECT milestone_id, completed_date
        FROM project_milestone pm
        WHERE pm.project_id = p.project_id
        ORDER BY completed_date DESC
        LIMIT 1
        OFFSET 1
    ) second_highest;
     project_id | project_name | milestone_id |     completed_date     
    ------------+--------------+--------------+------------------------
              1 | Project A    |            5 | 2000-01-03 00:00:00+01
              2 | Project B    |            7 | 2000-02-03 00:00:00+01
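
CROSS JOIN LATERAL drops any project whose subquery returns no row, for example a project with a single milestone. If those projects should still appear, LEFT JOIN LATERAL ... ON true is one way to keep them. This is a sketch with made-up sample rows; the two CTEs stand in for the real project and project_milestone tables, and the alias s is arbitrary.

```sql
-- Hypothetical sample rows; Project B has only one milestone.
WITH project (project_id, project_name) AS (
    VALUES (1, 'Project A'),
           (2, 'Project B')
),
project_milestone (id, project_id, milestone_id, completed_date) AS (
    VALUES (1, 1, 1, DATE '2000-01-01'),
           (2, 1, 2, DATE '2000-01-02'),
           (3, 1, 5, NULL),
           (4, 2, 3, DATE '2000-02-01')
)
SELECT p.project_id, p.project_name, s.milestone_id, s.completed_date
FROM project p
LEFT JOIN LATERAL (
    SELECT pm.milestone_id, pm.completed_date
    FROM project_milestone pm
    WHERE pm.project_id = p.project_id
      AND pm.completed_date IS NOT NULL   -- skip undated milestones
    ORDER BY pm.completed_date DESC
    LIMIT 1 OFFSET 1                      -- skip the latest, take the next
) s ON true
ORDER BY p.project_id;
-- Project A: milestone 1 (2000-01-01)
-- Project B: milestone_id and completed_date are both NULL
```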
    

    The simplest way to do this is with a window function:

    SELECT *, nth_value(completed_date,2)
    OVER (
        PARTITION BY project_id ORDER BY completed_date DESC
        RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    )
    AS date2
    FROM project_milestone;
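
The query above attaches date2 to every row of project_milestone, so each project appears once per milestone. If one row per project is enough, a possible variant adds DISTINCT and a named window. This sketch uses made-up sample rows in a CTE that stands in for the real table, and it returns only the date, not the milestone_id.

```sql
-- Hypothetical sample rows; project 2 has a single milestone.
WITH project_milestone (id, project_id, milestone_id, completed_date) AS (
    VALUES (1, 1, 1, DATE '2000-01-01'),
           (2, 1, 2, DATE '2000-01-02'),
           (3, 1, 5, DATE '2000-01-03'),
           (4, 2, 3, DATE '2000-02-01')
)
SELECT DISTINCT project_id,
       nth_value(completed_date, 2) OVER w AS date2
FROM project_milestone
WHERE completed_date IS NOT NULL
WINDOW w AS (
    PARTITION BY project_id ORDER BY completed_date DESC
    -- The full-partition frame lets every row see the second value.
    RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
ORDER BY project_id;
-- project 1: 2000-01-02
-- project 2: NULL (there is no second row in the partition)
```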
    
    

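    Thanks, it works. But when completed_date contains NULLs, they sort ahead of the date values and the order changes. Is it possible to skip the NULL values?

    OK, I solved it by adding the condition "where pm.completed_date is not null" to the subquery.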
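
The NULL behavior comes from PostgreSQL's default ordering: under ORDER BY ... DESC, NULLs sort first. Besides filtering them out, another option is to push them to the end with NULLS LAST. A small sketch with made-up sample rows in a CTE that stands in for project_milestone:

```sql
-- Hypothetical sample rows; milestone 5 has no completion date.
WITH project_milestone (id, project_id, milestone_id, completed_date) AS (
    VALUES (1, 1, 1, DATE '2000-01-01'),
           (2, 1, 2, DATE '2000-01-02'),
           (3, 1, 5, NULL)
)
SELECT project_id, milestone_id, completed_date
FROM (
    SELECT pm.*,
           row_number() OVER (PARTITION BY project_id
                              ORDER BY completed_date DESC NULLS LAST) AS seqnum
    FROM project_milestone pm
) ranked
WHERE seqnum = 2;
-- Returns milestone 1 (2000-01-01); the NULL row is numbered 3.
```

Unlike the IS NOT NULL filter, this keeps undated rows in the numbering, so a project with only one dated milestone would return an undated milestone as its second row.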