Count rows for each unique combination of columns in SQL


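I want to return a set of unique records from a table, based on two columns, along with the most recent posting time and the total number of times that combination of the two columns appeared before (in time) the record that is output.

So what I am trying to get is something along these lines:

select T.col1, T.col2, h.max_posted, count   -- "count" stands for the number I am after
from T
join (
  select col1, col2, max(posted) as max_posted
  from T
  where groupid = 'XXX'
  group by col1, col2) h
on ( T.col1 = h.col1 and
  T.col2 = h.col2 and
  T.posted = h.max_posted)
where T.groupid = 'XXX'

Count needs to be the number of times each combination of col1 and col2 occurred BEFORE the max_posted of each record in the output. (I hope I explained that correctly.)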

Edit: when trying the suggestion below:

select dx.*,
       count(*) over (partition by dx.cicd9, dx.cdesc order by dx.tposted) as cnt
from dx
join (
  select cicd9, cdesc, max(tposted) as tposted
  from dx
  where groupid = 'XXX'
  group by cicd9, cdesc) h
on ( dx.cicd9 = h.cicd9 and
  dx.cdesc = h.cdesc and
  dx.tposted = h.tposted)
where dx.groupid = 'XXX';

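The count always returns "1". Also, how would you count only the records that occurred BEFORE tposted?

This also failed, but I hope you can see where I am going with it: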

  WITH H AS (
    SELECT cicd9, cdesc, max(tposted) as tposted  from dx where groupid =  'XXX' 
    group by cicd9, cdesc), 
    J AS (
    SELECT  count(*) as cnt
    FROM dx, h
    WHERE dx.cicd9 = h.cicd9
      and dx.cdesc = h.cdesc
      and dx.tposted <= h.tposted
      and dx.groupid = 'XXX'
 )
SELECT H.*,J.cnt
FROM H,J

Any help, anyone?

Do you just want a cumulative count?

select t.*,
       count(*) over (partition by col1, col2 order by posted) as cnt
from T t   -- T is the table from the question
where groupid = 'XXX';
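
On the edit in the question: the count probably comes back as 1 because the join to the max(tposted) row leaves a single row per (cicd9, cdesc) before the window function is evaluated, so each partition holds just that one row. A rough sketch of one way around it, assuming the dx columns shown in the question (the aliases s, cnt and rn are made up here): compute the running count over all rows first, then keep only the latest row of each combination.

```sql
-- Sketch only: dx, cicd9, cdesc, tposted and groupid come from the question;
-- the aliases s, cnt and rn are made up for illustration.
SELECT cicd9, cdesc, tposted, cnt
FROM (
    SELECT dx.*,
           -- running count: rows of this combination posted up to and including this row
           count(*)     OVER (PARTITION BY cicd9, cdesc ORDER BY tposted)      AS cnt,
           -- rn = 1 marks the latest row of each combination
           row_number() OVER (PARTITION BY cicd9, cdesc ORDER BY tposted DESC) AS rn
    FROM   dx
    WHERE  groupid = 'XXX'
) s
WHERE  rn = 1;
```

Because the filtering happens in the outer query, the window functions still see every row of the group. On the latest row the running count equals the total number of rows for that combination.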

How about this:

SELECT DISTINCT ON (cicd9, cdesc) cicd9, cdesc,
  max(tposted) OVER w AS last_post,
  count(*) OVER w AS num_posts
FROM dx
WHERE groupid = 'XXX'
WINDOW w AS (
  PARTITION BY cicd9, cdesc
  RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
);

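Since the PG version, table definition, data and desired output are all missing, this is just shooting from the hip, but the principle should work: partition on the two columns for the rows where groupid = 'XXX', then take the maximum of the posted column and the total number of rows in the window frame (hence the RANGE ... clause in the window definition).

This is the best I could come up with. Better suggestions are welcome.

This produces the result I need, with the understanding that, because of the join, the count will always be at least 1: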

  SELECT dx.cicd9, dx.cdesc, max(dx.tposted), count(*)
from dx 
join (
SELECT cicd9, cdesc, max(tposted) as tposted  from dx where groupid   =  'XXX' 
    group by cicd9, cdesc) h
on 
  (dx.cicd9 = h.cicd9 and dx.cdesc = h.cdesc and dx.tposted <= h.tposted 
  and dx.groupid = 'XXX')
group by dx.cicd9, dx.cdesc
order by dx.cdesc;

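Or:

WITH H AS (
    SELECT cicd9, cdesc, max(tposted) as tposted
    from dx
    where groupid = 'XXX'
    group by cicd9, cdesc)
SELECT dx.cicd9, dx.cdesc, max(dx.tposted), count(*)
from dx, H
where dx.cicd9 = h.cicd9 and dx.cdesc = h.cdesc and dx.tposted <= h.tposted
  and dx.groupid = 'XXX'
group by dx.cicd9, dx.cdesc
order by dx.cdesc;

This is confusing:

Count needs to be the number of times each combination of col1 and col2 occurred BEFORE the max_posted of each record in the output

Since, by definition, every record is posted before (or at the same time as) the latest post, this essentially means the total count per combination (ignoring the presumed off-by-one error in that sentence).

So this boils down to a simple GROUP BY:

SELECT cicd9, cdesc
     , max(tposted) AS last_posted
     , count(*)     AS ct
FROM   dx
WHERE  groupid = 'XXX'
GROUP  BY 1, 2
ORDER  BY 1, 2;

This does exactly the same as the currently accepted answer. It is just faster and simpler.

Sample data and the desired result would help clarify the question.

Yes. I only need a cumulative count of the records that occurred before posting. Please see my edit above. Thanks.

I would add that I am looking for the total number of rows that match the output row, including duplicates.

Your question is still vague. You should add sample data and the desired result.

There is a lot to learn here. If this includes all rows, it might work. I need to study it when I am not brain dead. Thanks. PG version 9.3.

How would I change the query so that only one row is produced per cicd9, cdesc group? It currently repeats the same output row for every row found in dx. Thanks for all the new concepts.

@AlanWayne: Added a DISTINCT ON clause, see the updated answer.

Yes! That worked. Thank you very much, sir.

Does it also return what you need? If not, how does it fail? If you cannot describe the task in words, you should add meaningful example values and the desired result to the question.

@ErwinBrandstetter Yes. It would be better if count = 0, since I want to count the rows from earlier dates. Otherwise, it works fine.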
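
One of the comments asks for a count of 0 when a combination has no earlier rows, that is, for the latest row itself to be left out of the count. A minimal sketch of one way to get that, written to run on PostgreSQL 9.3 (so without the aggregate FILTER clause, which arrived in 9.4). The sample rows and the names s, last_posted and earlier_ct are made up for illustration; only the column names come from the question.

```sql
-- Made-up sample rows standing in for dx; only the column names come from the question.
WITH dx (groupid, cicd9, cdesc, tposted) AS (
    VALUES ('XXX', 'A', 'alpha', timestamp '2015-01-01 00:00'),
           ('XXX', 'A', 'alpha', '2015-02-01 00:00'),
           ('XXX', 'A', 'alpha', '2015-03-01 00:00'),
           ('XXX', 'B', 'beta',  '2015-01-15 00:00'),
           ('YYY', 'A', 'alpha', '2015-04-01 00:00')   -- other group, filtered out below
)
SELECT cicd9, cdesc, last_posted,
       -- count only the rows posted strictly before the latest one
       sum(CASE WHEN tposted < last_posted THEN 1 ELSE 0 END) AS earlier_ct
FROM (
    SELECT cicd9, cdesc, tposted,
           -- latest posting time of each (cicd9, cdesc) combination
           max(tposted) OVER (PARTITION BY cicd9, cdesc) AS last_posted
    FROM   dx
    WHERE  groupid = 'XXX'
) s
GROUP  BY cicd9, cdesc, last_posted
ORDER  BY cicd9, cdesc;
```

With these sample rows, A/alpha reports 2 earlier rows and B/beta reports 0. If exactly one row per combination carries the latest tposted, subtracting 1 from count(*) in the plain GROUP BY query gives the same number; the CASE form also holds up when several rows share the latest timestamp.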