SQL: select unique values from two columns using GROUP BY
I have a table:
CREATE TABLE stats_test
(
    id1  bigint,
    id2  bigint,
    date timestamp with time zone
);
with this data in it:
id1 | id2 | date
-----+-----+------------------------
1 | 2 | 2020-12-01 00:00:00+00
2 | 1 | 2020-12-01 00:00:00+00
3 | 4 | 2020-11-01 00:00:00+00
4 | 3 | 2020-11-01 00:00:00+00
1 | 3 | 2020-12-01 00:00:00+00
1 | 3 | 2020-11-01 00:00:00+00
With this query I get the following result:
SELECT EXTRACT(YEAR FROM date), EXTRACT(MONTH FROM date),
COUNT(DISTINCT id1) AS unique_id1, COUNT(DISTINCT id2) AS unique_id2
FROM stats_test GROUP BY EXTRACT(YEAR FROM date), EXTRACT(MONTH FROM date);
date_part | date_part | unique_id1 | unique_id2
-----------+-----------+------------+------------
2020 | 11 | 3 | 2
2020 | 12 | 2 | 3
How can I get one more column with the count of unique IDs taken from both columns (id1, id2) together, grouped by year and month?
date_part | date_part | unique_id1 | unique_id2 | unique_both_ids
-----------+-----------+------------+------------+----------------
2020 | 11 | 3 | 2 |
2020 | 12 | 2 | 3 |
count(distinct ..) only allows a single expression (so count(distinct id1, id2) is rejected), but you can use an anonymous row expression to get around that limitation:
select extract(year from date) as year,
       extract(month from date) as month,
       count(distinct id1) as unique_id1,
       count(distinct id2) as unique_id2,
       count(distinct (id1, id2)) as unique_both_ids
from stats_test
group by extract(year from date), extract(month from date);
Note that 1,2 and 2,1 will be treated as two different things. If you want them to be treated as the same, use:
count(distinct (least(id1, id2), greatest(id1, id2)))
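Written out as a complete query, the order-insensitive variant might look like the sketch below. It assumes PostgreSQL and the stats_test table from the question; the column alias unique_pairs is made up for illustration.

```sql
-- Count distinct unordered pairs per month: (1,2) and (2,1) are
-- normalized to the same row value (1,2) before DISTINCT is applied.
select extract(year from date) as year,
       extract(month from date) as month,
       count(distinct (least(id1, id2), greatest(id1, id2))) as unique_pairs
from stats_test
group by extract(year from date), extract(month from date);
```

On the sample data from the question this should give 2 for each month: the pairs (3,4) and (1,3) in November, and (1,2) and (1,3) in December.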
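However, I am not sure it works the way I want. My English is not good, so I may not have explained this well. What I mean is: from ID pairs such as (1,2), (2,1), (3,1), (1,3), (4,1), (4,3), (3,4), (3,2) and so on, I want to get COUNT = 4, because there are four unique IDs across all of these combinations.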
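One possible way to get that number is to unpivot id1 and id2 into a single column and count the distinct values in it. This is only a sketch: it assumes PostgreSQL 9.3 or later (for LATERAL), and the aliases s, v and id are invented here, not taken from the thread.

```sql
-- Each source row is expanded into two rows, one per ID column, so
-- count(distinct v.id) sees the IDs from both columns as one set.
-- The per-column distinct counts are unaffected by the row duplication.
select extract(year from s.date) as year,
       extract(month from s.date) as month,
       count(distinct s.id1) as unique_id1,
       count(distinct s.id2) as unique_id2,
       count(distinct v.id) as unique_both_ids
from stats_test as s
cross join lateral (values (s.id1), (s.id2)) as v(id)
group by extract(year from s.date), extract(month from s.date);
```

With the sample rows from the question, unique_both_ids should come out as 3 for both months (IDs 1, 3, 4 in November and 1, 2, 3 in December), while unique_id1 and unique_id2 stay the same as in the original result.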