SQL: counting combinations of values across the rows of an entire table
With the following data in a table:
| name | d1 | d2 | d3 | d4 | d5 | d6 | d7 | d8 |
|--------|-------|--------|--------|--------|--------|--------|--------|--------|
| matty | 116.7 | 17.88 | 16.1 | 9.731 | (null) | (null) | (null) | (null) |
| jana | 17.88 | 116.7 | 65.45 | 72.1 | (null) | (null) | (null) | (null) |
| chris | 72.1 | (null) | (null) | (null) | (null) | (null) | (null) | (null) |
| khaled | 9.731 | 116.7 | 17.88 | 53.1 | 2 | 85.2 | (null) | (null) |
| " | " | " | " | " | " | " | " | " |
| n | " | " | " | " | " | " | " | " |
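How can I identify the number of times a combination of values occurs across all rows, in SQL?
Here is an example of the desired output:
116.7, 17.88 (3)
116.7, 17.88, 9.731 (2)
72.1 (2)
16.1 (1)
65.45 (1)
53.1 (1)
2 (1)
85.2 (1)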
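If this is not possible in SQL, is there any alternative way to do it?

In the example below I have not considered the different combinations of d1, d2. If both hold the same value, you will get that value with a count of 2.
So, assuming the number of columns is limited and fixed, you can achieve this with the help of union: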
select concat(array_to_string(array_agg(col), ','), ' (', cnt, ')') as result
from
(
    select col, count(*) cnt
    from
    (
        select d1 as col from table1
        union all
        select d2 from table1
        union all
        select d3 from table1
        --similarly add other columns
    ) t
    where col is not null
    group by col
) t1
group by cnt
order by cnt desc;
Output:
result
--------------------------
17.88,116.7 (3)
72.1,16.1,65.45,9.731 (1)
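
Since the question also asks about alternatives to SQL, here is a minimal Python sketch of the same idea: unpivot every column into one list of values, count each value, then group the values by their count. It assumes the rows have already been fetched into memory; the `rows` literal and the `values_by_count` helper are illustrative names, not part of the original answer. Unlike the output above, which only unions d1 to d3, this sketch reads all eight columns, so its counts differ.

```python
from collections import Counter, defaultdict

# Sample rows from the question; None stands for NULL.
rows = [
    ("matty", [116.7, 17.88, 16.1, 9.731, None, None, None, None]),
    ("jana", [17.88, 116.7, 65.45, 72.1, None, None, None, None]),
    ("chris", [72.1, None, None, None, None, None, None, None]),
    ("khaled", [9.731, 116.7, 17.88, 53.1, 2, 85.2, None, None]),
]


def values_by_count(rows):
    """Unpivot all columns, count each value, then group values by count."""
    # Equivalent of the UNION ALL subquery plus "where col is not null".
    counts = Counter(v for _, vals in rows for v in vals if v is not None)
    # Equivalent of the outer "group by cnt".
    grouped = defaultdict(list)
    for value, cnt in counts.items():
        grouped[cnt].append(value)
    # Highest count first, like "order by cnt desc".
    return {cnt: sorted(vals) for cnt, vals in sorted(grouped.items(), reverse=True)}


result = values_by_count(rows)
for cnt, vals in result.items():
    print(", ".join(str(v) for v in vals), f"({cnt})")
```

As with the SQL version, this counts single values only, not combinations of values within a row.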
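Otherwise, you have to create a procedure that collects all the columns in one union and then groups and counts them as above.

There is no built-in function for computing combinations in PostgreSQL, but you can write one for it, for example: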
create or replace function combinations(variadic anyarray)
  returns setof anyarray
  language sql
  immutable
  called on null input
as $func$
  with recursive e as (
    select *
    from unnest($1) with ordinality u(e, o)
    where e is not null
  ),
  r as (
    select distinct on (e) array[e] ea, array[o] oa
    from e
    union all
    select distinct on (oea) oea, oa || o
    from r, e, lateral (select array_agg(u order by u) oea from unnest(ea || e) u) l
    where o <> all(oa)
  )
  select ea
  from r
$func$;
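However, your sample input contains many more combinations than the ones included in your sample output. (Maybe you just left them out to save space?)
Notes:
- The function above takes a variable number of arguments, which are converted to a native PostgreSQL array (because of `variadic`).
- It accepts any type of input, as long as all arguments are of the same type (because of `anyarray`). This is called polymorphism. Also, because of `returns setof anyarray`, it returns a full result set (multiple rows) of the same array type.
- `language sql` just simplifies the function body: it will not contain any advanced procedural language constructs, like `IF` or `LOOP` (`language plpgsql` can contain these).
- The CTE with the `e` alias unnests the data from the input array, but keeps the ordering/index information in the `o` field (see `with ordinality`). This becomes important later, because we cannot use the values themselves to remove duplicates (i.e. `(2,2)` should be a valid combination, as you commented earlier). `NULL`s are discarded here.
- The recursive CTE with the `r` alias (hence the `recursive` keyword after `with`) accumulates each combination. It starts from every single value. Then, in each step, it appends one element that has another ordinal (index) from the original set (see `where o <> all(oa)`). Because the order of the elements within a combination does not matter (as you commented), I sort the elements in a subquery. Also, both parts of the recursive query use `distinct on()` to remove any possible duplicates, which can occur when multiple elements have the same value.
- The solution query uses an implicit join to compute every combination for each row (in PostgreSQL, a function call in the `from` clause can reference columns of the preceding table). This step multiplies the original rows of the table by their combinations. Then we only need to `group by` the combinations and `count(*)` each of them.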
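
For readers who want to check the logic in the notes above outside the database, the same steps can be sketched in Python. This is only an illustration under stated assumptions, not a translation of the function: the `rows` literal and the helpers `row_combinations` and `count_combinations` are hypothetical names. `itertools.combinations` picks elements by position, which plays the role of the ordinal `o`; sorting the values first makes the order inside a combination irrelevant, and the `set` plays the role of `distinct on()`.

```python
from collections import Counter
from itertools import combinations

# Sample rows from the question (d1..d8); None stands for NULL.
rows = [
    [116.7, 17.88, 16.1, 9.731, None, None, None, None],   # matty
    [17.88, 116.7, 65.45, 72.1, None, None, None, None],   # jana
    [72.1, None, None, None, None, None, None, None],      # chris
    [9.731, 116.7, 17.88, 53.1, 2, 85.2, None, None],      # khaled
]


def row_combinations(values):
    """Return every distinct non-empty combination of one row's non-NULL values."""
    # Discard NULLs and sort, so (a, b) and (b, a) become the same tuple.
    present = sorted(v for v in values if v is not None)
    combos = set()
    for size in range(1, len(present) + 1):
        # Elements are chosen by position, so a value that occurs twice
        # in a row can still form a combination such as (2, 2).
        combos.update(combinations(present, size))
    return combos


def count_combinations(rows):
    """Count in how many rows each combination occurs."""
    counts = Counter()
    for values in rows:
        counts.update(row_combinations(values))
    return counts


counts = count_combinations(rows)
print(counts[(17.88, 116.7)])         # rows containing both 17.88 and 116.7
print(counts[(9.731, 17.88, 116.7)])  # rows containing all three values
```

A row with n non-NULL values yields up to 2^n - 1 combinations, so the result grows quickly as more columns are filled in.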
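
Alternatives to SQL? Where does this data come from?
What if, f.ex., d1 = d2? In that case, is the (d1, d2) pair a valid combination, or just (d1)?
@pozs Yes, in that case d1, d2 is a valid combination, and the order within the pair does not matter. As in, (d1, d2) = (d2, d1).
Shouldn't 9.731 (2) also be in the result?
.Tsun It is in the result of the function created by @pozs.
That's it! You are the best! Yes, I did not include them, to save space. Can you explain how this function works?
@Dabbous Updated with a detailed explanation of how it works.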

The solution query that uses the combinations function:
select combinations, count(*)
from table_name
cross join combinations(d1, d2, d3, d4, d5, d6, d7, d8)
group by 1
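
Another option, using plain self-joins, counts the pairs and triples of values that occur in more than one row:
with CTE_001 as (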
SELECT name,D1 AS XVAL FROM mytable2 WHERE D1 IS NOT NULL
UNION ALL
SELECT name,D2 FROM mytable2 WHERE D2 IS NOT NULL
UNION ALL
SELECT name,D3 FROM mytable2 WHERE D3 IS NOT NULL
UNION ALL
SELECT name,D4 FROM mytable2 WHERE D4 IS NOT NULL
UNION ALL
SELECT name,D5 FROM mytable2 WHERE D5 IS NOT NULL
UNION ALL
SELECT name,D6 FROM mytable2 WHERE D6 IS NOT NULL
UNION ALL
SELECT name,D7 FROM mytable2 WHERE D7 IS NOT NULL
UNION ALL
SELECT name,D8 FROM mytable2 WHERE D8 IS NOT NULL
)
SELECT CONCAT(XVAL1, ', ', XVAL2) AS LOV, COUNT(*) AS RC
FROM(
SELECT C1.NAME, C1.XVAL AS XVAL1, C2.XVAL AS XVAL2
FROM CTE_001 C1
INNER JOIN CTE_001 C2 ON C1.NAME = C2.NAME
WHERE C1.XVAL < C2.XVAL
) B
GROUP BY XVAL1, XVAL2
HAVING COUNT(*) >1
UNION ALL
SELECT CONCAT(XVAL1, ', ' , XVAL2,', ', XVAL3), COUNT(*) AS RC
FROM(
SELECT C1.NAME, C1.XVAL AS XVAL1, C2.XVAL AS XVAL2, C3.XVAL AS XVAL3
FROM CTE_001 C1
INNER JOIN CTE_001 C2 ON C1.NAME = C2.NAME
INNER JOIN CTE_001 C3 ON C1.NAME = C3.NAME
WHERE C1.XVAL < C2.XVAL AND C1.XVAL < C3.XVAL AND C2.XVAL < C3.XVAL
) B
GROUP BY XVAL1, XVAL2, XVAL3
HAVING COUNT(*) >1
ORDER BY 2 DESC
Output:
   lov                       rc
1  17.880, 116.700           3
2  9.731, 116.700            2
3  9.731, 17.880             2
4  9.731, 17.880, 116.700    2
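
The self-join query above can be cross-checked with a short Python sketch. It is an illustration under assumptions rather than part of the answer: the `rows` dictionary and the `repeated_combinations` helper are hypothetical names. Like the strict `<` comparisons in the query, it only combines distinct values within a row, and like `HAVING COUNT(*) > 1` it keeps only combinations found in at least two rows.

```python
from collections import Counter
from itertools import combinations

# Sample rows from the question, keyed by name; None stands for NULL.
rows = {
    "matty": [116.7, 17.88, 16.1, 9.731, None, None, None, None],
    "jana": [17.88, 116.7, 65.45, 72.1, None, None, None, None],
    "chris": [72.1, None, None, None, None, None, None, None],
    "khaled": [9.731, 116.7, 17.88, 53.1, 2, 85.2, None, None],
}


def repeated_combinations(rows, sizes=(2, 3), min_count=2):
    """Count pairs/triples of distinct values per row; keep the repeated ones."""
    counts = Counter()
    for values in rows.values():
        # Distinct non-NULL values in ascending order, mirroring the
        # "C1.XVAL < C2.XVAL" conditions of the self-join.
        distinct = sorted({v for v in values if v is not None})
        for size in sizes:
            counts.update(combinations(distinct, size))
    repeated = [(combo, n) for combo, n in counts.items() if n >= min_count]
    # Highest count first, like "ORDER BY 2 DESC".
    return sorted(repeated, key=lambda item: -item[1])


result = repeated_combinations(rows)
for combo, n in result:
    print(", ".join(str(v) for v in combo), n)
```

Passing a different `sizes` tuple extends the check to larger combinations without adding another join per size.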