Count column combinations in a PostgreSQL matrix


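I have a table in Postgres that looks like this:

I would like a SQL query in Postgres that counts, for every pair of columns, the rows in which both columns are Y (a "YY" combination).

The expected output looks like this:

combo count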

AB 2
AC 1
AD 2
AZ 1
BC 1
BD 3
BZ 2
CD 2
CZ 0
DZ 1
Can anyone help me?

WITH stacked AS (
    SELECT id
        , unnest(array['A', 'B', 'C', 'D', 'Z']) AS col_name
        , unnest(array[a, b, c, d, z]) AS col_value
    FROM test t
)
SELECT combo, sum(cnt) AS count
FROM (
    SELECT t1.id, t1.col_name || t2.col_name AS combo
        , (CASE WHEN t1.col_value = 'Y' AND t2.col_value = 'Y' THEN 1 ELSE 0 END) AS cnt
    FROM stacked t1
    INNER JOIN stacked t2
    ON t1.id = t2.id
    AND t1.col_name < t2.col_name) t3
GROUP BY combo
ORDER BY combo;

The unnest method used here to unpivot the table is borrowed; the attribution link is not preserved.
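
To try the query, a small test table helps. The sketch below is only an assumption about the table's shape: the name `test` and the columns `id`, `a`, `b`, `c`, `d`, `z` are taken from the query above, while the column types and the three rows are made up for illustration and are not the asker's data.

```sql
-- Hypothetical test table; names follow the query above, types are assumed
CREATE TABLE test (
    id integer PRIMARY KEY,
    a  text,
    b  text,
    c  text,
    d  text,
    z  text
);

-- Made-up sample rows, only for experimenting with the queries
INSERT INTO test (id, a, b, c, d, z) VALUES
    (1, 'Y', 'Y', 'N', 'Y', 'N'),
    (2, 'N', 'Y', 'Y', 'Y', 'N'),
    (3, 'Y', 'N', 'N', 'N', 'Y');
```

With these rows, B and D are both Y in rows 1 and 2, so the pairwise query should report BD with a count of 2, and CZ with a count of 0 because no row has Y in both C and Z.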

To count the occurrences of YYY across 3 columns, you could use:

WITH stacked AS (
    SELECT id
        , unnest(array['A', 'B', 'C', 'D', 'Z']) AS col_name
        , unnest(array[a, b, c, d, z]) AS col_value
    FROM test t
)
SELECT combo, sum(cnt) AS count
FROM (
    SELECT t1.id, t1.col_name || t2.col_name || t3.col_name AS combo
        , (CASE WHEN t1.col_value = 'Y' 
               AND t2.col_value = 'Y'
               AND t3.col_value = 'Y' THEN 1 ELSE 0 END) AS cnt
    FROM stacked t1
    INNER JOIN stacked t2
    ON t1.id = t2.id
    INNER JOIN stacked t3
    ON t1.id = t3.id
    AND t1.col_name < t2.col_name
    AND t2.col_name < t3.col_name
    ) t
GROUP BY combo
ORDER BY combo
;
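Alternatively, to handle combinations of N columns, you could use recursion; the WITH RECURSIVE query in this post shows this for N = 3.

Note that N = 3 is used in two places in that query.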

I would use a lateral join:

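with vals as (
      -- keep the row id so that pairs are only formed within the same row
      -- (assumes the table has an id column, as in the queries above)
      select t.id, v.*
      from t cross join lateral
           (values ('A', A), ('B', B), ('C', C), ('D', D), ('Z', Z)
           ) v(which, val)
     )
select (v1.which || v2.which) as combo,
       -- count a row only when both columns of the pair are 'Y'
       sum( (v1.val = 'Y' and v2.val = 'Y')::int ) as count
from vals v1 join
     vals v2
     on v1.id = v2.id and v1.which < v2.which
group by combo
order by combo;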

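I think a lateral join is a more direct way to unpivot the values. There is no need to convert the values into an array, let alone two arrays, and then line the values up.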
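
To see what the lateral unpivot step produces on its own, here is a minimal sketch. The inline `t` with two rows is a made-up stand-in for the real table (its name and columns are assumptions that follow the query above), so the statement can be run without creating anything.

```sql
-- Hypothetical sample rows standing in for the real table t
with t(id, a, b, c, d, z) as (
    values (1, 'Y', 'Y', 'N', 'Y', 'N'),
           (2, 'N', 'Y', 'Y', 'Y', 'N')
)
-- One output row per (id, column): the column label and its value
select t.id, v.which, v.val
from t cross join lateral
     (values ('A', t.a), ('B', t.b), ('C', t.c), ('D', t.d), ('Z', t.z)
     ) v(which, val)
order by t.id, v.which;
```

Each input row becomes five rows, one per column, and because every (label, value) pair is written side by side there are no parallel arrays to keep aligned.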

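Thank you, Ubuntu. This is exactly what I wanted. Brilliant! Do you know how to do this for combinations of 3 columns, like YYY? Just curious. Thanks in advance!

Brilliant, the N case covers everything. Thank you very much.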

Result for N = 3:

| combo | count |
|-------+-------|
| ABC   |     0 |
| ABD   |     1 |
| ABZ   |     2 |
| ACD   |     1 |
| ACZ   |     0 |
| ADZ   |     1 |
| BCD   |     1 |
| BCZ   |     0 |
| BDZ   |     1 |
| CDZ   |     0 |

-- Recursive query for combinations of N columns (here N = 3)
WITH RECURSIVE result AS (
    WITH stacked AS (
        SELECT id
            , unnest(array['A', 'B', 'C', 'D', 'Z']) AS col_name
            , unnest(array[a, b, c, d, z]) AS col_value
        FROM test t)
    SELECT id, array[col_name] AS path, array[col_value] AS path_val, col_name AS last_name
    FROM stacked

    UNION

    SELECT r.id, path || s.col_name, path_val || s.col_value, s.col_name
    FROM result r
    INNER JOIN stacked s
    ON r.id = s.id
        AND s.col_name > r.last_name
    WHERE array_length(r.path, 1) < 3)  -- Change 3 to your value for N
SELECT combo, sum(cnt)
FROM (
    SELECT id, array_to_string(path, '') AS combo, (CASE WHEN 'Y' = all(path_val) THEN 1 ELSE 0 END) AS cnt
    FROM result
    WHERE array_length(path, 1) = 3) t  -- Change 3 to your value for N
GROUP BY combo
ORDER BY combo;
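
Since N has to be changed in two places in the recursive query, one possible variation is to keep it in a single-row CTE and read it from there. This is only a sketch, not part of the original answer: the `params` CTE and its column `n` are hypothetical names introduced here, and the table `test` with columns `id`, `a`, `b`, `c`, `d`, `z` is assumed as in the queries above.

```sql
WITH RECURSIVE params AS (
    SELECT 3 AS n  -- hypothetical: the only place where N is set
), stacked AS (
    -- unpivot each row into (id, column label, column value)
    SELECT id
        , unnest(array['A', 'B', 'C', 'D', 'Z']) AS col_name
        , unnest(array[a, b, c, d, z]) AS col_value
    FROM test
), result AS (
    -- start with every single column as a path of length 1
    SELECT id, array[col_name] AS path, array[col_value] AS path_val, col_name AS last_name
    FROM stacked

    UNION

    -- extend each path with a later column from the same row, up to length N
    SELECT r.id, r.path || s.col_name, r.path_val || s.col_value, s.col_name
    FROM result r
    INNER JOIN stacked s
        ON r.id = s.id
        AND s.col_name > r.last_name
    WHERE array_length(r.path, 1) < (SELECT n FROM params)
)
SELECT array_to_string(path, '') AS combo
    , sum(CASE WHEN 'Y' = ALL(path_val) THEN 1 ELSE 0 END) AS count
FROM result
WHERE array_length(path, 1) = (SELECT n FROM params)
GROUP BY combo
ORDER BY combo;
```

Changing the 3 in `params` to 2 should then give the pairwise counts without touching the rest of the statement.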