SQL: dynamic join and pivot table

Perhaps the community can advise me on this PostgreSQL 9.5 problem.


There is a large flightlog table with 1.5 million rows. Its columns are user_id, a timestamp, and an action, which is one of 4 action types. The users table has 6K rows.

flightlog

user_id  time                          action
2301     "2016-10-25 14:13:25.74668"   "View"
8        "2016-04-25 15:02:13.916204"  "Download"
8        "2016-04-25 15:01:20.553475"  "Download"
8        "2016-04-25 14:57:02.430493"  "Download"
8        "2016-04-25 14:57:02.160002"  "Download"
8        "2016-04-25 14:57:01.397602"  "Download"
26       "2016-10-25 16:01:25.005285"  "View"
216      "2016-10-24 14:46:16.035242"  "View"
2182     "2016-10-24 14:47:43.713"     "View"
243      "2016-10-24 12:10:12.187181"  "View"
26       "2016-10-24 15:01:26.269981"  "View"
26       "2016-10-24 15:01:28.122361"  "View"

users

user_id  email
8        "ndoe@mysite.com"
26       "jdoe@mysite.com"
2301     "kdoe@mysite.com"

subscriptions

user_id  expires
8        "2017-08-30 15:48:06.827258"
26       "2017-08-10 00:00:00"
2301     "2017-09-28 09:09:17.56549"

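I need a statistics table that holds, for each user, the monthly count of each of the 4 different actions. So each row would be the user followed by 4 action counts per month, repeated 12 times. The columns would look like this: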

user1  period1_action1 period1_action2 period1_action3 period1_action4 period2_action1 etc


with counters as (<doing counts using windowing functions>),
     pivot1   as (<pivoting counters using FILTER>
                  ...sum(times) filter (where action = 'action1')...),
     recent_subscription as (<picking latest subscription for a user>),
     titles   as (<using previous cte and adding more info from info table>)

select t.user, t.id, t.subscription_starts, t.expires_at, t.title, email,
       p."action1", p."action2", p."action3", p."action4"
from titles t
join pivot1 p

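The challenge now is to pivot this result once more to get the 12 period / 4 action combinations. It could be just 12 columns if each period were aggregated with json_object_agg, with the 4 actions inside each period: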

 --using the piece above as another CTE called merged
 --this code does not work :(
 select email, id, ends, subs, info,
        json_object_aggr(starts, s1,v1,p1,d1 ORDER BY starts) as P1,
        json_object_aggr(starts, s2,v2,p2,d2 ORDER BY starts) as P2,
        json_object_aggr(starts, s3,v3,p3,d3 ORDER BY starts) as P3,
        json_object_aggr(starts, s4,v4,p4,d4 ORDER BY starts) as P4,
        json_object_aggr(starts, s5,v5,p5,d5 ORDER BY starts) as P5,
        json_object_aggr(starts, s6,v6,p6,d6 ORDER BY starts) as P6,
        json_object_aggr(starts, s7,v7,p7,d7 ORDER BY starts) as P7,
        json_object_aggr(starts, s8,v8,p8,d8 ORDER BY starts) as P8,
        json_object_aggr(starts, s9,v9,p9,d9 ORDER BY starts) as P9,
        json_object_aggr(starts, s10,v10,p10,d10 ORDER BY starts) as P10,
        json_object_aggr(starts, s11,v11,p11,d11 ORDER BY starts) as P11,
        json_object_aggr(starts, s12,v12,p12,d12 ORDER BY starts) as P12
 from (select email, id, starts, ends, subs, info, starts,
              sum("action1") as s1, sum("action2") as v1,
              sum("action3") as p1, sum("action4") as d1
       from merged
       group by email, id, starts, ends, subs, info, starts
      ) m
 group by email, id, starts, ends, subs, info
 order by email, id, starts, ends, subs, info
Can this be done with json_object_agg, holding the 4 actions for each period? Could I get some help on how to pivot this?
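One way to read the json_object_agg idea is a single JSON column per user, keyed by period start, where each value is an object with the 4 action counts. The following is only a sketch under assumptions: the VALUES rows are made-up stand-ins for the merged CTE, and the short names a1 to a4 are hypothetical. json_object_agg takes exactly two arguments (key and value), so the 4 counts are wrapped with json_build_object first.

```sql
WITH merged (email, id, starts, action1, action2, action3, action4) AS (
    -- made-up sample rows standing in for the question's "merged" CTE
    VALUES ('ndoe@mysite.com', 8,  DATE '2016-04-01', 5, 0, 0, 0),
           ('jdoe@mysite.com', 26, DATE '2016-10-01', 0, 3, 0, 0),
           ('jdoe@mysite.com', 26, DATE '2016-11-01', 0, 1, 0, 0)
)
SELECT email, id,
       -- one key per period start; the value bundles the 4 counts
       json_object_agg(
           starts::text,
           json_build_object('action1', a1, 'action2', a2,
                             'action3', a3, 'action4', a4)
           ORDER BY starts
       ) AS periods
FROM (
    SELECT email, id, starts,
           sum(action1) AS a1, sum(action2) AS a2,
           sum(action3) AS a3, sum(action4) AS a4
    FROM merged
    GROUP BY email, id, starts
) m
GROUP BY email, id   -- starts is left out here so periods collapse into one row per user
ORDER BY email, id;
```

This returns one row per user instead of 12 fixed columns, so users with fewer periods simply get fewer keys. Whether that shape suits the consumer of the report is a separate design decision.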



WITH subs AS (
      -- most recent subscription date per user
      SELECT s.user_id, u.email, MAX(s.sub_date) AS recent_sub_date
      FROM subscriptions s
      JOIN users u ON s.user_id = u.user_id
      GROUP BY s.user_id, u.email
)
SELECT s.user_id,
       -- period 1: first month after the most recent subscription date
       SUM(CASE WHEN f.action = 'action1' AND f.time <= s.recent_sub_date + INTERVAL '1 month' THEN 1 ELSE 0 END) AS period1_action1,
       SUM(CASE WHEN f.action = 'action2' AND f.time <= s.recent_sub_date + INTERVAL '1 month' THEN 1 ELSE 0 END) AS period1_action2,
       SUM(CASE WHEN f.action = 'action3' AND f.time <= s.recent_sub_date + INTERVAL '1 month' THEN 1 ELSE 0 END) AS period1_action3,
       SUM(CASE WHEN f.action = 'action4' AND f.time <= s.recent_sub_date + INTERVAL '1 month' THEN 1 ELSE 0 END) AS period1_action4,
       -- period 2: second month only; the lower bound keeps period 1 rows out
       SUM(CASE WHEN f.action = 'action1' AND f.time > s.recent_sub_date + INTERVAL '1 month' AND f.time <= s.recent_sub_date + INTERVAL '2 months' THEN 1 ELSE 0 END) AS period2_action1,
       SUM(CASE WHEN f.action = 'action2' AND f.time > s.recent_sub_date + INTERVAL '1 month' AND f.time <= s.recent_sub_date + INTERVAL '2 months' THEN 1 ELSE 0 END) AS period2_action2,
       SUM(CASE WHEN f.action = 'action3' AND f.time > s.recent_sub_date + INTERVAL '1 month' AND f.time <= s.recent_sub_date + INTERVAL '2 months' THEN 1 ELSE 0 END) AS period2_action3,
       SUM(CASE WHEN f.action = 'action4' AND f.time > s.recent_sub_date + INTERVAL '1 month' AND f.time <= s.recent_sub_date + INTERVAL '2 months' THEN 1 ELSE 0 END) AS period2_action4
       -- periods 3 through 12 follow the same pattern
FROM flightlog f
JOIN subs s ON s.user_id = f.user_id
WHERE f.time > s.recent_sub_date
AND f.time <= DATE_TRUNC('month', s.recent_sub_date + INTERVAL '13 months') -- end of the 12 months after sub
GROUP BY s.user_id;
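Writing out 48 nearly identical expressions by hand is error-prone, so the select list can be generated instead. The sketch below only builds the text of the list; it assumes the outer query exposes a hypothetical integer column period_no (1 to 12) and that the action values are literally 'action1' to 'action4'. The generated text would then be pasted into the query or run through dynamic SQL such as EXECUTE in PL/pgSQL.

```sql
-- Build "count(*) FILTER (...) AS periodN_actionM" for all 12 x 4 combinations.
SELECT string_agg(
           format('count(*) FILTER (WHERE period_no = %s AND action = %L) AS %I',
                  p, a, 'period' || p || '_' || a),
           E',\n' ORDER BY p, a
       ) AS select_list
FROM generate_series(1, 12) AS p
CROSS JOIN (VALUES ('action1'), ('action2'), ('action3'), ('action4')) AS acts (a);
```

Here %L quotes the action name as a literal and %I quotes the alias as an identifier, which keeps the generated SQL safe if the real action names contain unusual characters.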



user1 ... 1st_period_4user1 action1 action2 action3 action4
user1 ... 2nd_period_4user1 action1 action2 action3 action4
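I concatenated the last 5 columns into a single period + 4 actions string. Then I ranked this table so that each user's rows are ranked from 1 to 12. Then, using the concatenated column as the value and the rank as the column, I pivoted again, doing this 12 times: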

array_agg(stf) FILTER (where rnk = 1) AS period_1
array_agg(stf) FILTER (where rnk = 2) AS period_2
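to get one row per user id with 12 data columns. The drawback of this approach is that if a month is skipped, the following month is still listed as the next period.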
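The skipped-month drawback comes from ranking, which numbers only the periods that exist. A possible alternative is to derive the period number from the calendar distance to the subscription date, so a gap stays a gap. This is a sketch with made-up rows; events, sub_date, ts and period_no are hypothetical names, and the period boundaries here are inclusive at the start of each month, which may differ slightly from the ranking approach.

```sql
WITH events (user_id, sub_date, ts, action) AS (
    -- made-up rows: user 8 is active in months 1 and 3 after subscribing, never in month 2
    VALUES (8, TIMESTAMP '2016-01-10', TIMESTAMP '2016-01-20', 'View'),
           (8, TIMESTAMP '2016-01-10', TIMESTAMP '2016-03-15', 'Download')
),
numbered AS (
    SELECT user_id, action,
           -- whole months elapsed since the subscription date, plus 1
           (extract(year  FROM age(ts, sub_date)) * 12
          + extract(month FROM age(ts, sub_date)))::int + 1 AS period_no
    FROM events
)
SELECT user_id,
       count(*) FILTER (WHERE period_no = 1) AS period_1,
       count(*) FILTER (WHERE period_no = 2) AS period_2,
       count(*) FILTER (WHERE period_no = 3) AS period_3
FROM numbered
GROUP BY user_id;
```

With these rows the March event lands in period_3 and period_2 stays at 0, rather than March being promoted to the second period.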


This took 67 lines of code and 4 CTEs... I am still hoping for a more polished solution.

