Expanding rows into columns in Presto


Is there an efficient way to expand rows into columns in Presto?

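I tried filtering the original dataset with "where team=1" and "where team=2" to get dataset 1 and dataset 2 respectively, and then joining the two datasets on income level. However, this is inconvenient when income level has too many distinct values. Is there an efficient way to get the result I want?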
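For reference, the join-based attempt described above might look roughly like the sketch below. The table name `raw_data` and the columns `time` and `ord` are assumptions here (the question itself only names `team` and income level), so treat this as an illustration of the pattern rather than the asker's actual query.

```sql
-- Sketch of the filter-then-join approach; raw_data, time and ord are assumed names.
SELECT
  t1.income_level,
  t1.time AS time_1,
  t1.ord  AS ord_1,
  t2.time AS time_2,
  t2.ord  AS ord_2
FROM (
  -- "dataset 1": rows for team 1 only
  SELECT income_level, time, ord FROM raw_data WHERE team = 1
) t1
JOIN (
  -- "dataset 2": rows for team 2 only
  SELECT income_level, time, ord FROM raw_data WHERE team = 2
) t2
  ON t1.income_level = t2.income_level;
```

Each additional team needs another filtered subquery and another join, which is why this pattern becomes tedious as the number of teams grows.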

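PrestoDB provides a map_agg function that can help you reshape long data into the wide format you want. Unfortunately, there doesn't seem to be a way to create column names dynamically, but this approach should be more efficient, and require less typing, than joining once per team.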

WITH raw_data AS (
  SELECT 1 AS team, 'a' AS income_level, 1 AS time, 11 AS ord
  UNION
  SELECT 1 AS team, 'b' AS income_level, 2 AS time, 12 AS ord
  UNION
  SELECT 1 AS team, 'c' AS income_level, 3 AS time, 13 AS ord
  UNION
  SELECT 2 AS team, 'a' AS income_level, 4 AS time, 14 AS ord
  UNION
  SELECT 2 AS team, 'b' AS income_level, 5 AS time, 15 AS ord
  UNION
  SELECT 2 AS team, 'c' AS income_level, 6 AS time, 16 AS ord
  UNION
  SELECT 3 AS team, 'a' AS income_level, 7 AS time, 17 AS ord
  UNION
  SELECT 3 AS team, 'b' AS income_level, 8 AS time, 18 AS ord
  UNION
  SELECT 3 AS team, 'c' AS income_level, 9 AS time, 19 AS ord
)

SELECT
  income_level,
  team_time[1] AS time_1,
  team_ord[1] AS ord_1,
  team_time[2] AS time_2,
  team_ord[2] AS ord_2,
  team_time[3] AS time_3,
  team_ord[3] AS ord_3
FROM (
  SELECT
    income_level,
    map_agg(team, time) AS team_time,
    map_agg(team, ord) AS team_ord
  FROM raw_data
  GROUP BY income_level
);
Output:

| income_level | time_1 | ord_1 | time_2 | ord_2 | time_3 | ord_3 |
|--------------|--------|-------|--------|-------|--------|-------|
| a            | 1      | 11    | 4      | 14    | 7      | 17    |
| b            | 2      | 12    | 5      | 15    | 8      | 18    |
| c            | 3      | 13    | 6      | 16    | 9      | 19    |
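One caveat with the query above: the subscript operator (`team_time[1]`) fails with an error in Presto when the key is not present in the map. If some income level might have no row for a given team, `element_at(map, key)` returns NULL instead. The following is a minimal sketch with made-up sample rows (income level 'b' deliberately has no row for team 2), not part of the original answer.

```sql
WITH raw_data AS (
  SELECT * FROM (VALUES
    (1, 'a', 1, 11),
    (2, 'a', 4, 14),
    (1, 'b', 2, 12)   -- hypothetical gap: no team 2 row for income level 'b'
  ) AS t (team, income_level, time, ord)
)

SELECT
  income_level,
  element_at(team_time, 1) AS time_1,
  element_at(team_ord, 1)  AS ord_1,
  element_at(team_time, 2) AS time_2,   -- NULL when team 2 is missing
  element_at(team_ord, 2)  AS ord_2     -- NULL when team 2 is missing
FROM (
  SELECT
    income_level,
    map_agg(team, time) AS team_time,
    map_agg(team, ord) AS team_ord
  FROM raw_data
  GROUP BY income_level
);
```

With this variant the row for 'b' should come back with NULL in `time_2` and `ord_2` rather than causing the whole query to fail.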

Another example of how to do this is also available.

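Thanks. In fact, the original dataset I posted above is a simplified version. In the full dataset, every income level appears for all teams, but there are 8 teams with the values [0,1,2,10,11,12,21,22]. The solution I came up with is to create a first table where team=0, then join a second table where team=1 on income level, and repeat these steps until team=22. This seems rather inefficient.

郭应征, thanks for the clarification. I updated my solution to use the map_agg function, which does a better job of pivoting when there are more than two teams.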
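Applied to the eight team values mentioned in the comment, the map_agg pattern might look like the sketch below. It assumes the full table is also called `raw_data` with the same `team`, `income_level`, `time` and `ord` columns as the sample data, and that `team` is an integer. Since the comment says every income level appears for all teams, the plain subscript form is used.

```sql
-- Sketch for teams [0,1,2,10,11,12,21,22]; table and column names are assumed.
SELECT
  income_level,
  team_time[0]  AS time_0,  team_ord[0]  AS ord_0,
  team_time[1]  AS time_1,  team_ord[1]  AS ord_1,
  team_time[2]  AS time_2,  team_ord[2]  AS ord_2,
  team_time[10] AS time_10, team_ord[10] AS ord_10,
  team_time[11] AS time_11, team_ord[11] AS ord_11,
  team_time[12] AS time_12, team_ord[12] AS ord_12,
  team_time[21] AS time_21, team_ord[21] AS ord_21,
  team_time[22] AS time_22, team_ord[22] AS ord_22
FROM (
  -- One pass over the data: build a team -> value map per income level.
  SELECT
    income_level,
    map_agg(team, time) AS team_time,
    map_agg(team, ord) AS team_ord
  FROM raw_data
  GROUP BY income_level
);
```

The column list still has to be written out by hand, but the table is aggregated once instead of being filtered and joined eight times.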