Google BigQuery: distinct keys with ARRAY_CONCAT with STRUCT<STRING, STRING>


I need to perform the following on two array fields in the table below. The array elements are of type STRUCT<STRING, STRING>.

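  • Merge the two arrays together
  • If there are duplicate keys between labels.key and project.key, keep only the key-value pair from the labels field
  • Flatten the combined array into a delimited string and sort the entries (so that I can group by the resulting string)

Sample table data: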

    SELECT 1 as id, ARRAY
      [STRUCT("testlabel2" as key, "thisvalueisbetter" as value), STRUCT("testlabel3", "testvalue3")] as labels, 
      [STRUCT("testlabel2" as key, "testvalue2" as value)] as project
    
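The query below does everything except the second step (keeping only the labels entry when a key is duplicated), and I'm not sure how to accomplish that. Does anyone have a suggestion for how to do this?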

    SELECT
      id,
      (SELECT STRING_AGG(DISTINCT CONCAT(l.key, ':', l.value) ORDER BY CONCAT(l.key, ':', l.value))
        FROM UNNEST(
        ARRAY_CONCAT(labels, project)) AS l) AS label,
    FROM `mytestdata` AS t
    GROUP BY id, label
    
Currently, this query gives the following output:

    1 testlabel2:testvalue2,testlabel2:thisvalueisbetter,testlabel3:testvalue3
    
But I'm looking for:

    1 testlabel2:thisvalueisbetter,testlabel3:testvalue3
    

Below is for BigQuery Standard SQL:

    #standardSQL
    SELECT *, 
      ARRAY(
        SELECT AS STRUCT key, ARRAY_AGG(value ORDER BY source LIMIT 1)[OFFSET(0)] AS value
        FROM ( 
          SELECT 0 AS source, * FROM t.labels UNION ALL
          SELECT 1, * FROM t.project 
        ) 
        GROUP BY key
      ) AS combined_array
    FROM `project.dataset.table` t  
    
You can test and play with the above using the sample data from your question, as in the example below:

    #standardSQL
    WITH `project.dataset.table` AS (
    SELECT ARRAY
      [STRUCT("testlabel2" AS key, "thisvalueisbetter" AS value), STRUCT("testlabel3", "testvalue3")] AS labels, 
      [STRUCT("testlabel2" AS key, "testvalue2" AS value)] AS project
    )
    SELECT *, 
      ARRAY(
        SELECT AS STRUCT key, ARRAY_AGG(value ORDER BY source LIMIT 1)[OFFSET(0)] AS value
        FROM ( 
          SELECT 0 AS source, * FROM t.labels UNION ALL
          SELECT 1, * FROM t.project 
        ) 
        GROUP BY key
      ) AS combined_array
    FROM `project.dataset.table` t  
    
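Result: the combined_array column holds two entries, testlabel2 with thisvalueisbetter (the labels value wins) and testlabel3 with testvalue3.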
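To see why the labels value wins, it may help to look at the rows the inner subquery works on before the GROUP BY. The sketch below is not part of the original answer: it rebuilds the same tagged rows with explicit UNNEST calls, using the sample data from the question and the same placeholder table name `project.dataset.table`.

```sql
#standardSQL
WITH `project.dataset.table` AS (
  SELECT ARRAY
    [STRUCT("testlabel2" AS key, "thisvalueisbetter" AS value), STRUCT("testlabel3", "testvalue3")] AS labels,
    [STRUCT("testlabel2" AS key, "testvalue2" AS value)] AS project
)
-- Tag every labels entry with source 0 ...
SELECT 0 AS source, l.key, l.value
FROM `project.dataset.table` t, UNNEST(t.labels) AS l
UNION ALL
-- ... and every project entry with source 1
SELECT 1, p.key, p.value
FROM `project.dataset.table` t, UNNEST(t.project) AS p
-- Within each key, the row with the lowest source comes first
ORDER BY key, source
```

For the sample row, testlabel2 appears twice, first with source 0 (thisvalueisbetter) and then with source 1 (testvalue2), so `ARRAY_AGG(value ORDER BY source LIMIT 1)` picks the labels value for that key.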

Or, to match your expected output exactly, use the query below:

    #standardSQL
    SELECT *, 
      (SELECT STRING_AGG(x) FROM (
        SELECT CONCAT(key, ':', ARRAY_AGG(value ORDER BY source LIMIT 1)[OFFSET(0)]) x
        FROM ( 
          SELECT 0 AS source, * FROM t.labels UNION ALL
          SELECT 1, * FROM t.project 
        ) 
        GROUP BY key
      )) AS combined_result
    FROM `project.dataset.table` t   
    
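Result: a single combined_result string containing testlabel2:thisvalueisbetter and testlabel3:testvalue3, with no testlabel2:testvalue2 entry.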
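The query above does not ask STRING_AGG for any particular order, so the order of the pairs inside the string is not guaranteed. Since the question also wants the pairs sorted so that rows can be grouped on the string, one possible variant adds `ORDER BY x` inside STRING_AGG. This is a sketch rather than part of the original answer, and it assumes the same sample data plus the `id` column from the question:

```sql
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 AS id, ARRAY
    [STRUCT("testlabel2" AS key, "thisvalueisbetter" AS value), STRUCT("testlabel3", "testvalue3")] AS labels,
    [STRUCT("testlabel2" AS key, "testvalue2" AS value)] AS project
)
SELECT
  id,
  -- ORDER BY x makes the order of the key:value pairs deterministic
  (SELECT STRING_AGG(x ORDER BY x) FROM (
    -- source 0 = labels, source 1 = project; the lowest source wins per key
    SELECT CONCAT(key, ':', ARRAY_AGG(value ORDER BY source LIMIT 1)[OFFSET(0)]) AS x
    FROM (
      SELECT 0 AS source, * FROM t.labels UNION ALL
      SELECT 1, * FROM t.project
    )
    GROUP BY key
  )) AS combined_result
FROM `project.dataset.table` t
```

With the sample row this should return `testlabel2:thisvalueisbetter,testlabel3:testvalue3` for id 1, and rows with the same set of pairs produce identical strings that can be grouped on.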


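Is your struct an array? Can you provide a WITH example to reproduce the issue?

Sorry, I realize now that my example data was not very clear. I have updated the post with a query that creates a data sample. There are two arrays (labels and project), both of type Struct. I will work on putting it together to produce the sample data with a query... Sorry, my knowledge of BigQuery is limited, so it may take some time.

Awesome. Thank you so much! I think I need a bit of time to understand how it works, but this is exactly what I was looking for :)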