Syntax to run a distinct GROUP_CONCAT in Google BigQuery

I have a query:

SELECT campaign.id AS campaign_id,
       GROUP_CONCAT(utm.campaign) AS utm_campaign
FROM [email_event]
WHERE (TIMESTAMP BETWEEN SEC_TO_TIMESTAMP(1412136000) AND SEC_TO_TIMESTAMP(1414814340))
GROUP BY campaign_id;
I would like to run a distinct GROUP_CONCAT, because right now the same entries are repeated in the output.

Update

I have extended your solution to:

SELECT campaign.id AS campaign_id,
       GROUP_CONCAT(utm.campaign) AS utm_campaign,
       GROUP_CONCAT(utm.content) AS utm_content
FROM
  (SELECT *
   FROM
     (SELECT 507 AS campaign.id,
             'remarketingemail' AS utm.campaign,
             'newsletter_feb' AS utm.content),
     (SELECT 508 AS campaign.id,
             'remarketingemail' AS utm.campaign,
             'newsletter_jan' AS utm.content),
     (SELECT 508 AS campaign.id,
             'remarketingemail' AS utm.campaign,
             'newsletter_feb' AS utm.content),
     (SELECT 508 AS campaign.id,
             'adwordscamp' AS utm.campaign,
             'cyber_monday' AS utm.content) )
GROUP BY campaign_id;
But now I get duplicate values for utm_campaign:

+-----+------------------------------------------+--------------------------------------+
| 507 | remarketingemail                         | newsletter_feb                       |
| 508 | remarketingemail,remarketingemail,adw... | newsletter_jan,newsletter_feb,cyb... |
+-----+------------------------------------------+--------------------------------------+
This is the raw output of the subquery before the GROUP BY:

+-----+-----------------------------------+-------------------------------+
| 507 | remarketingemail                  | newsletter_feb                |
| 508 | remarketingemail                  | newsletter_jan                |
| 508 | remarketingemail                  | newsletter_feb                |
| 508 | adwordscamp                       | cyber_monday                  |
+-----+-----------------------------------+-------------------------------+

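Use a subquery that groups by both fields to get the distinct values first, then concatenate in the outer query. Something like this: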

SELECT campaign.id AS campaign_id,
       GROUP_CONCAT(utm.campaign) AS utm_campaign
FROM
    (SELECT campaign.id, utm.campaign
     FROM [email_event]
     WHERE (TIMESTAMP BETWEEN SEC_TO_TIMESTAMP(1412136000) AND SEC_TO_TIMESTAMP(1414814340))
     GROUP EACH BY campaign.id, utm.campaign)
GROUP BY campaign_id;
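
For the two-column case described in the update, deduplicating in a single subquery is not enough, because a row that is unique as a (utm.campaign, utm.content) pair can still repeat one of the two values. One alternative that may be worth trying in legacy SQL is wrapping each field in UNIQUE() inside GROUP_CONCAT, so that each column is deduplicated independently. This is only a sketch against the [email_event] table from the question; it has not been run against the real data, and the order of the concatenated values is undefined:

```sql
#legacySQL
-- Sketch only: assumes legacy SQL and the [email_event] schema from the question.
-- UNIQUE() returns the distinct non-NULL values of its argument within each group,
-- so each GROUP_CONCAT only sees one copy of every value.
SELECT campaign.id AS campaign_id,
       GROUP_CONCAT(UNIQUE(utm.campaign)) AS utm_campaign,
       GROUP_CONCAT(UNIQUE(utm.content)) AS utm_content
FROM [email_event]
WHERE (TIMESTAMP BETWEEN SEC_TO_TIMESTAMP(1412136000) AND SEC_TO_TIMESTAMP(1414814340))
GROUP BY campaign_id;
```

UNIQUE() keeps all distinct values of a group in memory, so a group with very many distinct values may hit resource limits.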
Another option, when there are only a few aggregated fields, is to do it in stages:

SELECT campaign_id ,
       GROUP_CONCAT(utm_campaign) as utm_campaign,
       utm_content
       From
(SELECT campaign.id AS campaign_id,
       utm.campaign as utm_campaign,
       GROUP_CONCAT(utm.content) AS utm_content
FROM
    (
SELECT *
FROM
  ( SELECT 507 AS campaign.id,
           'remarketingemail' AS utm.campaign,
           'newsletter_feb' AS utm.content),
  ( SELECT 507 AS campaign.id,
           'remarketingemail2' AS utm.campaign,
           'newsletter_feb' AS utm.content),
  (SELECT 508 AS campaign.id,
          'remarketingemail' AS utm.campaign,
          'newsletter_jan' AS utm.content),
  (SELECT 508 AS campaign.id,
          'remarketingemail' AS utm.campaign,
          'newsletter_feb' AS utm.content)
      )
    GROUP BY utm_campaign,campaign_id)
    GROUP BY utm_content,campaign_id
    ;
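
If switching to BigQuery standard SQL is an option, STRING_AGG accepts DISTINCT, which deduplicates each column independently in a single pass. Below is a minimal sketch that uses the four sample rows from the question as inline data. The flattened column names (campaign_id, utm_campaign, utm_content) and the CTE name sample_events are assumptions made for illustration, not part of the original schema:

```sql
#standardSQL
-- Sketch only: sample_events is a hypothetical inline stand-in for [email_event],
-- with the nested fields flattened into plain columns.
WITH sample_events AS (
  SELECT 507 AS campaign_id, 'remarketingemail' AS utm_campaign, 'newsletter_feb' AS utm_content
  UNION ALL SELECT 508, 'remarketingemail', 'newsletter_jan'
  UNION ALL SELECT 508, 'remarketingemail', 'newsletter_feb'
  UNION ALL SELECT 508, 'adwordscamp', 'cyber_monday'
)
SELECT campaign_id,
       -- DISTINCT removes repeated values before they are concatenated.
       STRING_AGG(DISTINCT utm_campaign, ',') AS utm_campaign,
       STRING_AGG(DISTINCT utm_content, ',') AS utm_content
FROM sample_events
GROUP BY campaign_id;
```

For campaign 508 this gives two distinct utm_campaign values and three distinct utm_content values. The order of values inside each string is not guaranteed.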

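Please see my update to the question; the distinct approach does not work when there are two GROUP_CONCATs.

Can you also share the result of the subquery?

@N.N. The result of the subquery is the 3 rows you can deduce from my table above. The problem is that 'remarketingemail' is duplicated in the GROUP_CONCAT, and I don't know how to get unique values.

Can you share the source data? It is hard to understand the cardinality of each field from your question.

@N.N. Updated the question.

If utm.campaign has a 1:1 or m:1 relationship to campaign_id, you can use a different aggregate function, such as MAX().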