Google BigQuery: syntax for running a distinct GROUP_CONCAT
I have a query:
SELECT campaign.id AS campaign_id,
GROUP_CONCAT(utm.campaign) AS utm_campaign
FROM [email_event]
WHERE (TIMESTAMP BETWEEN SEC_TO_TIMESTAMP(1412136000) AND SEC_TO_TIMESTAMP(1414814340))
GROUP BY campaign_id;
I would like to run a distinct GROUP_CONCAT, because right now the same entries are repeated in the output.
Update
I have extended your solution to:
SELECT campaign.id AS campaign_id,
GROUP_CONCAT(utm.campaign) AS utm_campaign,
GROUP_CONCAT(utm.content) AS utm_content
FROM
(SELECT *
FROM
(SELECT 507 AS campaign.id,
'remarketingemail' AS utm.campaign,
'newsletter_feb' AS utm.content),
(SELECT 508 AS campaign.id,
'remarketingemail' AS utm.campaign,
'newsletter_jan' AS utm.content),
(SELECT 508 AS campaign.id,
'remarketingemail' AS utm.campaign,
'newsletter_feb' AS utm.content),
(SELECT 508 AS campaign.id,
'adwordscamp' AS utm.campaign,
'cyber_monday' AS utm.content) )
GROUP BY campaign_id;
But now I get duplicate values in utm_campaign:
+-----+------------------------------------------+--------------------------------------+
| 507 | remarketingemail | newsletter_feb |
| 508 | remarketingemail,remarketingemail,adw... | newsletter_jan,newsletter_feb,cyb... |
+-----+------------------------------------------+--------------------------------------+
This is the raw output of the subquery, before the GROUP BY:
+-----+-----------------------------------+-------------------------------+
| 507 | remarketingemail | newsletter_feb |
| 508 | remarketingemail | newsletter_jan |
| 508 | remarketingemail | newsletter_feb |
| 508 | adwordscamp | cyber_monday |
+-----+-----------------------------------+-------------------------------+
Use a subquery that groups by both fields to get the distinct values first, then concatenate. Something like this:
SELECT campaign.id AS campaign_id,
GROUP_CONCAT(utm.campaign) AS utm_campaign
FROM
(Select campaign.id,utm.campaign
FROM [email_event]
WHERE (TIMESTAMP BETWEEN SEC_TO_TIMESTAMP(1412136000) AND SEC_TO_TIMESTAMP(1414814340))
GROUP EACH BY campaign.id,utm.campaign)
GROUP BY campaign_id;
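The same de-duplicate-then-concatenate shape can be tried locally. The following is only a sketch: it uses SQLite through Python's sqlite3 module as a stand-in, not BigQuery, and it assumes flat columns named campaign_id and utm_campaign because SQLite has no nested campaign.id / utm.campaign fields. The sample rows are made up for illustration.

```python
import sqlite3

# In-memory stand-in for [email_event]; column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email_event (campaign_id INTEGER, utm_campaign TEXT)")
conn.executemany(
    "INSERT INTO email_event VALUES (?, ?)",
    [
        (507, "remarketingemail"),
        (508, "remarketingemail"),
        (508, "remarketingemail"),  # duplicate that should collapse
        (508, "adwordscamp"),
    ],
)

rows = conn.execute(
    """
    SELECT campaign_id, GROUP_CONCAT(utm_campaign) AS utm_campaign
    FROM (
        -- Inner GROUP BY leaves one row per (campaign_id, utm_campaign) pair.
        SELECT campaign_id, utm_campaign
        FROM email_event
        GROUP BY campaign_id, utm_campaign
    )
    GROUP BY campaign_id
    """
).fetchall()

# The order inside the concatenated string is not guaranteed, so compare as sets.
result = {cid: set(joined.split(",")) for cid, joined in rows}
# Number of entries per campaign; equals the set size when nothing is repeated.
counts = {cid: len(joined.split(",")) for cid, joined in rows}
print(result)
```

The inner GROUP BY does the de-duplication, so the outer GROUP_CONCAT only ever sees each value once per campaign.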
When only a few fields are being aggregated, another option is to do it in stages:
SELECT campaign_id ,
GROUP_CONCAT(utm_campaign) as utm_campaign,
utm_content
From
(SELECT campaign.id AS campaign_id,
utm.campaign as utm_campaign,
GROUP_CONCAT(utm.content) AS utm_content
FROM
(
SELECT *
FROM
( SELECT 507 AS campaign.id,
'remarketingemail' AS utm.campaign,
'newsletter_feb' AS utm.content),
( SELECT 507 AS campaign.id,
'remarketingemail2' AS utm.campaign,
'newsletter_feb' AS utm.content),
(SELECT 508 AS campaign.id,
'remarketingemail' AS utm.campaign,
'newsletter_jan' AS utm.content),
(SELECT 508 AS campaign.id,
'remarketingemail' AS utm.campaign,
'newsletter_feb' AS utm.content)
)
GROUP BY utm_campaign,campaign_id)
GROUP BY utm_content,campaign_id
;
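A variation on the staged idea, offered as a sketch rather than as the answer's exact method, is to de-duplicate each concatenated column in its own subquery and then join the two results on the campaign id. As before, this runs on SQLite through Python's sqlite3 module as a stand-in for BigQuery, with flat column names (campaign_id, utm_campaign, utm_content) assumed in place of the nested fields. The rows are the four sample rows from the question's update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE email_event (campaign_id INTEGER, utm_campaign TEXT, utm_content TEXT)"
)
conn.executemany(
    "INSERT INTO email_event VALUES (?, ?, ?)",
    [
        (507, "remarketingemail", "newsletter_feb"),
        (508, "remarketingemail", "newsletter_jan"),
        (508, "remarketingemail", "newsletter_feb"),
        (508, "adwordscamp", "cyber_monday"),
    ],
)

rows = conn.execute(
    """
    SELECT c.campaign_id, c.utm_campaign, t.utm_content
    FROM (
        -- Stage 1: distinct utm_campaign values per campaign, then concatenate.
        SELECT campaign_id, GROUP_CONCAT(utm_campaign) AS utm_campaign
        FROM (SELECT campaign_id, utm_campaign
              FROM email_event
              GROUP BY campaign_id, utm_campaign)
        GROUP BY campaign_id
    ) AS c
    JOIN (
        -- Stage 2: distinct utm_content values per campaign, then concatenate.
        SELECT campaign_id, GROUP_CONCAT(utm_content) AS utm_content
        FROM (SELECT campaign_id, utm_content
              FROM email_event
              GROUP BY campaign_id, utm_content)
        GROUP BY campaign_id
    ) AS t ON c.campaign_id = t.campaign_id
    """
).fetchall()

# Split the concatenated strings so duplicates would show up as extra entries.
campaigns = {cid: camp.split(",") for cid, camp, _ in rows}
contents = {cid: cont.split(",") for cid, _, cont in rows}
print(campaigns, contents)
```

Each column is de-duplicated independently, so a value repeated in one column (here 'remarketingemail') cannot be multiplied by the distinct values of the other column.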
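Please see my update to the question: the distinct approach does not work when there are two GROUP_CONCATs.
Can you also share the result of the subquery?
@N.N. The result of the subquery is the 3 rows you can deduce from my table above. The problem is that 'remarketingemail' is duplicated in the GROUP_CONCAT, and I don't know how to make it unique.
Can you share the source data? It is hard to understand the cardinality of each field from your question.
@N.N. Updated the question.
If utm.campaign has a 1:1 or m:1 relationship to campaign_id, you can use a different aggregate function, such as MAX().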
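To illustrate the MAX() suggestion, here is a minimal sketch under one explicit assumption: every campaign_id carries exactly one utm_campaign value, possibly repeated across many rows. It again uses SQLite through Python's sqlite3 module as a stand-in for BigQuery, with assumed flat column names and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email_event (campaign_id INTEGER, utm_campaign TEXT)")
conn.executemany(
    "INSERT INTO email_event VALUES (?, ?)",
    [
        # Assumption: each campaign_id maps to a single utm_campaign value.
        (507, "remarketingemail"),
        (507, "remarketingemail"),
        (508, "adwordscamp"),
    ],
)

# MAX() over identical strings returns that string, so no concatenation
# (and therefore no duplication) is involved.
rows = conn.execute(
    "SELECT campaign_id, MAX(utm_campaign) FROM email_event GROUP BY campaign_id"
).fetchall()

single_value = dict(rows)
print(single_value)
```

If a campaign can have several different utm_campaign values, MAX() would silently keep only one of them, so this shortcut only fits the single-value case.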