SQL: conditional consolidation in BigQuery
I have a quick question that I have been trying to wrap my head around, without success. Suppose I have the following table:
+-----+----------------+-----+-------+------+------+------+-------+
| Row | Promotion Name | Day | Month | Year | SENT | Open | Click |
+-----+----------------+-----+-------+------+------+------+-------+
|   1 | Email_New_V1   |   1 |     2 | 2019 |    3 |    2 |     1 |
|   2 | Email_New_V2   |   1 |     2 | 2019 |    5 |    2 |     1 |
|   3 | Email_New_V3   |   2 |     2 | 2019 |    4 |    2 |     1 |
+-----+----------------+-----+-------+------+------+------+-------+
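Basically, I want the totals of SENT, Open, and Click for each day (day 1, day 2, etc.) and each month (month 1, month 2, etc.), aggregated by the first few characters of the promotion name (Email_New%).
In other words, I would end up with something like this: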
+----------------+-----+-------+------+------+------+-------+
| Promotion Name | Day | Month | Year | SENT | Open | Click |
+----------------+-----+-------+------+------+------+-------+
| Email_New      |   1 |     2 | 2019 |   12 |    6 |     3 |
+----------------+-----+-------+------+------+------+-------+
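I tried using SUBSTR to select the first few characters, but without success. Could anyone give me a tip?
Thanks, everyone.

If you only want to remove the last 3 characters and group by the resulting string: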
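select
    -- keep everything except the last 3 characters (e.g. '_V1');
    -- a negative start position would return the trailing characters instead
    substr(promotion_name, 1, length(promotion_name) - 3) promotion_name,
    day,
    month,
    year,
    sum(sent) sent,
    sum(open) open,
    sum(click) click
from mytable
group by
    substr(promotion_name, 1, length(promotion_name) - 3),
    day,
    month,
    year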
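Before grouping on a trimming expression, it can help to look at what it returns for a few names. The following is a minimal sketch in BigQuery Standard SQL; the `names` CTE and its two rows are hypothetical sample data, and the column aliases are chosen only for illustration.

```sql
#standardSQL
-- Hypothetical sample names, used only to inspect the expressions.
WITH names AS (
  SELECT 'Email_New_V1' AS promotion_name UNION ALL
  SELECT 'Email_Old_V2'
)
SELECT
  promotion_name,
  -- negative position counts from the end: the trailing 3 characters
  SUBSTR(promotion_name, -3) AS last_three,
  -- start at position 1 and stop 3 characters before the end
  SUBSTR(promotion_name, 1, LENGTH(promotion_name) - 3) AS trimmed
FROM names
```

For 'Email_New_V1', `last_three` is '_V1' and `trimmed` is 'Email_New'. Note that this approach assumes every name ends in a suffix of exactly three characters; names shorter than that would need separate handling.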
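If the number of characters varies and you want to remove everything from the last underscore (inclusive) onward, a regexp function can come in handy: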
select
    regexp_replace(promotion_name, '_[^_]+$', '') promotion_name,
    day,
    month,
    year,
    sum(sent) sent,
    sum(open) open,
    sum(click) click
from mytable
group by
    regexp_replace(promotion_name, '_[^_]+$', ''),
    day,
    month,
    year
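To see how the pattern '_[^_]+$' behaves on names of different shapes, here is a small sketch in BigQuery Standard SQL. The `names` CTE, its sample values, and the `promotion_group` alias are assumptions made up for this illustration.

```sql
#standardSQL
-- Hypothetical sample names covering a few different shapes.
WITH names AS (
  SELECT 'Email_New_V1' AS promotion_name UNION ALL
  SELECT 'Email_New_V10' UNION ALL
  SELECT 'Email_Old_Final' UNION ALL
  SELECT 'Newsletter'
)
SELECT
  promotion_name,
  -- strip the last underscore and everything after it
  REGEXP_REPLACE(promotion_name, r'_[^_]+$', '') AS promotion_group
FROM names
```

The first two names both map to 'Email_New' regardless of suffix length, 'Email_Old_Final' maps to 'Email_Old', and 'Newsletter' is returned unchanged because it contains no underscore and the pattern does not match.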
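The pattern '_[^_]+$' means: one underscore, followed by at least one character other than an underscore, then the end of the string.

Below is an example for BigQuery Standard SQL: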
#standardSQL
WITH `project.dataset.table` AS (
SELECT 'Email_New_V1' promotion_name, 1 day, 2 month, 2019 year, 3 sent, 2 open, 1 click UNION ALL
SELECT 'Email_New_V2', 1, 2, 2019, 5, 2, 1 UNION ALL
SELECT 'Email_New_V3', 2, 2, 2019, 4, 2, 1 UNION ALL
SELECT 'Email_Old_V1', 1, 2, 2019, 1, 2, 3 UNION ALL
SELECT 'Email_Old_V2', 1, 2, 2019, 4, 5, 6
), promotions AS (
SELECT 'Email_New' promotion_name UNION ALL
SELECT 'Email_Old'
)
SELECT p.promotion_name,
day, month, year,
SUM(sent) sent,
SUM(open) open,
SUM(click) click
FROM `project.dataset.table` t
JOIN promotions p
ON STARTS_WITH(t.promotion_name, p.promotion_name)
GROUP BY promotion_name, day, month, year
with output:
Row  promotion_name  day  month  year  sent  open  click
1    Email_New       1    2      2019  8     4     2
2    Email_New       2    2      2019  4     2     1
3    Email_Old       1    2      2019  5     7     9
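The question also asks for monthly totals. One way to get them is to leave `day` out of both the select list and the GROUP BY. The sketch below reuses the sample rows from the example above and, as an assumption, derives the group name with REGEXP_REPLACE rather than a lookup table; the `promotion_group` alias is made up for illustration.

```sql
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'Email_New_V1' promotion_name, 1 day, 2 month, 2019 year, 3 sent, 2 open, 1 click UNION ALL
  SELECT 'Email_New_V2', 1, 2, 2019, 5, 2, 1 UNION ALL
  SELECT 'Email_New_V3', 2, 2, 2019, 4, 2, 1 UNION ALL
  SELECT 'Email_Old_V1', 1, 2, 2019, 1, 2, 3 UNION ALL
  SELECT 'Email_Old_V2', 1, 2, 2019, 4, 5, 6
)
SELECT
  -- group name: everything before the last underscore
  REGEXP_REPLACE(promotion_name, r'_[^_]+$', '') AS promotion_group,
  month,
  year,
  SUM(sent) AS sent,
  SUM(open) AS open,
  SUM(click) AS click
FROM `project.dataset.table`
-- no day column here, so all days of a month collapse into one row
GROUP BY promotion_group, month, year
```

With these rows, Email_New for month 2 of 2019 sums to 12 sent, 6 opens, and 3 clicks, which matches the totals in the question's expected table, and Email_Old sums to 5, 7, and 9.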
How do you define "the first few words"? What other promotions are there?
Why does your query combine data from two different dates?