SQL: BigQuery conditional consolidation


I have a quick question that I have been trying to wrap my head around, without success:

Suppose I have the following table:

+-----+----------------+-----+-------+------+------+------+-------+
| Row | Promotion Name | Day | Month | Year | SENT | Open | Click |
+-----+----------------+-----+-------+------+------+------+-------+
|   1 | Email_New_V1   |   1 |     2 | 2019 |    3 |    2 |     1 |
|   2 | Email_New_V2   |   1 |     2 | 2019 |    5 |    2 |     1 |
|   3 | Email_New_V3   |   2 |     2 | 2019 |    4 |    2 |     1 |
+-----+----------------+-----+-------+------+------+------+-------+
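Basically, I want the totals of Sent, Open and Click for each day (day 1, day 2, etc.) and each month (month 1, month 2, etc.), aggregated by the first few characters of the promotion name (Email_New%).

In other words, I would like to get this: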

+----------------+-----+-------+------+------+------+-------+
| Promotion Name | Day | Month | Year | SENT | Open | Click |
+----------------+-----+-------+------+------+------+-------+
| Email_New      |   1 |     2 | 2019 |   12 |    6 |     3 |
+----------------+-----+-------+------+------+------+-------+
I tried using SUBSTR to select the first few characters, but without success. Could you give me a tip?

Thanks, everyone.

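If you just want to remove the last 3 characters and group by the resulting string:

select 
    -- keep everything except the last 3 characters
    -- (a negative start position would return the last 3 characters instead)
    substr(promotion_name, 1, length(promotion_name) - 3) promotion_name,
    day,
    month,
    year,
    sum(sent) sent,
    sum(open) open,
    sum(click) click
from mytable
group by
    substr(promotion_name, 1, length(promotion_name) - 3),
    day,
    month,
    year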
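
As a rough way to check the "drop the last 3 characters, then group" idea on the sample rows from the question, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database from Python. SQLite is only a stand-in here, not BigQuery; the alias `prefix` and the variable names are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the real table
# (assumption: the column names mirror the question's table).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    "promotion_name TEXT, day INT, month INT, year INT, "
    "sent INT, open INT, click INT)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("Email_New_V1", 1, 2, 2019, 3, 2, 1),
        ("Email_New_V2", 1, 2, 2019, 5, 2, 1),
        ("Email_New_V3", 2, 2, 2019, 4, 2, 1),
    ],
)

# Keep everything except the last 3 characters, then aggregate per day.
query = """
SELECT substr(promotion_name, 1, length(promotion_name) - 3) AS prefix,
       day, month, year,
       SUM(sent), SUM(open), SUM(click)
FROM mytable
GROUP BY prefix, day, month, year
ORDER BY year, month, day
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because day is part of the grouping key, the row for day 2 stays separate from the two rows for day 1; leaving day out of both the select list and the group by would merge them into a single monthly total.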
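If the number of characters is variable and you want to remove everything from the last underscore onward (underscore included), a regexp function may come in handy: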

select 
    regexp_replace(promotion_name, '_[^_]+$', '') promotion_name,
    day,
    month,
    year,
    sum(sent) sent,
    sum(open) open,
    sum(click) click
from mytable
group by
    regexp_replace(promotion_name, '_[^_]+$', ''),
    day,
    month,
    year

'_[^_]+$' means: an underscore, followed by at least one character other than an underscore, then the end of the string.
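
To see what that pattern does to individual names, here is a small sketch using Python's `re` module. BigQuery uses the RE2 engine rather than Python's, but this particular pattern only relies on constructs both support; the helper name `strip_last_segment` and the extra sample strings are assumptions for illustration.

```python
import re

# Same pattern as in the query: the last underscore plus everything after it.
LAST_SEGMENT = re.compile(r"_[^_]+$")

def strip_last_segment(name: str) -> str:
    """Remove the final '_suffix' part of a name, if there is one."""
    return LAST_SEGMENT.sub("", name)

samples = ["Email_New_V1", "Email_New_Version10", "Email", "Email_New_"]
results = {name: strip_last_segment(name) for name in samples}
for name, stripped in results.items():
    print(f"{name!r} -> {stripped!r}")
```

Names without an underscore, or ending in an underscore, are left unchanged because the pattern requires at least one non-underscore character after the final underscore.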

Below is an example for BigQuery Standard SQL:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'Email_New_V1' promotion_name, 1 day, 2 month, 2019 year, 3 sent, 2 open, 1 click UNION ALL
  SELECT 'Email_New_V2', 1, 2, 2019, 5, 2, 1 UNION ALL
  SELECT 'Email_New_V3', 2, 2, 2019, 4, 2, 1 UNION ALL
  SELECT 'Email_Old_V1', 1, 2, 2019, 1, 2, 3 UNION ALL
  SELECT 'Email_Old_V2', 1, 2, 2019, 4, 5, 6 
), promotions AS (
  SELECT 'Email_New' promotion_name UNION ALL
  SELECT 'Email_Old'
)
SELECT p.promotion_name, 
  day, month, year, 
  SUM(sent) sent,
  SUM(open) open,
  SUM(click) click 
FROM `project.dataset.table` t
JOIN promotions p
ON STARTS_WITH(t.promotion_name, p.promotion_name)
GROUP BY promotion_name, day, month, year      
with output:

Row promotion_name  day month   year    sent    open    click    
1   Email_New       1   2       2019    8       4       2    
2   Email_New       2   2       2019    4       2       1    
3   Email_Old       1   2       2019    5       7       9    

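How do you define "the first few words"? What other promotions are there? Why does your result combine data from two different dates?

Your query rate is low. Important: you can mark an accepted answer by using the tick to the left of the posted answer, below the voting. See why it is important! Voting on answers is important too: vote up the answers that are helpful. You can also check what to do when someone answers your question. By following these simple rules you increase your own reputation score and at the same time keep us motivated to answer your questions :O) Please consider!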