Google BigQuery: how to split a field using a dynamic delimiter

I have a table in BigQuery that contains contact email addresses:

name_family@company.com

name-family@company.com

name.family@company.com
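
I need to extract the first name and the family name into separate columns. I wrote the SQL below, but I am looking for a different or better approach.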


WITH emailWithUnserscore AS
      (SELECT *,
              SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'_')[SAFE_OFFSET(0)] AS firstName,
              SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'_')[SAFE_OFFSET(1)] AS lasttName
       FROM `project.dataset.contacts`
       WHERE LENGTH(SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'_')[SAFE_OFFSET(1)]) > 0 ),
         emailWithMinus AS
      (SELECT *,
              SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'-')[SAFE_OFFSET(0)] AS firstName,
              SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'-')[SAFE_OFFSET(1)] AS lasttName
       FROM `project.dataset.contacts`
       WHERE LENGTH(SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'-')[SAFE_OFFSET(1)]) > 0 ),
         emailWithDot AS
      (SELECT *,
              SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'.')[SAFE_OFFSET(0)] AS firstName,
              SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'.')[SAFE_OFFSET(1)] AS lasttName
       FROM `project.dataset.contacts`
       WHERE LENGTH(SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'.')[SAFE_OFFSET(1)]) > 0 ),
         allEmails AS
      (SELECT *,
              SPLIT(string_field_0, '@')[SAFE_OFFSET(0)] AS firstName,
              '' AS lasttName
       FROM `project.dataset.contacts`)
    SELECT allEmails.string_field_0 AS Email,
           if(LENGTH(emailWithUnserscore.lasttName) > 0, emailWithUnserscore.firstName, if(LENGTH(emailWithMinus.lasttName) > 0, emailWithMinus.firstName, if(LENGTH(emailWithDot.lasttName) > 0, emailWithDot.firstName, allEmails.firstName))) AS firstName,
           if(LENGTH(emailWithUnserscore.lasttName) > 0, emailWithUnserscore.lasttName, if(LENGTH(emailWithMinus.lasttName) > 0, emailWithMinus.lasttName, if(LENGTH(emailWithDot.lasttName) > 0, emailWithDot.lasttName, allEmails.lasttName))) AS lastName
    FROM allEmails
    LEFT JOIN emailWithUnserscore ON allEmails.string_field_0 = emailWithUnserscore.string_field_0
    LEFT JOIN emailWithMinus ON allEmails.string_field_0 = emailWithMinus.string_field_0
    LEFT JOIN emailWithDot ON allEmails.string_field_0 = emailWithDot.string_field_0
    ORDER BY Email DESC

Result:

Row     email                       firstName   lastName     
1       name_family@company.com     name        family   
2       name-family@company.com     name        family   
3       name.family@company.com     name        family   

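It is not entirely clear, but it seems you are trying to extract names from email addresses? Just wondering what you would expect when there is no delimiter, or more delimiters than expected: bob741@email.com, first.middle.last@blah.com, bob.double-barrelled@...

Yes, I need to extract the first name and the family name into separate columns, and your point is a very good one. I am hoping for a good, stable SQL starting point that I can build on for the other use cases.

You may still need to add more information before your problem can be fully solved.

Glad it worked for you. Please consider voting up :O)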
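
One possible shorter approach, offered as a sketch rather than a definitive answer: treat `_`, `-`, and `.` as a single character class and pull both parts out with `REGEXP_EXTRACT`, so no per-delimiter CTEs or joins are needed. The inline `contacts` CTE is stand-in sample data (an assumption for illustration; in practice it would be `project.dataset.contacts`), and the column name `string_field_0` is taken from the question.

```sql
-- Sample rows standing in for `project.dataset.contacts` (assumption for illustration).
WITH contacts AS (
  SELECT 'name_family@company.com' AS string_field_0 UNION ALL
  SELECT 'name-family@company.com' UNION ALL
  SELECT 'name.family@company.com' UNION ALL
  SELECT 'bob741@email.com'  -- no separator in the local part
)
SELECT
  string_field_0 AS email,
  -- Everything before the first '_', '-', '.', or '@'.
  REGEXP_EXTRACT(string_field_0, r'^([^@._-]+)') AS firstName,
  -- Everything between the first separator and '@';
  -- falls back to an empty string when there is no separator.
  IFNULL(REGEXP_EXTRACT(string_field_0, r'^[^@._-]+[._-]([^@]*)@'), '') AS lastName
FROM contacts
ORDER BY email DESC;
```

Note that this sketch splits only on the first separator, so an address such as first.middle.last@blah.com would yield `first` and `middle.last`. That differs from the query in the question on such inputs, and whether it is the desired behaviour depends on the data.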