Google BigQuery: how to split a field using a dynamic delimiter
I have a table in BigQuery that contains contact email addresses:
name_family@company.com
name-family@company.com
name.family@company.com
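I need to extract the first name and the family name into separate columns.
I wrote the SQL below, but I am looking for another, better approach.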
-- Rows whose local part (the text before '@') splits on an underscore
WITH emailWithUnderscore AS
(SELECT *,
SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'_')[SAFE_OFFSET(0)] AS firstName,
SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'_')[SAFE_OFFSET(1)] AS lastName
FROM `project.dataset.contacts`
WHERE LENGTH(SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'_')[SAFE_OFFSET(1)]) > 0 ),
-- Rows whose local part splits on a hyphen
emailWithMinus AS
(SELECT *,
SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'-')[SAFE_OFFSET(0)] AS firstName,
SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'-')[SAFE_OFFSET(1)] AS lastName
FROM `project.dataset.contacts`
WHERE LENGTH(SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'-')[SAFE_OFFSET(1)]) > 0 ),
-- Rows whose local part splits on a dot
emailWithDot AS
(SELECT *,
SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'.')[SAFE_OFFSET(0)] AS firstName,
SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'.')[SAFE_OFFSET(1)] AS lastName
FROM `project.dataset.contacts`
WHERE LENGTH(SPLIT(SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],'.')[SAFE_OFFSET(1)]) > 0 ),
-- Fallback: every row, with the whole local part as firstName and an empty lastName
allEmails AS
(SELECT *,
SPLIT(string_field_0, '@')[SAFE_OFFSET(0)] AS firstName,
'' AS lastName
FROM `project.dataset.contacts`)
-- Take the first delimiter that produced a non-empty lastName: '_', then '-', then '.'
SELECT allEmails.string_field_0 AS Email,
IF(LENGTH(emailWithUnderscore.lastName) > 0, emailWithUnderscore.firstName, IF(LENGTH(emailWithMinus.lastName) > 0, emailWithMinus.firstName, IF(LENGTH(emailWithDot.lastName) > 0, emailWithDot.firstName, allEmails.firstName))) AS firstName,
IF(LENGTH(emailWithUnderscore.lastName) > 0, emailWithUnderscore.lastName, IF(LENGTH(emailWithMinus.lastName) > 0, emailWithMinus.lastName, IF(LENGTH(emailWithDot.lastName) > 0, emailWithDot.lastName, allEmails.lastName))) AS lastName
FROM allEmails
LEFT JOIN emailWithUnderscore ON allEmails.string_field_0 = emailWithUnderscore.string_field_0
LEFT JOIN emailWithMinus ON allEmails.string_field_0 = emailWithMinus.string_field_0
LEFT JOIN emailWithDot ON allEmails.string_field_0 = emailWithDot.string_field_0
ORDER BY Email DESC
Result
Row email firstName lastName
1 name_family@company.com name family
2 name-family@company.com name family
3 name.family@company.com name family
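Since the question asks for an alternative, here is a minimal sketch of one possible single-pass approach. It is not taken from the original thread. It assumes BigQuery Standard SQL and uses `REGEXP_EXTRACT_ALL` to break the local part on any of `_`, `-` or `.` at once. The inline `contacts` CTE and its sample rows are hypothetical stand-ins for `project.dataset.contacts`.

```sql
-- Hypothetical sample data standing in for `project.dataset.contacts`
WITH contacts AS (
  SELECT 'name_family@company.com' AS string_field_0 UNION ALL
  SELECT 'name-family@company.com' UNION ALL
  SELECT 'name.family@company.com' UNION ALL
  SELECT 'bob741@email.com'
),
parts AS (
  SELECT
    string_field_0 AS email,
    -- Take the text before '@', then collect every run of characters
    -- that is not one of the delimiters '.', '_' or '-'
    REGEXP_EXTRACT_ALL(
      SPLIT(string_field_0, '@')[SAFE_OFFSET(0)],
      r'[^._-]+'
    ) AS tokens
  FROM contacts
)
SELECT
  email,
  tokens[SAFE_OFFSET(0)] AS firstName,
  -- SAFE_OFFSET returns NULL when there is no second token;
  -- fall back to '' to match the behaviour of the query above
  IFNULL(tokens[SAFE_OFFSET(1)], '') AS lastName
FROM parts
ORDER BY email DESC
```

With these sample rows, the three delimited addresses yield name / family, and bob741@email.com yields bob741 with an empty lastName. One difference from the query above: all three delimiters are treated equally here, so an address that mixes them (for example a dot and a hyphen) is split at the first delimiter found rather than by the `_`, `-`, `.` priority order.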
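It's not clear, but it seems you are trying to extract names from email addresses? Just wondering what you would expect when there is no delimiter, or more delimiters than expected: bob741@email.com, first.middle.last@blah.com, bob.double-barrelled@...
Yes, I need to extract the first name and the family name into separate columns, and your point is a very good one. I am hoping for a good, stable SQL starting point that I can build on for other use cases.
Improve its formatting, or help people understand your question and help you get a proper answer. But you may still need to add more information to fully resolve your question.
Glad it worked for you. Please consider voting up :o)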
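The edge cases raised in the comments (no delimiter, or more delimiters than expected) could be surfaced before deciding how to handle them. The following is only a sketch under assumptions: the `splitStatus` labels and the inline sample rows are hypothetical, and the rule "exactly two tokens is a clean split" is an illustrative choice, not something stated in the thread.

```sql
-- Hypothetical sample data, including the commenter's edge cases
WITH contacts AS (
  SELECT 'name.family@company.com' AS string_field_0 UNION ALL
  SELECT 'bob741@email.com' UNION ALL
  SELECT 'first.middle.last@blah.com'
),
tokenized AS (
  SELECT
    string_field_0 AS email,
    -- Local part before '@', tokenized on '.', '_' and '-'
    REGEXP_EXTRACT_ALL(
      REGEXP_EXTRACT(string_field_0, r'^[^@]+'),
      r'[^._-]+'
    ) AS tokens
  FROM contacts
)
SELECT
  email,
  ARRAY_LENGTH(tokens) AS tokenCount,
  CASE
    WHEN ARRAY_LENGTH(tokens) = 2 THEN 'ok'
    WHEN ARRAY_LENGTH(tokens) > 2 THEN 'extra delimiters'
    -- Covers a single token as well as a missing or empty local part
    ELSE 'no delimiter'
  END AS splitStatus
FROM tokenized
ORDER BY email
```

Rows flagged as anything other than 'ok' can then be reviewed or routed to a separate rule, such as treating the last token as the family name.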