Google BigQuery: querying repeated records (with FLATTEN)
This concerns the solution given for an earlier question. I created a test table and ran the suggested query, but it does not actually select the people who have lived in both New York and Chicago. The test data is as follows:
{"fullname": "John Smith", "citiesLived": [{"place": "newyork"}, {"place": "chicago"}, {"place": "seattle"}]}
{"fullname": "Adam Smith", "citiesLived": [{"place": "newyork"}, {"place": "chicago"}, {"place": "phil"}]}
{"fullname": "Adam Jefferson", "citiesLived": [{"place": "boston"}, {"place": "chicago"}, {"place": "seattle"}]}
The query is as follows:
SELECT
  *
FROM (
  SELECT
    fullname,
    IF (citiesLived.place == 'newyork', 1, 0) AS ny,
    IF (citiesLived.place == 'chicago', 1, 0) AS chi
  FROM (FLATTEN(tester.citiesLived, citiesLived))
  OMIT
    RECORD IF citiesLived.place = 'seattle')
WHERE
  ny == 1
  AND chi == 1
You don't need FLATTEN here (in general, FLATTEN is rarely needed in BigQuery queries). Just use OMIT RECORD IF on the unflattened table:
SELECT fullname FROM tester.citiesLived
OMIT RECORD IF NOT (
SOME(citiesLived.place = "newyork") AND
SOME(citiesLived.place = "chicago"))
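The OMIT RECORD IF condition reads as follows: a record matches your criteria if some city lived in is New York and some city lived in is Chicago. If it is not the case that both hold, the record should be omitted (hence the NOT predicate).
I think this would be a more complete rewrite of the originally intended query: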
SELECT
  *
FROM (
  SELECT
    fullname,
    SOME(citiesLived.place == 'newyork') WITHIN RECORD AS ny,
    SOME(citiesLived.place == 'chicago') WITHIN RECORD AS chi
  FROM tester.citiesLived
  OMIT
    RECORD IF SOME(citiesLived.place = 'seattle'))
WHERE
  ny == true
  AND chi == true
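To see why the flattened query comes back empty while the SOME-based versions do not, the three queries can be modeled in plain Python over the same test data. This is only a rough sketch of the semantics, not BigQuery itself: it assumes that FLATTEN yields one row per citiesLived entry and that SOME behaves like "any element of the record satisfies the condition". The function names (flatten, flattened_query, some_query, full_rewrite) are hypothetical and exist only for this illustration.

```python
# Test data from the question, as Python dicts.
RECORDS = [
    {"fullname": "John Smith",
     "citiesLived": [{"place": "newyork"}, {"place": "chicago"}, {"place": "seattle"}]},
    {"fullname": "Adam Smith",
     "citiesLived": [{"place": "newyork"}, {"place": "chicago"}, {"place": "phil"}]},
    {"fullname": "Adam Jefferson",
     "citiesLived": [{"place": "boston"}, {"place": "chicago"}, {"place": "seattle"}]},
]


def flatten(records):
    """Assumed model of FLATTEN: one output row per repeated citiesLived entry."""
    return [
        {"fullname": r["fullname"], "place": c["place"]}
        for r in records
        for c in r["citiesLived"]
    ]


def flattened_query(records):
    """Model of the original query: ny and chi are computed per flattened row."""
    result = []
    for row in flatten(records):
        if row["place"] == "seattle":  # OMIT RECORD IF ... = 'seattle'
            continue
        ny = 1 if row["place"] == "newyork" else 0
        chi = 1 if row["place"] == "chicago" else 0
        # A single flattened row holds exactly one place, so both flags
        # can never be 1 at the same time.
        if ny == 1 and chi == 1:
            result.append(row["fullname"])
    return result


def some(record, place):
    """Assumed model of SOME(...): true if any citiesLived entry matches."""
    return any(c["place"] == place for c in record["citiesLived"])


def some_query(records):
    """Model of OMIT RECORD IF NOT (SOME(newyork) AND SOME(chicago))."""
    return [r["fullname"] for r in records
            if some(r, "newyork") and some(r, "chicago")]


def full_rewrite(records):
    """Model of the fuller rewrite, which also omits records containing seattle."""
    return [r["fullname"] for r in records
            if not some(r, "seattle")
            and some(r, "newyork") and some(r, "chicago")]


if __name__ == "__main__":
    print(flattened_query(RECORDS))  # []
    print(some_query(RECORDS))       # ['John Smith', 'Adam Smith']
    print(full_rewrite(RECORDS))     # ['Adam Smith']
```

Under these assumptions the flattened version can never match, because each row carries a single place, whereas the SOME-based versions evaluate the condition across all entries of one record.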