Google BigQuery: how to query all nested records in a RECORD column whose field names carry a datetime suffix
I have a table with a field markedDates (type: RECORD).
markedDates has the following properties:
I don't know how the designer created this table, but the data is saved like this:
I have to sum all earnedPoint values with some datetime filter.
"I have to sum all earnedPoint"
Below is for BigQuery Standard SQL:
#standardSQL
SELECT userID, SUM(markedDates.d_2018_11_30.earnedPoint) AS allEarnedPoint
FROM `project.dataset.table`
GROUP BY userID
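"…with some datetime filter"
I don't see any datetime-related field that could be used for such a filter.
"The problem is that markedDates contains other datetimes: d_2018_09_08, d_2019_09_09, … I also have to sum earnedPoint for those other dates"
Below is the trick: serialize the record to JSON, split it into one string per d_* field, and filter on the date embedded in the field name.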
#standardSQL
SELECT userID, SUM(CAST(JSON_EXTRACT(REGEXP_EXTRACT(x, r'"d_.*?":(.*)'), '$.earnedPoint') AS FLOAT64)) allEarnedPoint
FROM `project.dataset.table`,
UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING(markedDates), r'"d_.*?":{.*?}')) x
WHERE REGEXP_EXTRACT(x, r'"d_(.*?)"') BETWEEN '2018_12_02' AND '2018_12_05'
GROUP BY userID
You can test and play with the above using oversimplified dummy data, which I hope is representative of your case:
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 userID,
    STRUCT(
      STRUCT(0 AS earnedPoint, TRUE AS earnedShare) AS d_2018_11_30,
      STRUCT(1 AS earnedPoint, TRUE AS earnedShare) AS d_2018_12_01,
      STRUCT(2 AS earnedPoint, FALSE AS earnedShare) AS d_2018_12_02,
      STRUCT(3 AS earnedPoint, TRUE AS earnedShare) AS d_2018_12_03,
      STRUCT(4 AS earnedPoint, FALSE AS earnedShare) AS d_2018_12_04,
      STRUCT(5 AS earnedPoint, TRUE AS earnedShare) AS d_2018_12_05,
      STRUCT(6 AS earnedPoint, TRUE AS earnedShare) AS d_2018_12_06
    ) markedDates
)
SELECT userID, SUM(CAST(JSON_EXTRACT(REGEXP_EXTRACT(x, r'"d_.*?":(.*)'), '$.earnedPoint') AS FLOAT64)) allEarnedPoint
FROM `project.dataset.table`,
UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING(markedDates), r'"d_.*?":{.*?}')) x
WHERE REGEXP_EXTRACT(x, r'"d_(.*?)"') BETWEEN '2018_12_02' AND '2018_12_05'
GROUP BY userID
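Note: this should work as-is on your data, but even if you need to make some adjustments, the above should give you a good idea of the approach.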
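The filter in the query above compares the date suffix as a string, which works only because the zero-padded YYYY_MM_DD form sorts in date order. As a hedged variant, the suffix can be parsed into a real DATE with PARSE_DATE so that ordinary date comparisons apply. The sketch below assumes every nested field is named d_YYYY_MM_DD and that earnedPoint is a plain number; the shortened dummy data and the cutoff date are illustrative only.

```sql
#standardSQL
WITH `project.dataset.table` AS (
  -- Illustrative dummy data: same shape as the example above, fewer days
  SELECT 1 userID,
    STRUCT(
      STRUCT(2 AS earnedPoint, FALSE AS earnedShare) AS d_2018_12_02,
      STRUCT(3 AS earnedPoint, TRUE AS earnedShare) AS d_2018_12_03,
      STRUCT(6 AS earnedPoint, TRUE AS earnedShare) AS d_2018_12_06
    ) markedDates
)
SELECT userID,
  SUM(CAST(JSON_EXTRACT(REGEXP_EXTRACT(x, r'"d_.*?":(.*)'), '$.earnedPoint') AS FLOAT64)) AS allEarnedPoint
FROM `project.dataset.table`,
-- One string per d_* field, for example "d_2018_12_02":{"earnedPoint":2,"earnedShare":false}
UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING(markedDates), r'"d_.*?":{.*?}')) x
-- Turn the field-name suffix (2018_12_02) into a DATE before comparing
WHERE PARSE_DATE('%Y_%m_%d', REGEXP_EXTRACT(x, r'"d_(.*?)"')) >= DATE '2018-12-03'
GROUP BY userID
```

With this dummy data only d_2018_12_03 and d_2018_12_06 pass the filter. PARSE_DATE raises an error if a suffix does not match the format, so a table with differently named fields would need the pattern adjusted.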
Comments:
What is the expected output?
In this table there is a field userID. The expected output is the sum of all of a user's earnedPoint values with a datetime filter. (>=2019/09/10, …
Which field is to be used for the datetime filter?
I think we have to use markedDates.d_2018_11_30.
There is no datetime field inside it. It is a record with nothing that looks like a date or datetime, other than the name of the field itself.
The problem is that markedDates contains other datetimes: d_2018_09_08, d_2019_09_09, …
So, update your question to present your case clearly!
I tried your query, but got an error: Bad double value: {"float":1,"integer":null,"provi… It looks like earnedPoint is null.