Google BigQuery: how to query all records nested in a RECORD column whose field names carry a datetime suffix

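I have a table with a field markedDates (type: RECORD).

markedDates has the following attributes:

I don't know how the designer created this table, but the data is saved like this:

I have to sum all earned points, with some datetime filter.

"I have to sum all earned points"

Below is for BigQuery Standard SQL: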

#standardSQL
SELECT userID, SUM(markedDates.d_2018_11_30.earnedPoint) AS allEarnedPoint
FROM `project.dataset.table`
GROUP BY userID     
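"… with some datetime filter"

I don't see any datetime-related field that could be used for such filtering.

The problem is that markedDates contains other datetimes as well: d_2018_09_08, d_2019_09_09, … I also have to sum the earned points for those other dates.

Below does the trick: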

#standardSQL
SELECT userID, SUM(CAST(JSON_EXTRACT(REGEXP_EXTRACT(x, r'"d_.*?":(.*)'), '$.earnedPoint') AS FLOAT64)) allEarnedPoint
FROM `project.dataset.table`, 
UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING(markedDates), r'"d_.*?":{.*?}')) x
WHERE REGEXP_EXTRACT(x, r'"d_(.*?)"') BETWEEN '2018_12_02' AND '2018_12_05'
GROUP BY userID   
You can test and play with the above using oversimplified dummy data, which I hope represents your case:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 userID, 
    STRUCT(
      STRUCT(0 AS earnedPoint, TRUE AS earnedShare) AS d_2018_11_30,
      STRUCT(1 AS earnedPoint, TRUE AS earnedShare) AS d_2018_12_01,
      STRUCT(2 AS earnedPoint, FALSE AS earnedShare) AS d_2018_12_02,
      STRUCT(3 AS earnedPoint, TRUE AS earnedShare) AS d_2018_12_03,
      STRUCT(4 AS earnedPoint, FALSE AS earnedShare) AS d_2018_12_04,
      STRUCT(5 AS earnedPoint, TRUE AS earnedShare) AS d_2018_12_05,
      STRUCT(6 AS earnedPoint, TRUE AS earnedShare) AS d_2018_12_06
    ) markedDates
)
SELECT userID, SUM(CAST(JSON_EXTRACT(REGEXP_EXTRACT(x, r'"d_.*?":(.*)'), '$.earnedPoint') AS FLOAT64)) allEarnedPoint
FROM `project.dataset.table`, 
UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING(markedDates), r'"d_.*?":{.*?}')) x
WHERE REGEXP_EXTRACT(x, r'"d_(.*?)"') BETWEEN '2018_12_02' AND '2018_12_05'
GROUP BY userID    
Note: this should work on your data as is, but even if you need to make some adjustments, you should get a good idea from the above.
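The filter above compares the date suffix as a plain string, which works because the YYYY_MM_DD form sorts in date order. As a sketch only, assuming every field name under markedDates follows the d_YYYY_MM_DD pattern, the suffix could instead be parsed into a DATE with PARSE_DATE so that the filter uses real date literals. The CTE name `sample` and its rows are made up for illustration and are not part of the original table:

```sql
#standardSQL
-- Hypothetical sample data; replace `sample` with the real table.
WITH sample AS (
  SELECT 1 AS userID,
    STRUCT(
      STRUCT(1 AS earnedPoint, TRUE AS earnedShare) AS d_2018_12_01,
      STRUCT(2 AS earnedPoint, FALSE AS earnedShare) AS d_2018_12_02,
      STRUCT(3 AS earnedPoint, TRUE AS earnedShare) AS d_2018_12_03
    ) AS markedDates
)
SELECT
  userID,
  -- Strip the "d_...": key, then read earnedPoint from the remaining JSON object.
  SUM(CAST(JSON_EXTRACT(REGEXP_EXTRACT(x, r'"d_.*?":(.*)'), '$.earnedPoint') AS FLOAT64)) AS allEarnedPoint
FROM sample,
-- One element per "d_YYYY_MM_DD":{...} entry of the serialized record.
UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING(markedDates), r'"d_.*?":{.*?}')) AS x
-- Turn the key suffix (e.g. 2018_12_02) into a DATE before comparing.
WHERE PARSE_DATE('%Y_%m_%d', REGEXP_EXTRACT(x, r'"d_(.*?)"'))
      BETWEEN DATE '2018-12-02' AND DATE '2018-12-05'
GROUP BY userID
```

An open-ended filter works the same way, for example by replacing the BETWEEN clause with `>= DATE '2018-12-02'`. Note that PARSE_DATE raises an error on a suffix that is not a valid date, so this variant is stricter than the string comparison.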


What is the expected output?

In this table there is a field userID. The expected output is the sum of all earned points per user, with a datetime filter. (>=2019/09/10

Which field is to be used for the datetime filter?

I think we have to use markedDates.d_2018_11_30.

There is no datetime field inside it. It is a record with nothing that looks like a date or datetime other than the name of the field itself.

The problem is that markedDates contains other datetimes as well: d_2018_09_08, d_2019_09_09, …

So, update your question to present your case clearly!

I tried your query but got an error: Bad double value: {"float":1,"integer":null,"provi… . It looks like earnedPoint is null.