Google BigQuery: extract the last item from JSON in a cell


I have a column named submission_date that contains JSON cells like this:

{"submitted":["January 24, 2019","January 25, 2019","January 30, 2019","February 27, 2019"],"submission_canceled":["January 24, 2019","January 25, 2019"],"returned":"February 19, 2019"}

or like this:

{"submitted":["February 27, 2019","March 5, 2019"],"submission_canceled":"March 5, 2019"}
I can easily get the first result from the submission_canceled field by doing the following:

json_extract(submission_date, "$.submission_canceled[0]")
I figured that if I wanted the last value, I would do:

json_extract(submission_date, "$.submission_canceled[-1]")

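But this just gives me a null value. As you can see, sometimes the submission_canceled field holds multiple dates in a list, while other times it holds a single date that is not in a list. I want to get either the single item or the last item in the list from the submission_canceled part.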

The example below is for BigQuery Standard SQL

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 id, '{"submitted":["January 24, 2019","January 25, 2019","January 30, 2019","February 27, 2019"],"submission_canceled":["January 24,  2019","January 25, 2019"],"returned":"February 19, 2019"}' submission_date UNION ALL
  SELECT 2, '{"submitted":["February 27, 2019","March 5, 2019"],"submission_canceled":"March 5, 2019"}'
)
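SELECT id,
  -- Take the raw JSON value, split it on '","', keep the last piece,
  -- then strip the remaining quotes and square brackets.
  REGEXP_REPLACE(
    ARRAY_REVERSE(SPLIT(JSON_EXTRACT(submission_date, '$.submission_canceled'), '","'))[OFFSET(0)],
    r'"|\[|\]', ''
  ) AS last_submission_canceled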
FROM `project.dataset.table`
Result

Row  id  last_submission_canceled
1    1   January 25, 2019
2    2   March 5, 2019
Update: below is a "lighter" version

The result is obviously the same

Row  id  last_submission_canceled
1    1   January 25, 2019
2    2   March 5, 2019
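
As a hedged alternative sketch (not necessarily the "lighter" query referred to above), the quoted values can be pulled out directly with REGEXP_EXTRACT_ALL instead of splitting and then stripping punctuation. It assumes the dates never contain a double quote. SAFE_OFFSET is an addition here so that an empty list or a missing field yields NULL rather than an error.

```sql
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 id, '{"submitted":["January 24, 2019","January 25, 2019","January 30, 2019","February 27, 2019"],"submission_canceled":["January 24,  2019","January 25, 2019"],"returned":"February 19, 2019"}' submission_date UNION ALL
  SELECT 2, '{"submitted":["February 27, 2019","March 5, 2019"],"submission_canceled":"March 5, 2019"}'
)
SELECT id,
  -- JSON_EXTRACT returns either ["a","b"] or "a" as a JSON-formatted string;
  -- the regex captures every double-quoted value in it, in order.
  ARRAY_REVERSE(
    REGEXP_EXTRACT_ALL(JSON_EXTRACT(submission_date, '$.submission_canceled'), r'"([^"]*)"')
  )[SAFE_OFFSET(0)] AS last_submission_canceled
FROM `project.dataset.table`
```

On the two sample rows this should give the same values as the query above, since both the list form and the single-string form are handled by the same pattern.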


Sure. Unfortunately, JSON parsing in BigQuery is limited, so another option is to use a JS UDF to emulate regular jpath (JSONPath) functionality. I have many answers about this here, so there are examples.
@ndevito1 - thank you for "forcing" me to revisit my answer. See the update; hopefully it is simpler now :o)
@ndevito1 - did you get a chance to try it?
Yes, we were able to implement this easily! Thank you very much!
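
The JS UDF option mentioned in the comments could look roughly like the sketch below. The function name last_item is hypothetical and this is not taken from the linked answers; it simply parses the value returned by JSON_EXTRACT and returns the last element when it is a list, or the value itself when it is a single string.

```sql
#standardSQL
-- Hypothetical helper: returns the last element of a JSON array,
-- or the value itself when the JSON is a scalar.
CREATE TEMP FUNCTION last_item(json STRING)
RETURNS STRING
LANGUAGE js AS """
  if (json === null) return null;          // field missing in this row
  var value = JSON.parse(json);
  if (Array.isArray(value)) {
    return value.length ? String(value[value.length - 1]) : null;
  }
  return value === null ? null : String(value);
""";
WITH `project.dataset.table` AS (
  SELECT 1 id, '{"submitted":["January 24, 2019","January 25, 2019","January 30, 2019","February 27, 2019"],"submission_canceled":["January 24,  2019","January 25, 2019"],"returned":"February 19, 2019"}' submission_date UNION ALL
  SELECT 2, '{"submitted":["February 27, 2019","March 5, 2019"],"submission_canceled":"March 5, 2019"}'
)
SELECT id,
  last_item(JSON_EXTRACT(submission_date, '$.submission_canceled')) AS last_submission_canceled
FROM `project.dataset.table`
```

Compared with the string-splitting approach, this relies on a real JSON parser, so it does not depend on how the values are quoted or spaced, at the cost of running a JavaScript UDF for each row.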