BigQuery: extracting text and subtext


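I have a table in BigQuery with the number of contracts sent per day:

date        contract
2014-05-04  {jeans = 5, caps = 12, CDs = 1, Microwaves = 7, other = 6}
2014-05-05  {cups = 7, other = 5}

I need to report the number of uncategorized contracts sent out. So far I have done this by downloading a CSV and working it out in Excel.

How can I get a table like this inside BigQuery?

date        other_contracts
2014-05-04   6
2014-05-05   5

Thanks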

Use REGEXP_EXTRACT and look for a sequence of digits:

SELECT
  *,
  REGEXP_EXTRACT(contract,r'other = (\d+)') AS Other
FROM (
  SELECT
    "2014-05-04" AS Date,
    "{jeans = 5, caps = 12, CDs = 1, Microwaves = 7, other = 6}" AS contract),
  (
  SELECT
    "2014-05-05" AS Date,
    "{cups = 7, other = 5}" AS contract)
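
The query above is written in legacy SQL, where a comma between subqueries acts as UNION ALL. A minimal sketch of the same idea in BigQuery standard SQL, shaped to return the date / other_contracts table from the question, might look like the following. The `contracts` CTE is only a stand-in for the real table, whose name is not given in the question, and the cast to INT64 assumes the count should be numeric rather than a string.

```sql
-- Sketch in standard SQL; the CTE stands in for the real table (name assumed).
WITH contracts AS (
  SELECT DATE '2014-05-04' AS date,
         '{jeans = 5, caps = 12, CDs = 1, Microwaves = 7, other = 6}' AS contract
  UNION ALL
  SELECT DATE '2014-05-05',
         '{cups = 7, other = 5}'
)
SELECT
  date,
  -- REGEXP_EXTRACT returns the captured digits as a STRING,
  -- or NULL when the row has no "other = N" entry.
  CAST(REGEXP_EXTRACT(contract, r'other = (\d+)') AS INT64) AS other_contracts
FROM contracts
ORDER BY date;
```

Note that the pattern would also match a key that merely ends in "other" (for example "another = 3"); if such keys can occur, a word boundary such as r'\bother = (\d+)' is one way to guard against that.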

If the format of the data in the first table can be changed to JSON, a more general approach is possible. I think this will help:

SELECT 
  Date,
  INTEGER(REGEXP_EXTRACT(Item, r'(\d+)')) AS Count,
  REGEXP_EXTRACT(Item, r'(\w+)') AS Item
FROM (
  SELECT Date, SPLIT(contract) as Item
  FROM 
    (SELECT "2014-05-04" AS Date, "{jeans = 5, caps = 12, CDs = 1, Microwaves = 7, other = 6}" AS contract),
    (SELECT "2014-05-05" AS Date, "{cups = 7, other = 5}" AS contract)
)
ORDER BY Date, Count DESC

The result is:

Date    Count   Item
5/4/2014    12  caps
5/4/2014    7   Microwaves
5/4/2014    6   other
5/4/2014    5   jeans
5/4/2014    1   CDs
5/5/2014    7   cups
5/5/2014    5   other
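
The split-then-extract query above relies on legacy SQL behavior (comma as UNION ALL, implicit flattening of SPLIT). A rough standard SQL equivalent, assuming the same sample data, could use SPLIT with UNNEST to get one row per "name = count" pair. The names `contracts`, `pair`, and `item_count` are illustrative only, and the patterns are anchored on the " = " separator so that an item name containing digits would not be mistaken for the count.

```sql
-- Sketch in standard SQL; the CTE stands in for the real table (name assumed).
WITH contracts AS (
  SELECT DATE '2014-05-04' AS date,
         '{jeans = 5, caps = 12, CDs = 1, Microwaves = 7, other = 6}' AS contract
  UNION ALL
  SELECT DATE '2014-05-05',
         '{cups = 7, other = 5}'
)
SELECT
  date,
  -- First run of word characters before the "=" is the item name.
  REGEXP_EXTRACT(pair, r'(\w+)\s*=') AS item,
  -- Digits after the "=" are the count for that item.
  CAST(REGEXP_EXTRACT(pair, r'=\s*(\d+)') AS INT64) AS item_count
FROM contracts,
  -- One row per comma-separated "name = count" pair.
  UNNEST(SPLIT(contract, ',')) AS pair
ORDER BY date, item_count DESC;
```

Adding a filter such as WHERE REGEXP_EXTRACT(pair, r'(\w+)\s*=') = 'other' would narrow this down to the uncategorized counts asked for in the question.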