Google BigQuery: how to get match counts in a table field for a list of phrases stored in another BigQuery table?

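Given an arbitrary list of phrases phrase1, phrase2*, ..., phraseN (say these are stored in another table, Phrase_Table), how can I get the match count of each phrase in a field F of a BigQuery table?

Here, * means the phrase must be followed by some non-empty, non-blank string.

Suppose you have a table with an ID field and two string fields, Field1 and Field2.

The output would look something like:

id, count of phrase1 in Field1, count of phrase2 in Field1, count of phrase1 in Field2, count of phrase2 in Field2

Or, I guess, instead of all those output fields, a single JSON object field:

id, [{fieldName: Field1, counts: {phrase1: m, phrase2: mm, …}}, {fieldName: Field2, counts: {phrase1: m2, phrase2: mm2, …}}]

Thanks!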

The example below is for BigQuery Standard SQL.

#standardSQL
WITH `project.dataset.table` AS (
  -- sample rows standing in for the real table
  SELECT 'foo1 foo foo40' AS str UNION ALL
  SELECT 'test1 test test2 test'
), `project.dataset.keywords` AS (
  -- sample phrases standing in for the phrase table
  SELECT 'foo' AS key UNION ALL
  SELECT 'test'
)
SELECT
  str,
  -- per key, count occurrences that are immediately followed by a non-whitespace character
  ARRAY_AGG(STRUCT(key, ARRAY_LENGTH(REGEXP_EXTRACT_ALL(str, CONCAT(key, r'[^\s]'))) AS matches)) AS all_matches
FROM `project.dataset.table`
CROSS JOIN `project.dataset.keywords`
GROUP BY str

Result:

Row str                     all_matches.key all_matches.matches  
1   foo1 foo foo40          foo             2    
                            test            0    
2   test1 test test2 test   foo             0    
                            test            2
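
One caveat with the query above: each key is concatenated straight into a regular expression, so a phrase containing regex metacharacters (for example `c++` or `a.b`) would be interpreted as a pattern rather than as literal text. A minimal sketch of one way to handle this, assuming the phrases are meant literally, is to escape them with REGEXP_REPLACE before building the pattern. The sample strings and phrases here are made up for illustration.

```sql
#standardSQL
WITH `project.dataset.table` AS (
  -- hypothetical sample row, not from the original question
  SELECT 'c++11 c++ a.b1 axb1' AS str
), `project.dataset.keywords` AS (
  -- hypothetical phrases that contain regex metacharacters
  SELECT 'c++' AS key UNION ALL
  SELECT 'a.b'
)
SELECT
  str,
  ARRAY_AGG(STRUCT(
    key,
    ARRAY_LENGTH(REGEXP_EXTRACT_ALL(
      str,
      -- prefix every metacharacter in the key with a backslash,
      -- then require one non-whitespace character after the phrase
      CONCAT(REGEXP_REPLACE(key, r'([.*+?^${}()|\[\]\\])', r'\\\1'), r'[^\s]')
    )) AS matches
  )) AS all_matches
FROM `project.dataset.table`
CROSS JOIN `project.dataset.keywords`
GROUP BY str
```

With the escaping in place, `a.b` should only match the literal text `a.b` followed by a non-whitespace character, and no longer matches `axb1`.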

If you prefer the output as JSON, you can add TO_JSON_STRING, as in the example below:

#standardSQL
WITH `project.dataset.table` AS (
  -- sample rows standing in for the real table
  SELECT 'foo1 foo foo40' AS str UNION ALL
  SELECT 'test1 test test2 test'
), `project.dataset.keywords` AS (
  -- sample phrases standing in for the phrase table
  SELECT 'foo' AS key UNION ALL
  SELECT 'test'
)
SELECT
  str,
  -- same counts as before, serialized into a single JSON string per row
  TO_JSON_STRING(ARRAY_AGG(STRUCT(key, ARRAY_LENGTH(REGEXP_EXTRACT_ALL(str, CONCAT(key, r'[^\s]'))) AS matches))) AS all_matches
FROM `project.dataset.table`
CROSS JOIN `project.dataset.keywords`
GROUP BY str

with output:

Row str                     all_matches  
1   foo1 foo foo40          [{"key":"foo","matches":2},{"key":"test","matches":0}]   
2   test1 test test2 test   [{"key":"foo","matches":0},{"key":"test","matches":2}]

There are countless ways to present the output above. Hopefully you can adjust it to fit your needs :o)
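
For the table shape described in the question (an ID plus two string fields), the same pattern can be applied to each field separately. The following is only a sketch under assumptions: the column names `id`, `Field1`, `Field2` follow the question's wording, while the sample rows and the output field names `field1_matches` and `field2_matches` are invented for illustration.

```sql
#standardSQL
WITH `project.dataset.table` AS (
  -- hypothetical sample rows; replace with the real table
  SELECT 1 AS id, 'foo1 foo foo40' AS Field1, 'test1 test' AS Field2 UNION ALL
  SELECT 2, 'test1 test test2 test', 'foo9 bar'
), `project.dataset.keywords` AS (
  -- hypothetical phrase list; replace with the real phrase table
  SELECT 'foo' AS key UNION ALL
  SELECT 'test'
)
SELECT
  id,
  ARRAY_AGG(STRUCT(
    key,
    -- matches of the phrase followed by a non-whitespace character, per field
    ARRAY_LENGTH(REGEXP_EXTRACT_ALL(Field1, CONCAT(key, r'[^\s]'))) AS field1_matches,
    ARRAY_LENGTH(REGEXP_EXTRACT_ALL(Field2, CONCAT(key, r'[^\s]'))) AS field2_matches
  ) ORDER BY key) AS all_matches
FROM `project.dataset.table`
CROSS JOIN `project.dataset.keywords`
GROUP BY id
```

Grouping by `id` instead of by the string itself keeps rows with identical text apart. Wrapping the ARRAY_AGG in TO_JSON_STRING, as in the second example, would give one JSON field per id. Note that a NULL field would yield a NULL count rather than 0.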