BigQuery Standard SQL: using PARTITION BY with a column used in GROUP BY

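I'm trying to partition by a boolean column that I also use for grouping. The column is the result of applying a function, not an organic column.

With legacy SQL this is possible by using the column name in the PARTITION BY clause. In standard SQL, the column name cannot be used there, and re-writing the column definition instead also produces an error.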

#standardSQL
SELECT 
 corpus = 'sonnets' sonnetsCorp,
 count(distinct word) cnt,
 count(distinct word)/sum(count(distinct word)) over (partition by corpus = 'sonnets') ratio
FROM `bigquery-public-data.samples.shakespeare` 
 group by 1

I get an error:

Unrecognized name: sonnetsCorp at [5:68]

You'll need to use a subquery with standard SQL. Legacy SQL supports some non-standard features that can break down in certain cases.

#standardSQL
SELECT
 sonnetsCorp,
 count(distinct word) cnt,
 count(distinct word)/sum(count(distinct word)) over (partition by sonnetsCorp) ratio
FROM (
  SELECT
   *,
   corpus = 'sonnets' AS sonnetsCorp
  FROM `bigquery-public-data.samples.shakespeare`
)
GROUP BY sonnetsCorp;

"The column is the result of applying a function, not an organic column"

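Below is a simple way to express this kind of derived column in BigQuery Standard SQL, so that queries written in legacy SQL can stay almost as they are, without introducing an extra subquery: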

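#standardSQL
SELECT
 sonnetsCorp,
 COUNT(DISTINCT word) cnt,
 COUNT(DISTINCT word)/SUM(COUNT(DISTINCT word)) OVER (PARTITION BY sonnetsCorp) ratio
FROM `bigquery-public-data.samples.shakespeare`,
UNNEST([corpus = 'sonnets']) AS sonnetsCorp
GROUP BY sonnetsCorp

In the query above, UNNEST([corpus = 'sonnets']) AS sonnetsCorp looks like a CROSS JOIN, but in reality it is just a derived column calculated on a per-row basis.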
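To see why the UNNEST trick does not multiply rows, here is a minimal sketch that runs without the public dataset. The `sample` CTE and its three rows are hypothetical stand-ins for the shakespeare table, not data from the original question. Because the array passed to UNNEST always holds exactly one element, each input row is joined to exactly one value, so the result has the same number of rows as the input.

```sql
#standardSQL
-- Hypothetical inline rows standing in for the shakespeare table.
WITH sample AS (
  SELECT 'sonnets' AS corpus, 'love' AS word UNION ALL
  SELECT 'sonnets', 'time' UNION ALL
  SELECT 'hamlet', 'love'
)
SELECT
  corpus,
  word,
  sonnetsCorp  -- the boolean computed for this row
FROM sample,
  -- One-element array built from the current row: exactly one output row per input row.
  UNNEST([corpus = 'sonnets']) AS sonnetsCorp
```

Since `sonnetsCorp` is introduced in the FROM clause, it is an ordinary column name by the time SELECT, GROUP BY, and PARTITION BY are evaluated, which is why it can be referenced in all three places.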

"With legacy SQL this is possible by using the column name in the PARTITION BY clause"

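Meanwhile, I have the feeling that the query in your question is the one you really use in legacy SQL. It is possible that you over-simplified it just to show the issue with the derived column; in that case, ignore what follows. But if this query is exactly what you use, it does not make much sense, and the query below does the same thing: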

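#standardSQL
SELECT
 corpus = 'sonnets' AS sonnetsCorp,
 COUNT(DISTINCT word) cnt
FROM `bigquery-public-data.samples.shakespeare`
GROUP BY sonnetsCorp

The way the ratio field is constructed makes it always equal to 1.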
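The ratio is always 1 because the window is partitioned by the same column the query groups by: after GROUP BY, each partition contains a single row, so the SUM over the partition equals that row's own count. If the goal was instead each group's share relative to all groups, one possible sketch is to leave the window unpartitioned. This is an assumption about the intent, not something the question states.

```sql
#standardSQL
SELECT
  corpus = 'sonnets' AS sonnetsCorp,
  COUNT(DISTINCT word) AS cnt,
  -- Empty OVER () spans every grouped row, so the denominator is the sum of
  -- both groups' distinct-word counts rather than the current group's count.
  COUNT(DISTINCT word) / SUM(COUNT(DISTINCT word)) OVER () AS ratio
FROM `bigquery-public-data.samples.shakespeare`
GROUP BY sonnetsCorp
```

Note that the denominator here adds up the per-group distinct counts, so a word that appears both in the sonnets and elsewhere is counted once in each group.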