Google BigQuery: equating an entry to an aggregated version of itself


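I am trying to find out whether an entry's value is the maximum of the values in its group. The check is meant to sit inside a larger if expression.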

I would like it to look something like this:

SELECT
    t.id as t_id, 
    sum(if(t.value = max(t.value), 1, 0)) AS is_max_value

FROM dataset.table AS t
GROUP BY t_id

The response is:

Error: Expression 't.value' is not present in the GROUP BY list

How should my code be written to achieve this?

You first need to compute the max value in a subquery, then join that value back to the table.
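
Applied to the table from the question, the subquery-plus-join idea could be sketched as follows. This is only a sketch: it assumes BigQuery standard SQL rather than the legacy SQL used elsewhere in this thread, it reuses the question's placeholder names (dataset.table, id, value), and the aliases m and max_value are made up for illustration.

```sql
#standardSQL
-- Sketch only: `dataset.table`, id and value are the question's placeholders;
-- the aliases m and max_value are illustrative.
SELECT
  t.id,
  t.value,
  -- 1 when this row holds its group's maximum, otherwise 0
  IF(t.value = m.max_value, 1, 0) AS is_max_value
FROM
  `dataset.table` AS t
JOIN (
  -- One row per id, carrying that group's maximum value
  SELECT
    id,
    MAX(value) AS max_value
  FROM
    `dataset.table`
  GROUP BY
    id ) AS m
ON
  m.id = t.id
```

Because the aggregate is computed in its own grouped subquery, the outer query never mixes t.value with MAX(t.value) at the same level, which is what triggered the GROUP BY error in the question.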

Here is an example using the publicly available Shakespeare sample dataset:

SELECT
  t.word,
  t.word_count,
  t.corpus_date
FROM
  [publicdata:samples.shakespeare] t
JOIN (
  SELECT
    corpus_date,
    MAX(word_count) word_count,
  FROM
    [publicdata:samples.shakespeare]
  GROUP BY
    1 ) d
ON
  d.corpus_date=t.corpus_date
  AND t.word_count=d.word_count
LIMIT
  25

Result:

+-----+--------+--------------+---------------+
| Row | t_word | t_word_count | t_corpus_date |
+-----+--------+--------------+---------------+
|   1 | the    |          762 |          1597 |
|   2 | the    |          894 |          1598 |
|   3 | the    |          841 |          1590 |
|   4 | the    |          680 |          1606 |
|   5 | the    |          942 |          1607 |
|   6 | the    |          779 |          1609 |
|   7 | the    |          995 |          1600 |
|   8 | the    |          937 |          1599 |
|   9 | the    |          738 |          1612 |
|  10 | the    |          612 |          1595 |
|  11 | the    |          848 |          1592 |
|  12 | the    |          753 |          1594 |
|  13 | the    |          740 |          1596 |
|  14 | I      |          828 |          1603 |
|  15 | the    |          525 |          1608 |
|  16 | the    |          363 |             0 |
|  17 | I      |          629 |          1593 |
|  18 | I      |          447 |          1611 |
|  19 | the    |          715 |          1602 |
|  20 | the    |          717 |          1610 |
+-----+--------+--------------+---------------+
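
You can see that, within each partition defined by corpus_date, the word with the highest word_count is kept.

Use a window function to "spread" the maximum value over all the related records. This avoids the join.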

SELECT
  *
FROM (
  SELECT
    corpus,
    corpus_date,
    word,
    word_count,
    MAX(word_count) OVER (PARTITION BY corpus) AS Max_Word_Count
  FROM
    [publicdata:samples.shakespeare] )
WHERE
  word_count=Max_Word_Count
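
Note that the join example groups by corpus_date, while the window query above partitions by corpus. As a sketch, assuming BigQuery standard SQL and the `bigquery-public-data.samples.shakespeare` copy of the same public table, partitioning by corpus_date instead should keep the same rows as the join version. The aliases w and max_word_count are illustrative.

```sql
#standardSQL
-- Sketch only: window-function counterpart of the corpus_date join example.
SELECT
  word,
  word_count,
  corpus_date
FROM (
  SELECT
    word,
    word_count,
    corpus_date,
    -- Every row receives the maximum word_count of its corpus_date partition
    MAX(word_count) OVER (PARTITION BY corpus_date) AS max_word_count
  FROM
    `bigquery-public-data.samples.shakespeare` ) AS w
WHERE
  word_count = max_word_count
```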
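
Explanation of the query below:

Inner select: for each row, computes the maximum value across all rows with the same id.
Outer select: for each row, compares the row's value with the maximum of its group, then converts true or false to 1 or 0 respectively (matching what the question expects).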

select 
  id, 
  value, 
  integer(value = max_value) as is_max_value
from (
  select id, value, max(value) over(partition by id) as max_value
  from dataset.table
)
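
The question wrapped the comparison in sum(...) per id. If that per-id total is what is needed, the same window pattern can be aggregated one level up. The following is a sketch assuming BigQuery standard SQL; the aliases w and rows_at_max are hypothetical, and dataset.table, id and value are the question's placeholders.

```sql
#standardSQL
-- Sketch only: counts, per id, how many rows hold that id's maximum value.
SELECT
  id,
  COUNTIF(value = max_value) AS rows_at_max
FROM (
  SELECT
    id,
    value,
    -- Attach the group's maximum to every row before aggregating
    MAX(value) OVER (PARTITION BY id) AS max_value
  FROM
    `dataset.table` ) AS w
GROUP BY
  id
```

The result is at least 1 for every id that has a non-null value, and greater than 1 when several rows tie for the maximum.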