Generalizing a Top N query in BigQuery (SQL)
This is a follow-up question that generalizes the earlier case. Now let's take the following data:
year  genre    studio     title        revenue
2014  fantasy  fox        avatar       10
2015  fantasy  fox        avatar       12
2016  fantasy  fox        avatar       12
2015  action   sony       spider-man   10
2015  romance  paramount  love letter  15
2015  action   sony       spider-man   10
2015  action   sony       spider-man   10
2015  action   disney     toy story    10
2015  action   sony       edgar        4
2015  action   sony       thomas       1
2015  fantasy  fox        avatar       2
I would like to get the following result in order to build a tree structure:
Past 2 years, Top 2 genres (Alphabetically), Top 2 studios (by Count), Top 2 titles by SUM Revenue DESC
So we would get a result like this:
Conceptually, the query I want to write looks like this:
SELECT year, genre, studio, title, SUM(revenue)
FROM titles
GROUP BY year, genre, studio, title
-- in pseudocode
ORDER BY
(year DESC) LIMIT 2,
(genre ASC) LIMIT 10,
(COUNT(studio) DESC) LIMIT 2,
(SUM(revenue) DESC) LIMIT 2
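What is the best way to do the above? It is really a generalization of building a tree structure in BQ.
I could not find "avatar2" in your data set, although it appears in the result, so I cannot verify the answer against edge cases. This is the SQL Server query I came up with. I hope not too many changes are needed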
WITH A AS (
  -- Attach to every row the row count of its studio and the revenue total of its title
  SELECT
    year,
    genre,
    studio,
    COUNT(*) OVER (PARTITION BY year, genre, studio) AS studio_movie_count,
    title,
    revenue,
    SUM(revenue) OVER (PARTITION BY year, genre, studio, title) AS revenue_sum
  FROM titles  -- table name taken from the question
),
B AS (
  -- Rank each level of the tree within its parent level
  SELECT
    year,
    DENSE_RANK() OVER (ORDER BY year DESC) AS year_num,
    genre,
    DENSE_RANK() OVER (PARTITION BY year ORDER BY genre ASC) AS genre_num,
    studio,
    DENSE_RANK() OVER (PARTITION BY year, genre ORDER BY studio_movie_count DESC) AS studio_num,
    title,
    DENSE_RANK() OVER (PARTITION BY year, genre, studio ORDER BY revenue_sum DESC) AS title_num,
    revenue
  FROM A
)
-- Keep the top 2 at every level
SELECT year, genre, studio, title, revenue
FROM B
WHERE year_num < 3
  AND genre_num < 3
  AND studio_num < 3
  AND title_num < 3;
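Filter the rows for the top 2 years in a subquery, while also computing the movie count per studio and the revenue sum per title. Then rank by genre, studio and revenue, and filter for the top 2 at each level.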
select year, genre, studio, title, revenue
from (
  select year, genre, studio, title, revenue,
    dense_rank() over (partition by year order by genre) as genre_rank,
    dense_rank() over (partition by year, genre order by count_by_studio desc) as studio_rank,
    dense_rank() over (partition by year, genre, studio order by revenue_by_title desc) as title_rank
  from (
    select year,
      genre,
      studio,
      title,
      revenue,
      dense_rank() over (order by year desc) as year_rank,
      count(*) over (partition by year, genre, studio) as count_by_studio,
      sum(revenue) over (partition by year, genre, studio, title) as revenue_by_title
    from titles
  ) where year_rank <= 2
) where genre_rank <= 2
  and studio_rank <= 2
  and title_rank <= 2;
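To check this kind of query against the sample data without a BigQuery project, here is a minimal sketch using Python's built-in sqlite3 module as a local stand-in (an assumption: the bundled SQLite is 3.25 or newer, which is needed for window functions; BigQuery itself is not involved). It differs from the queries above in one deliberate way: it first aggregates to one row per title, as the GROUP BY in the question suggests, so repeated rows such as spider-man collapse into a single row with their summed revenue. The names ROWS, QUERY and top_n_tree are made up for this example.

```python
import sqlite3

# Sample rows copied from the question: (year, genre, studio, title, revenue).
ROWS = [
    (2014, "fantasy", "fox", "avatar", 10),
    (2015, "fantasy", "fox", "avatar", 12),
    (2016, "fantasy", "fox", "avatar", 12),
    (2015, "action", "sony", "spider-man", 10),
    (2015, "romance", "paramount", "love letter", 15),
    (2015, "action", "sony", "spider-man", 10),
    (2015, "action", "sony", "spider-man", 10),
    (2015, "action", "disney", "toy story", 10),
    (2015, "action", "sony", "edgar", 4),
    (2015, "action", "sony", "thomas", 1),
    (2015, "fantasy", "fox", "avatar", 2),
]

QUERY = """
WITH per_title AS (
  -- One row per title, with its revenue total and its number of source rows
  SELECT year, genre, studio, title,
         SUM(revenue) AS revenue_sum,
         COUNT(*) AS row_count
  FROM titles
  GROUP BY year, genre, studio, title
),
with_counts AS (
  -- Source rows per studio, matching COUNT(*) OVER (...) in the answers
  SELECT *,
         SUM(row_count) OVER (PARTITION BY year, genre, studio) AS studio_count
  FROM per_title
),
ranked AS (
  -- Rank each level of the tree within its parent level
  SELECT year, genre, studio, title, revenue_sum, studio_count,
         DENSE_RANK() OVER (ORDER BY year DESC) AS year_rank,
         DENSE_RANK() OVER (PARTITION BY year ORDER BY genre) AS genre_rank,
         DENSE_RANK() OVER (PARTITION BY year, genre
                            ORDER BY studio_count DESC) AS studio_rank,
         DENSE_RANK() OVER (PARTITION BY year, genre, studio
                            ORDER BY revenue_sum DESC) AS title_rank
  FROM with_counts
)
SELECT year, genre, studio, title, revenue_sum
FROM ranked
WHERE year_rank <= 2 AND genre_rank <= 2
  AND studio_rank <= 2 AND title_rank <= 2
ORDER BY year DESC, genre, studio_count DESC, studio, revenue_sum DESC
"""


def top_n_tree(rows):
    """Load rows into an in-memory SQLite table named titles and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE titles "
        "(year INTEGER, genre TEXT, studio TEXT, title TEXT, revenue INTEGER)"
    )
    conn.executemany("INSERT INTO titles VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in top_n_tree(ROWS):
        print(row)
```

On the sample data this keeps five titles: 2014 is cut by the year limit, romance by the genre limit, and thomas by the title limit. Because DENSE_RANK is used, ties at any level can let more than two entries through, just as in the queries above.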
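Your first question already has quite a few answers! Have you tried generalizing it yourself? You should show some effort, otherwise it looks like you are just outsourcing your work tasks.
@MikhailBerlyant I have added a bounty here. I have been using your array_agg approach to do this, but I think a different approach may generalize better to various types/levels.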
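On handling an arbitrary number of levels: one possible direction (a sketch, not something taken from the answers in this thread) is to generate the nested DENSE_RANK query from a list of level specifications. The function name build_top_n_query and the LEVELS format are hypothetical. Each level gives a column, an aggregate evaluated over all rows sharing the path down to that level, a sort direction and a limit. The values are interpolated straight into the SQL text, so they must be trusted identifiers and expressions. The generated SQL uses only window functions and CTEs; it has not been run on BigQuery here.

```python
def build_top_n_query(table, levels):
    """Build a nested top-N query from level specs, outermost level first.

    Each spec is (column, aggregate, direction, limit). The aggregate is
    computed over the rows sharing the path down to that level and is used
    to rank siblings under the same parent.
    """
    metrics, ranks, filters, path = [], [], [], []
    for i, (column, aggregate, direction, limit) in enumerate(levels):
        parent = ", ".join(path)  # columns of the enclosing levels
        path.append(column)
        full_path = ", ".join(path)
        metrics.append(f"{aggregate} OVER (PARTITION BY {full_path}) AS m{i}")
        partition = f"PARTITION BY {parent} " if parent else ""
        ranks.append(
            f"DENSE_RANK() OVER ({partition}ORDER BY m{i} {direction}) AS r{i}"
        )
        filters.append(f"r{i} <= {limit}")
    last = len(levels) - 1
    columns = ", ".join(path)
    return "\n".join([
        f"WITH metrics AS (SELECT *, {', '.join(metrics)} FROM {table}),",
        f"ranked AS (SELECT *, {', '.join(ranks)} FROM metrics)",
        # DISTINCT collapses source rows that share the same leaf
        f"SELECT DISTINCT {columns}, m{last} AS leaf_metric",
        "FROM ranked",
        f"WHERE {' AND '.join(filters)}",
    ])


# The four levels from the question: latest 2 years, first 2 genres
# alphabetically, top 2 studios by row count, top 2 titles by revenue.
LEVELS = [
    ("year", "MIN(year)", "DESC", 2),
    ("genre", "MIN(genre)", "ASC", 2),
    ("studio", "COUNT(*)", "DESC", 2),
    ("title", "SUM(revenue)", "DESC", 2),
]

if __name__ == "__main__":
    print(build_top_n_query("titles", LEVELS))
```

Adding or removing a level then means editing LEVELS rather than rewriting the query. MIN(year) and MIN(genre) are only a trick to express "order by the column itself" in the same aggregate form as the other levels.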