How to group by multiple columns and aggregate the last column in Redshift


sql amazon-redshift


Not sure if I'm just having a brain fart, because this problem looks simple:

+----------+----------+---------------------+
| user_id  | country  | country_probability |
+----------+----------+---------------------+
| 10000022 | France   | 0.126396313         |
| 10000022 | Italy    | 0.343407512         |
| 10000022 | England  | 0.161236539         |
| 10000044 | China    | 0.061884698         |
| 10000044 | S. Korea | 0.043251887         |
| 10000044 | Japan    | 0.65095371          |
| 10000046 | USA      | 0.215771168         |
| 10000046 | Canada   | 0.214556068         |
| 10000046 | Mexico   | 0.081350066         |
+----------+----------+---------------------+

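In Redshift, how do I group this so that my output is: each unique user_id, the country with the highest probability for that user_id, and that country's probability?

That would be: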

+----------+---------+---------------------+
| user_id  | country | country_probability |
+----------+---------+---------------------+
| 10000022 | Italy   | 0.343407512         |
| 10000044 | Japan   | 0.65095371          |
| 10000046 | USA     | 0.215771168         |
+----------+---------+---------------------+

Thanks, and sorry if this is a duplicate post... I tried searching but couldn't find much. GROUP BY seems to work differently in Redshift than in MySQL...

Maybe something like this:

select user_id, country, country_probability
from your_table
where (user_id, country_probability) in
      -- highest probability per user
      (select user_id, max(country_probability)
       from your_table
       group by user_id
      )

[EDIT: another option, using the analytic RANK function]

select user_id, country, country_probability
from (select user_id, country,
        country_probability,
        rank() over (partition by user_id order by country_probability desc) rnk
        from your_table
     ) ranked  -- a derived table needs an alias
where rnk = 1;
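
To try the RANK approach locally against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite 3.25 or newer (for window functions). SQLite is not Redshift, so this only checks the query logic, not Redshift behavior or performance. The helper name top_country_per_user is made up for illustration.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    (10000022, "France", 0.126396313),
    (10000022, "Italy", 0.343407512),
    (10000022, "England", 0.161236539),
    (10000044, "China", 0.061884698),
    (10000044, "S. Korea", 0.043251887),
    (10000044, "Japan", 0.65095371),
    (10000046, "USA", 0.215771168),
    (10000046, "Canada", 0.214556068),
    (10000046, "Mexico", 0.081350066),
]

# Same shape as the RANK query above; "order by user_id" is added
# only to make the output order deterministic.
RANK_QUERY = """
select user_id, country, country_probability
from (select user_id, country, country_probability,
             rank() over (partition by user_id
                          order by country_probability desc) as rnk
      from your_table) ranked
where rnk = 1
order by user_id
"""


def top_country_per_user(rows):
    """Load rows into an in-memory SQLite table and run the RANK query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table your_table ("
            "user_id integer, country text, country_probability real)"
        )
        conn.executemany("insert into your_table values (?, ?, ?)", rows)
        return conn.execute(RANK_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in top_country_per_user(ROWS):
        print(row)
```

Note that rank() gives every tied row the same rank, so a user whose top two countries have exactly the same probability comes back with two rows. If exactly one row per user is required, row_number() in place of rank() picks one of the tied rows arbitrarily.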

It would be interesting to compare the performance of Littlefoot's approach with the following:

select distinct user_id,
       first_value(country) over (partition by user_id order by country_probability desc
                                  rows between unbounded preceding and unbounded following) as country,
       max(country_probability) over (partition by user_id) as country_probability
from t;

I usually don't like using select distinct for aggregation, but Redshift only supports first_value() as a window function.

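Thanks, this works. Is this really the best way? I solved it in a similar way myself, but it seems unnecessarily difficult (running a subquery) just to get such a simple output...

Another option is to tag the rows using an analytic function, but I don't even know whether those are available to you.

This is a very typical query for this kind of request; you'll find it everywhere. I guess you could try an analytic function (as @shawnt00 said), such as RANK. I've edited my answer, give it a try.

Thanks for the reply, Littlefoot. I also solved it with the rank function... I guess I just wanted to confirm I wasn't overcomplicating this.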