Hive: top N sorted rows within each group?

I have the following transaction table:

I am grouping by customer_id and category to build a list of product_id-to-score map pairs:

SELECT
    s.customer_id,    
    s.category,
    collect_list(s.pair) 
FROM
    (
        SELECT
            customer_id,
            category,
            map(product_id, score) AS pair
        FROM
            transaction
        WHERE
            score > {score_threshold}
    ) s 
GROUP BY
    s.customer_id,
    s.category
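
Now I want to go one step further. For each group, I want to keep only the top n pairs, sorted by score in descending order. I tried using OVER (PARTITION BY ... ORDER BY ...), but ran into problems.

Note: the transaction table is partitioned by category.

Thanks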

Try this:

SELECT
    s.customer_id,
    s.category,
    collect_list(s.pair)
FROM
    (
        SELECT
            -- number the rows within each (customer_id, category) group, highest score first
            ROW_NUMBER() OVER (PARTITION BY customer_id, category ORDER BY score DESC) AS RowId,
            customer_id,
            category,
            map(product_id, score) AS pair
        FROM
            transaction
        WHERE
            score > {score_threshold}
    ) s
WHERE
    s.RowId <= n  -- keep the top n rows of each group
GROUP BY
    s.customer_id,
    s.category
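
To make the row-numbering logic easier to check by hand, here is a minimal pure-Python sketch of what the query is meant to do: filter by the score threshold, rank rows within each (customer_id, category) group by descending score, and keep row numbers 1 through n. The sample rows and the function name `top_n_pairs` are made up for illustration; this only mimics the semantics and is not how Hive executes the query.

```python
from collections import defaultdict

# Hypothetical sample rows: (customer_id, category, product_id, score)
transactions = [
    (1, "books", "p1", 0.9),
    (1, "books", "p2", 0.7),
    (1, "books", "p3", 0.8),
    (1, "toys", "p4", 0.6),
    (2, "books", "p5", 0.4),
]


def top_n_pairs(rows, n, score_threshold):
    """Mimic: WHERE score > threshold, ROW_NUMBER() ordered by score DESC, RowId <= n."""
    groups = defaultdict(list)
    for customer_id, category, product_id, score in rows:
        if score > score_threshold:  # WHERE score > {score_threshold}
            groups[(customer_id, category)].append((product_id, score))

    result = {}
    for key, pairs in groups.items():
        # ORDER BY score DESC; enumerate(start=1) plays the role of ROW_NUMBER()
        ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
        result[key] = [
            {product_id: score}  # one single-entry map per row, like map(product_id, score)
            for row_id, (product_id, score) in enumerate(ranked, start=1)
            if row_id <= n  # RowId <= n keeps at most n rows per group
        ]
    return result


top2 = top_n_pairs(transactions, n=2, score_threshold=0.5)
```

With a condition of `row_id < n` instead, each group would keep at most n - 1 rows, which is why the filter uses `<=`.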