Hive SQL: finding the most common value across multiple columns [hive, hiveql]


I have the following data:

name device operating browser
 A     mob      l       c
 A     mob      l       b
 A     mob      l       b
 A     web      w       b
 B     web      w       c
 B     web      w       c
 B     mob      w       c
 B     web      l       b
For each name, I want to find the most common value in each column, giving a result like this:

name device operating browser
 A     mob      l       b
 B     web      w       c
How can I do this? Thanks.

This might help. Note, however, that subqueries like these are not ideal to rely on.

SELECT
a.name,
(SELECT b.device FROM YOUR_TABLE_NAME b WHERE b.name = a.name GROUP BY device ORDER BY COUNT(b.device) DESC LIMIT 1) AS device,
(SELECT c.operating FROM YOUR_TABLE_NAME c WHERE c.name = a.name GROUP BY operating ORDER BY COUNT(c.operating) DESC LIMIT 1) AS operating,
(SELECT d.browser FROM YOUR_TABLE_NAME d WHERE d.name = a.name GROUP BY browser ORDER BY COUNT(d.browser) DESC LIMIT 1) AS browser
FROM YOUR_TABLE_NAME AS a
GROUP BY a.name
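The per-column idea above can also be written without correlated subqueries, by ranking each column's counts separately and joining the three results on name. The following is a minimal sketch, not a tested Hive solution: it runs the SQL against an in-memory SQLite database purely so the result can be checked on the sample data, and it assumes the table is called yourtable. The helper top_per_name, the aliases, and the alphabetical tie-break are illustrative choices. The query shape uses only window functions and joins, so it should carry over to Hive 0.11+, but that is an assumption rather than something verified here.

```python
import sqlite3

# Sample rows from the question: (name, device, operating, browser).
ROWS = [
    ("A", "mob", "l", "c"), ("A", "mob", "l", "b"),
    ("A", "mob", "l", "b"), ("A", "web", "w", "b"),
    ("B", "web", "w", "c"), ("B", "web", "w", "c"),
    ("B", "mob", "w", "c"), ("B", "web", "l", "b"),
]


def top_per_name(column):
    # Builds a subquery returning one row per name: that column's most
    # frequent value. Ties are broken alphabetically, which is an arbitrary
    # choice made for this sketch.
    return f"""
        select name, {column}
        from (
          select name, {column},
                 row_number() over (partition by name
                                    order by cnt desc, {column}) as rn
          from (
            select name, {column}, count(*) as cnt
            from yourtable
            group by name, {column}
          ) c
        ) r
        where rn = 1
    """


# Join the three per-column winners back together on name.
QUERY = f"""
    select d.name, d.device, o.operating, b.browser
    from ({top_per_name('device')}) d
    join ({top_per_name('operating')}) o on o.name = d.name
    join ({top_per_name('browser')}) b on b.name = d.name
    order by d.name
"""


def most_common_per_column(rows):
    # Load the rows into a throwaway SQLite table and run the query.
    conn = sqlite3.connect(":memory:")
    conn.execute("create table yourtable (name, device, operating, browser)")
    conn.executemany("insert into yourtable values (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in most_common_per_column(ROWS):
        print(row)
```

Because row_number assigns exactly one rank-1 row per name and column, the join yields one output row per name. Swapping in rank would instead keep every tied value, at the cost of multiplying rows in the join.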

For Hive 0.11+, you can use a window function such as rank:

select name, device, operating, browser
from (
  select *, rank() over (partition by name order by cnt desc) as rnk
  from (
    select name, device, operating, browser, count(*) as cnt
    from yourtable
    group by name, device, operating, browser
  ) t
) t
where rnk = 1
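Step by step:

  • Count the occurrences of each identical row (the same name, device, operating and browser values)
  • Rank those rows by count within each name, highest count first
  • Keep only the rows with the highest count

  • Note: if there is a tie within a given name, it returns all rows that share the same count.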
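Two properties of the rank() query are easy to miss: it ranks whole (device, operating, browser) combinations rather than each column separately, and ties produce more than one row per name. The sketch below illustrates both on made-up rows that are not from the question. It runs the query against an in-memory SQLite database only so the output can be checked, with an ORDER BY added for stable results. The names X and Y and the function top_combinations are hypothetical.

```python
import sqlite3

# Hypothetical rows, not from the question.
# X: the top combination is (mob, l, b), yet "web" is the most frequent
#    device on its own (3 rows against 2).
# Y: two combinations occur once each, so they tie.
ROWS = [
    ("X", "mob", "l", "b"), ("X", "mob", "l", "b"),
    ("X", "web", "w", "c"), ("X", "web", "l", "c"), ("X", "web", "w", "b"),
    ("Y", "mob", "l", "b"), ("Y", "web", "w", "c"),
]

# The rank() query from the answer, plus an ORDER BY for stable output.
RANK_QUERY = """
    select name, device, operating, browser
    from (
      select *, rank() over (partition by name order by cnt desc) as rnk
      from (
        select name, device, operating, browser, count(*) as cnt
        from yourtable
        group by name, device, operating, browser
      ) t
    ) t
    where rnk = 1
    order by name, device
"""


def top_combinations(rows):
    # Load the rows into a throwaway SQLite table and run the query.
    conn = sqlite3.connect(":memory:")
    conn.execute("create table yourtable (name, device, operating, browser)")
    conn.executemany("insert into yourtable values (?, ?, ?, ?)", rows)
    result = conn.execute(RANK_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in top_combinations(ROWS):
        print(row)
```

For X the query reports device mob even though web is the more frequent device, and for Y it returns both tied rows. On the sample data in the question the combination ranking and the per-column ranking happen to agree, so the difference only shows up on other data.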

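    Welcome to Stack Overflow. We are not a free coding service. Please take a look at … and …. If you run into a specific problem while writing your code, feel free to ask.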