SQL: Hive left join not working
My user table contains the following columns and data:
region,cust no,mobileno,null,host,null,usage,null,usageduration
AP 404070620021081 Prepaid 919848052151 NULL Facebook NULL 2.9384765625 NULL 1.726
AP 404070620021081 Prepaid 919848052151 NULL HTTP NULL 1.0146484375 NULL 0.232
AP 404070620021081 Prepaid 919848052151 NULL Bing NULL 8.8642578125 NULL 0.746
AP 404070620021081 Prepaid 919848052151 NULL Crashlytics NULL 19.4599609375 NULL 48.765
AP 404070620021081 Prepaid 919848052151 NULL DNS NULL 17.4296875 NULL 584.596
AP 404070620021081 Prepaid 919848052151 NULL Doubleclick NULL 6.908203125 NULL 1.362
AP 404070620021081 Prepaid 919848052151 NULL Dropbox NULL 37.0380859375 NULL 42.174
AP 404070620021081 Prepaid 919848052151 NULL Facebook NULL 21.1533203125 NULL 29.689
AP 404070620021081 Prepaid 919848052151 NULL Google NULL 49.0732421875 NULL 28.456
AP 404070620021081 Prepaid 919848052151 NULL Google APIs NULL 213.8642578125 NULL 49.866
AP 404070620021081 Prepaid 919848052151 NULL Google Ads NULL 5.7314453125 NULL 0.932
AP 404070620021081 Prepaid 919848052151 NULL Google Calendar NULL 0.201171875 NULL 0.06
AP 404070620021081 Prepaid 919848052151 NULL Google Cloud Messaging NULL 8.5419921875 NULL 143.50799999999998
AP 404070620021081 Prepaid 919848052151 NULL Google Play NULL 228.7880859375 NULL 88.77600000000001
AP 404070620021081 Prepaid 919848052151 NULL HTTP NULL 0.29296875 NULL 1.16
AP 404070620021081 Prepaid 919848052151 NULL NTP NULL 0.1484375 NULL 0.122
AP 404070620021081 Prepaid 919848052151 NULL SSL NULL 96.095703125 NULL 452.88
AP 404070620021081 Prepaid 919848052151 NULL Skype NULL 93.6953125 NULL 67.649
AP 404070620021081 Prepaid 919848052151 NULL TCP NULL 93.591796875 NULL 117.32900000000001
AP 404070620021081 Prepaid 919848052151 NULL WhatsApp NULL 165780.6171875 NULL 1097.055
AP 404070620021081 Prepaid 919848052151 NULL XMPP NULL 62.4453125 NULL 350.03700000000003
My top20 table contains host and rank:
SSL 1
TCP 2
DNS 3
HTTP 4
Facebook 5
Google Play 6
Google Cloud Messaging 7
YouTube 8
UDP 9
XMPP 10
Skype 11
WhatsApp 12
Bittorrent 13
Google 14
STUN 15
Google APIs 16
Doubleclick 17
Apple 18
MDNS 19
Google Ads 20
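I need the usage and duration of these top 20 hosts for each customer. If a customer has not used a host, the row should show 0, but every customer must still get 20 rows. I did a left join, but I get 420 rows with all combinations, which is wrong. Please suggest how to get 20 rows for each customer.
One approach is to use a cross join to build every combination of cust_no and host, and then left join the user table onto that: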
select t.cust_no,
       t.host,
       coalesce(u.usage, 0) usage,
       coalesce(u.usageduration, 0) usageduration
from (
    select *
    from (
        select distinct cust_no
        from user
    ) u
    cross join top20 t
) t
left join user u on t.cust_no = u.cust_no
    and t.host = u.host;
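The sample data lists some hosts more than once for the same customer (Facebook and HTTP each appear twice), so the left join alone can still return more than one row per host. A minimal sketch of the same cross join plus left join with a SUM and GROUP BY added is shown below. It runs in SQLite through Python's sqlite3 module only so that it can be tried without a Hive cluster. The table name user_usage (used instead of user, which may be treated as a keyword), the four-host top20 table, and the three usage rows are assumptions for illustration, and Hive may need small syntax adjustments.

```python
import sqlite3

# Hypothetical miniature of the two tables from the question.
# user_usage stands in for the "user" table; top20 is cut down to 4 hosts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_usage (cust_no TEXT, host TEXT, usage REAL, usageduration REAL);
    CREATE TABLE top20 (host TEXT, rank INTEGER);
    INSERT INTO user_usage VALUES
        ('404070620021081', 'Facebook', 2.9384765625, 1.726),
        ('404070620021081', 'Facebook', 21.1533203125, 29.689),
        ('404070620021081', 'DNS', 17.4296875, 584.596);
    INSERT INTO top20 VALUES ('SSL', 1), ('DNS', 3), ('Facebook', 5), ('YouTube', 8);
""")

QUERY = """
    SELECT t.cust_no, t.host, t.rank,
           COALESCE(SUM(u.usage), 0) AS usage,
           COALESCE(SUM(u.usageduration), 0) AS usageduration
    FROM (
        -- every customer paired with every top20 host
        SELECT c.cust_no, h.host, h.rank
        FROM (SELECT DISTINCT cust_no FROM user_usage) c
        CROSS JOIN top20 h
    ) t
    LEFT JOIN user_usage u
           ON t.cust_no = u.cust_no AND t.host = u.host
    -- collapse repeated (customer, host) rows into one
    GROUP BY t.cust_no, t.host, t.rank
    ORDER BY t.cust_no, t.rank
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With this data the query returns exactly one row per top20 host for the customer: the two Facebook rows are summed into one, and SSL and YouTube, which the customer never used, come back with 0.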
Hi, I created a query as follows:
SELECT
imsi,
msisdn,Subscription_plan,
cust_nationality,
application_name,
rank,is_app,
total_data_volume_host,
SUM(total_data_volume_app) AS total_data_volume_app,
event_duration_host,
SUM(event_duration_app) AS event_duration_app
FROM (SELECT
ps_data.cust_nationality,
ps_data.imsi,ps_data.Subscription_plan,
ps_data.msisdn,
ps_data.host,
top20.host_app AS application_name,
top20.rank,
top20.is_app,
ps_data.total_data_volume_host,
ps_data.total_data_volume_app,
ps_data.event_duration_host,
ps_data.event_duration_app
FROM (SELECT
circle,
host_app,
rank,
is_app
FROM ps_top20_host_app_1_month
WHERE is_app = '1') top20
LEFT JOIN (SELECT
cust_nationality,
imsi,Subscription_plan,
msisdn,
NULL AS host,
application_name,
NULL AS total_data_volume_host,
SUM(COALESCE(total_data_volume,0)) / 1024 AS total_data_volume_app,
NULL AS event_duration_host,
SUM(COALESCE(data_transfer_time_dl, 0)/1000 + COALESCE(data_transfer_time_ul, 0)/1000) AS event_duration_app
FROM ps_data_up_segg_1_day
WHERE content_provider = '1' AND cust_nationality IS NOT NULL AND UPPER(cust_nationality) NOT IN ('UNKNOWN','NULL IN SOURCE') AND
msisdn IS NOT NULL AND UPPER(msisdn) NOT IN ('UNKNOWN','NULL IN SOURCE') AND dt >= '1487615400000' AND dt < '1487701800000' AND imsi = '404070620021081'
GROUP BY cust_nationality,
imsi,
msisdn,Subscription_plan,
host,
application_name) ps_data
ON (
top20.circle = ps_data.cust_nationality) where ( top20.host_app is not null and top20.host_app=ps_data.application_name ) )t2
GROUP BY imsi,
msisdn,Subscription_plan,
cust_nationality,
host,
application_name,
rank,is_app,
total_data_volume_host,
event_duration_host
But I get only 14 rows, the ones that are common to both tables, not all 20:
404070620021081 919848052151 Prepaid AP DNS 3 1 NULL 17.4296875 NULL 584.596
404070620021081 919848052151 Prepaid AP Doubleclick 17 1 NULL 6.908203125 NULL 1.362
404070620021081 919848052151 Prepaid AP Facebook 5 1 NULL 24.091796875 NULL 31.415
404070620021081 919848052151 Prepaid AP Google 14 1 NULL 49.0732421875 NULL 28.456
404070620021081 919848052151 Prepaid AP Google APIs 16 1 NULL 213.8642578125 NULL 49.866
404070620021081 919848052151 Prepaid AP Google Ads 20 1 NULL 5.7314453125 NULL 0.932
404070620021081 919848052151 Prepaid AP Google Cloud Messaging 7 1 NULL 8.5419921875 NULL 143.507999999998
404070620021081 919848052151 Prepaid AP Google Play 6 1 NULL 228.7880859375 NULL 88.7760000000001
404070620021081 919848052151 Prepaid AP HTTP 4 1 NULL 1.3076171875 NULL 1.392
404070620021081 919848052151 Prepaid AP SSL 1 1 NULL 96.095703125 NULL 452.88
404070620021081 919848052151 Prepaid AP Skype 11 1 NULL 93.6953125 NULL 67.649
404070620021081 919848052151 Prepaid AP TCP 2 1 NULL 93.591796875 NULL 117.3290000000001
404070620021081 919848052151 Prepaid AP WhatsApp 12 1 NULL 165780.6171875 NULL 1097.055
404070620021081 919848052151 Prepaid AP XMPP 10 1 NULL 62.4453125 NULL 350.0370000000003
But this is wrong for this user. I need 20 records, with usage 0 for the hosts that were not used.
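A likely reason for the 14 rows is the WHERE clause after the LEFT JOIN: top20.host_app=ps_data.application_name is evaluated after the join, and any row where ps_data.application_name is NULL or different fails that comparison, so the left join effectively behaves like an inner join. The sketch below contrasts the predicate in WHERE with the same predicate moved into the ON clause. It uses SQLite through Python's sqlite3 module so it can be run locally; the cut-down tables, the volume column, and the sample rows are assumptions for illustration, not the real ps_top20_host_app_1_month or ps_data_up_segg_1_day schemas.

```python
import sqlite3

# Hypothetical, cut-down versions of the two sides of the join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE top20 (circle TEXT, host_app TEXT, rank INTEGER);
    CREATE TABLE ps_data (cust_nationality TEXT, application_name TEXT, volume REAL);
    INSERT INTO top20 VALUES ('AP', 'SSL', 1), ('AP', 'YouTube', 8), ('AP', 'UDP', 9);
    INSERT INTO ps_data VALUES ('AP', 'SSL', 96.095703125);
""")

# Predicate in WHERE: rows without a matching application are filtered out.
where_version = conn.execute("""
    SELECT top20.host_app, COALESCE(ps_data.volume, 0)
    FROM top20
    LEFT JOIN ps_data ON top20.circle = ps_data.cust_nationality
    WHERE top20.host_app = ps_data.application_name
    ORDER BY top20.rank
""").fetchall()

# Predicate in ON: unmatched top20 rows survive with NULLs, turned into 0.
on_version = conn.execute("""
    SELECT top20.host_app, COALESCE(ps_data.volume, 0)
    FROM top20
    LEFT JOIN ps_data
           ON top20.circle = ps_data.cust_nationality
          AND top20.host_app = ps_data.application_name
    ORDER BY top20.rank
""").fetchall()

print(where_version)
print(on_version)
```

In this sketch the WHERE version returns only the SSL row, while the ON version returns all three hosts with 0 for YouTube and UDP. Note that unmatched rows carry NULL in every ps_data column, which in the real query would include imsi and msisdn; pairing each customer with each host through a cross join first is one way to keep those columns filled in.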