SQL: how to get one value that does not intersect with a table, in SAS
I have two tables:
The first holds client IDs and shop IDs: each client has several shop IDs for the shops they have visited.
The second holds all shop IDs.
For each client I need a random shop ID from table 1 that the client has visited (it could simply be the minimum shop ID in table 1), and a random shop ID from table 2 that the client has not visited.
It seems a cross join could help:
proc sql;
select a.client_id, min(a.shop_id) as id_1, min(b.shop_id) as id_2
from table_1 a, table_2 b
where a.shop_id <> b.shop_id
group by 1
;quit;
The problem is that the tables are huge, and this approach takes forever.
Can you help?
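Here is an approach that uses left joins together with the EXCEPT operator. Assuming you also have a clients table, it takes the set of all client/shop pairs and subtracts the set of visited pairs. If you want to exclude clients who have visited no shops or have visited every shop, just change the two left joins to regular (inner) joins.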
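As a rough sketch, the same idea could be applied directly to the tables from the question. It assumes table_1 has the columns client_id and shop_id and table_2 has shop_id, as in the question's query. The question has no clients table, so one is derived from table_1 here; the names clients and shop_pick are made up for illustration.

```sas
/* Sketch only: clients and shop_pick are hypothetical names. */
proc sql;
    /* Assumption: no separate clients table exists, so build one from the visits */
    create table clients as
    select distinct client_id
    from table_1;

    create table shop_pick as
    select c.client_id,
           v1.id_1,   /* lowest shop_id the client has visited */
           u1.id_2    /* lowest shop_id the client has not visited */
    from clients c
    left join (
        select client_id,
               min(shop_id) as id_1
        from table_1
        group by client_id
    ) v1
        on v1.client_id = c.client_id
    left join (
        select u.client_id,
               min(u.shop_id) as id_2
        from (
            /* All client/shop pairs minus the visited pairs */
            select c2.client_id, s.shop_id
            from clients c2
            cross join table_2 s
            except
            select client_id, shop_id
            from table_1
        ) u
        group by u.client_id
    ) u1
        on u1.client_id = c.client_id;
quit;
```

Because the client list here comes from table_1, every client has at least one visit, so id_1 is never missing. id_2 is missing only for a client who has visited every shop in table_2.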
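Here is the performance I measured on my machine with the test script below:
Original query execution time: 32.53 seconds CPU time
Updated query execution time: 0.10 seconds CPU time
The full test script follows: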
%let shop_count = 1000;
%let client_count = 100;
%let visit_count = 50000;

data shops;
    do shop_id = 1 to &shop_count;
        output;
    end;
run;

data clients;
    do client_id = 1 to &client_count;
        output;
    end;
run;

data client_shop_visits;
    do visit_id = 1 to &visit_count;
        client_id = rand("Integer", 1, &client_count);
        shop_id = rand("Integer", 1, &shop_count);
        output;
    end;
run;

proc sql;
    create table unvisited_shops_original as
    select a.client_id, min(a.shop_id) as id_1, min(b.shop_id) as id_2
    from client_shop_visits a, shops b
    where a.shop_id <> b.shop_id
    group by 1
    ;
quit;

proc sql;
    create table unvisited_shops_updated as
    select c.client_id,
           u1.first_unvisited_shop,
           v1.first_visited_shop
    from clients c
    left join ( /* For each client, get the first shop_id they haven't visited */
        select u.client_id,
               min(u.shop_id) as first_unvisited_shop
        from (
            select c.client_id, /* Get list of all client/shop combinations */
                   s.shop_id
            from clients c
            cross join shops s
            except /* Remove client/shop combinations that have been visited */
            select v.client_id,
                   v.shop_id
            from client_shop_visits v
        ) u
        group by u.client_id
    ) u1
        on u1.client_id = c.client_id
    left join ( /* For each client, get the first shop_id they have visited */
        select v.client_id,
               min(v.shop_id) as first_visited_shop
        from client_shop_visits v
        group by v.client_id
    ) v1
        on v1.client_id = c.client_id
    order by c.client_id
    ;
quit;
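Another option is to filter down to the shops a client has not visited and then use monotonic(): number the shops the client has never visited, do the same for the clients, and then simply join the two on that number.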
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_FISH AS
SELECT DISTINCT t1.Species,
/* birds_monotonic */
(monotonic()) AS birds_monotonic
FROM SASHELP.FISH t1;
CREATE TABLE WORK.QUERY_FOR_CARS AS
SELECT DISTINCT t1.Make,
t1.Model,
t1.Type,
/* cars_monotonic */
(monotonic()) AS cars_monotonic
FROM SASHELP.CARS t1;
CREATE TABLE WORK.QUERY_FOR_FISH_0000 AS
SELECT DISTINCT t1.Species,
t1.birds_monotonic,
t2.Make,
t2.Model,
t2.Type,
t2.cars_monotonic
FROM WORK.QUERY_FOR_FISH t1
LEFT JOIN WORK.QUERY_FOR_CARS t2 ON (t1.birds_monotonic = t2.cars_monotonic);
QUIT;
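What is the approximate ratio of shops to shops visited per client? Are there 10,000 shops, with each client visiting only 3 on average? Or 10 shops, with each client visiting 8 on average?
More like 10,000 shops, with each client visiting 10 on average.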
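For reference, the ratio asked about here could be measured with something like the following sketch. It assumes the question's table_1 (client_id, shop_id) and table_2 (shop_id); the output column names are made up for illustration.

```sas
/* Sketch only: shop_count, n_shops and avg_shops_per_client are hypothetical names. */
proc sql;
    /* Total number of shops */
    select count(*) as shop_count
    from table_2;

    /* Average number of distinct shops visited per client */
    select mean(t.n_shops) as avg_shops_per_client
    from (
        select client_id,
               count(distinct shop_id) as n_shops
        from table_1
        group by client_id
    ) as t;
quit;
```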