How to select rows from a dataset in SAS Enterprise Guide, applying a specific condition within each subset
I have the following table:
COMPANY_NAME | GROUP | COUNTRY | STATUS
COM1 | 1 | DE | DELETED
COM2 | 1 | DE | REMAINING
COM3 | 1 | UK | DELETED
COM4 | 2 | ES | DELETED
COM5 | 2 | FR | DELETED
COM6 | 3 | RO | DELETED
COM7 | 3 | BG | DELETED
COM8 | 3 | ES | REMAINING
COM9 | 3 | ES | DELETED
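For anyone who wants to reproduce the problem, the table above can be loaded with a step along these lines. This is only a sketch: the dataset name WORK.INPUT matches the attempts further down, while the variable lengths and the numeric type of GROUP are assumptions.

```sas
/* Sketch: build the sample table as WORK.INPUT.            */
/* Lengths and the numeric type of GROUP are assumed here.  */
data work.input;
    length company_name $ 8 group 8 country $ 2 status $ 9;
    input company_name $ group country $ status $;
    datalines;
COM1 1 DE DELETED
COM2 1 DE REMAINING
COM3 1 UK DELETED
COM4 2 ES DELETED
COM5 2 FR DELETED
COM6 3 RO DELETED
COM7 3 BG DELETED
COM8 3 ES REMAINING
COM9 3 ES DELETED
;
run;
```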
I need to get:
COMPANY_NAME | GROUP | COUNTRY | STATUS
COM3 | 1 | UK | DELETED
COM4 | 2 | ES | DELETED
COM5 | 2 | FR | DELETED
COM6 | 3 | RO | DELETED
COM7 | 3 | BG | DELETED
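So I need every entry with status DELETED for which, within the same group, no company in the same country has status REMAINING. I can use PROC SQL or a data step.
What I have tried so far: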
PROC SQL;
CREATE TABLE WORK.OUTPUT AS
SELECT *
FROM WORK.INPUT
WHERE STATUS = 'DELETED' AND COUNTRY NOT IN (SELECT COUNTRY FROM WORK.INPUT WHERE STATUS = 'REMAINING');
QUIT;
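But this obviously excludes those countries across all groups, not just within the group where a REMAINING company exists.
I also tried a data step: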
DATA WORK.OUTPUT;
SET WORK.INPUT;
BY GROUP COUNTRY;
IF NOT (STATUS = 'DELETED' AND COUNTRY NOT IN (COUNTRY WHERE STATUS = 'REMAINING')) THEN DELETE;
RUN;
But the syntax is incorrect, because I don't know the right way to write it.

Try this:
proc sql;
  select * from your_table
  /* string comparisons are case-sensitive, so normalize the case of STATUS */
  where upcase(status) = 'DELETED' and
        catx("_", country, group) not in
        (select catx("_", country, group) from your_table where upcase(status) = 'REMAINING');
quit;
Output:
company_name | group | country | status
com3 | 1 | UK | deleted
com4 | 2 | ES | deleted
com5 | 2 | FR | deleted
com6 | 3 | RO | deleted
com7 | 3 | BG | deleted
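A variant that avoids building a concatenated key is a correlated NOT EXISTS subquery, which checks group and country directly. The following is a sketch, assuming the table is WORK.INPUT as in the question and that STATUS is stored in uppercase; the aliases d and r are only illustrative.

```sas
proc sql;
    create table work.output as
    select d.*
    from work.input as d
    where d.status = 'DELETED'
      /* keep the row only if no REMAINING row shares its group and country */
      and not exists (
          select *
          from work.input as r
          where r.status  = 'REMAINING'
            and r.group   = d.group
            and r.country = d.country
      );
quit;
```

Because the comparison is on the two columns themselves, this does not depend on a delimiter that cannot appear in the data.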
Your attempt shows you are on the right track. A data step solution is:
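/* Assumes HAVE is sorted by GROUP. Each data step iteration handles one group, */
/* so REMAIN_LIST starts out empty for every group.                             */
data want(drop = remain_list);
  length remain_list $ 20;  /* must hold every REMAINING country of one group */
  /* Pass 1: collect the countries that have a REMAINING company in this group */
  do until(last.group);
    set have;
    by group;
    if status = 'REMAINING' and not find(remain_list, strip(country)) then
      remain_list = catx(' ', remain_list, country);
  end;
  /* Pass 2: re-read the same group and keep DELETED rows whose country is not listed */
  do until(last.group);
    set have;
    by group;
    if status = 'DELETED' and not find(remain_list, strip(country)) then
      output;
  end;
run;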
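If sorting by group is inconvenient, a hash object lookup is another option. This is a sketch under the same assumptions (input dataset HAVE, uppercase STATUS values); the hash name rem is made up for illustration.

```sas
data want;
    if _n_ = 1 then do;
        /* load the (group, country) pairs that have a REMAINING company */
        declare hash rem(dataset: "have(where=(status='REMAINING'))");
        rem.defineKey('group', 'country');
        rem.defineDone();
    end;
    set have;
    /* check() returns 0 when the current group/country pair is in the hash */
    if status = 'DELETED' and rem.check() ne 0;
run;
```

The subsetting IF keeps only DELETED rows whose group and country pair was not found among the REMAINING rows, and the input does not need to be sorted.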
Comments:
Should the first record in the target be COM3 rather than COM1?
@DomPazz I changed it. Thanks.
"I can use PROC SQL or a data step": which one did you try? Please share the code and any error messages.
^ To add, StackOverflow is not a code-writing service. If you don't know where to start, change the status code to a numeric code that can be sorted, so that the minimum or maximum, with only one distinct value present, is the one you want. Then it is easy to solve.
@reeza Why do we need to convert the status code to a number?
Happy to help :)
Thanks. I chose the PROC SQL solution because it takes less code. But it is great to see the data step implementation as well.