SAS: count and summarize rows depending on whether a condition on a variable is true


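I have a dataset with 3 columns: Name, System, UserID. I want to count how many times a person appears in a report, but not count them when they are different people who share the same name. The distinction is made by the UserID field, and only within a single System. If a name has multiple rows with the same System and different UserIDs, then all observations with that name should be flagged for review. For the dataset below, I would like to see the output that follows it.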

Name       System   UserID
John Doe   Sys1     [blank]
John Doe   Sys1     AB1234
John Doe   Sys2     AB2345
Jane Doe   Sys1     AA2345
Jane Doe   Sys1     AA23456
Jane Doe   Sys2     AA2345
Joe Smith  Sys1     JS963
Joe Smith  Sys2     JS741


Name       Count  System                      Follow-up
John Doe   1      Sys1 -                      Yes
John Doe   1      Sys1 - AB1234               Yes
John Doe   1      Sys2 - AB2345               Yes
Jane Doe   1      Sys1 - AA2345               Yes
Jane Doe   1      Sys1 - AA23456              Yes
Jane Doe   1      Sys2 - AA2345               Yes
Joe Smith  2      Sys1 - JS963, Sys2 - JS741  No
Any help would be greatly appreciated.

Below is my code. So far it only counts the names; I don't know how to add the condition.

PROC SQL;
     CREATE TABLE Sorted_Master_Original AS
     SELECT Name,
            COUNT(Name) AS Total,
            System,
            UserID,
            CATX(' - ',System,UserID) AS SystemID
     FROM Master_Original
     WHERE Name <> ""
     GROUP BY Name;
QUIT;

DATA TESTDATA.Final_Listing;
   LENGTH SystemsAccessed $200.;
   DO UNTIL (last.Name);
      SET Sorted_Master_Original;
      BY Name NOTSORTED;
      SystemsAccessed=CATX(', ',SystemsAccessed,SystemID);
   END;
   DROP System SystemID;
RUN;

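Determining a signal within a group and then applying it to every member of that group can be done with two sequential DOW loops. The first loop is coded with a last. test as its exit condition and has the SET and BY statements inside it. The second loop re-reads the same group through a separate SET buffer, iterating the same number of times as the first loop; at that point the signal can be applied to each row.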

Data
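A minimal sketch of the sample data, reconstructed from the table in the question. The dataset name have matches the code that follows; the variable lengths and the comma-delimited layout are assumptions made for illustration.

```sas
* Sample rows from the question; lengths are assumed, not from the original post;
data have;
  infile datalines dsd dlm=',' truncover;
  length Name $20 System $8 UserID $10;
  input Name System UserID;
  datalines;
John Doe,Sys1,
John Doe,Sys1,AB1234
John Doe,Sys2,AB2345
Jane Doe,Sys1,AA2345
Jane Doe,Sys1,AA23456
Jane Doe,Sys2,AA2345
Joe Smith,Sys1,JS963
Joe Smith,Sys2,JS741
;
run;
```

The DSD option lets the trailing comma on the first row stand for the blank UserID, and a comma delimiter keeps the space inside each name intact.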

Sort order for BY-group processing

proc sort data=have;
  by Name System UserId;
run;
Data step with sequential DOW loops

data want(keep=name count system_userid_list followup);
  * loop over name group;
  do _n_ = 1 by 1 until (last.name);
    set have;
    by name system userid;

    * tests of conditions within the group determine some signal;

    * check if there is more than one userid within a system within the name group;
    if not (first.system and last.system) then
      count = 1;
  end;

  if not count then count = _n_;

  length system_userid_list $200;
  followup = ifc(count=1 and _n_>1 ,'Yes','No');

  * followup 'signal' will be applied/available to each row of the group;

  * either output single row as followup, or concat to an aggregate list;
  * mixed output of singles and aggregates, good idea?;

  * reiterate over group in second SET buffer;
  do _n_ = 1 to _n_;
    set have;
    item = catx(' - ',system,userid);
    if count > 1 then
      system_userid_list = catx(', ',system_userid_list,item);
    else do;
      system_userid_list = item;
      output;
    end;
  end;

  if count > 1 then output;
run;
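As a cross-check on the follow-up signal alone, the same group-level test can be sketched in PROC SQL. This is an illustrative alternative, not part of the original answer: the output table name flagged and the column name FollowUp are hypothetical, and the rule "more than one row for a Name within one System" is assumed to mirror the first.system / last.system test above (like that test, it does not compare the UserID values themselves).

```sas
proc sql;
  /* Hypothetical output table; flags every row of a Name that has   */
  /* more than one row within any single System.                     */
  create table flagged as
  select h.*,
         case when f.Name is not null then 'Yes' else 'No' end as FollowUp
  from have as h
       left join
       (select distinct Name
        from have
        group by Name, System
        having count(*) > 1) as f
       on h.Name = f.Name;
quit;
```

The inner query finds the names that trip the condition once, and the left join spreads that result back to every row of the name, which is the same "determine, then apply" pattern the two DOW loops implement in a single data step.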

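Please include whatever you have tried so far to solve this problem. Why is Joe Smith combined, while John Doe in records 2/3 is not presented in a similar way?

I updated the post to include my code so far. Joe Smith's two observations are for different systems, so they are combined. John Doe has 2 observations for Sys1, but the UserID fields do not match, so those rows are flagged for follow-up. In addition, if there is any instance where a name has multiple UserIDs for the same system, all observations with that name need to be flagged for follow-up.

That does not explain the difference: both have two systems (1 and 2) with a different ID in each. Is it because John Doe has two IDs in System 1 that do not match?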