SQL duplicate analysis

sql oracle plsql

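I'm trying to write some logic in Oracle SQL but am having trouble getting it right. First, I need my script to identify duplicated items, and then determine the most recent item among the duplicates. The database I'm working with has a large number of manual data inserts made outside the application, which leaves items out of order when going by ID number. I'm using the start date together with the ID number to establish order, because the table offers no other way to do it.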

If I need to determine the most recent role for employee 12311, how would I do that?

Here is what I have so far:

Table

Code

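Rather than looking at every record for each employee and picking the most recent one, I'd like the script to work only with the duplicated start dates.

Basically, if the most recent STARTDATE is duplicated, then determine which ID is the highest.

So it should look something like this: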

  ID | EMPLOYEE |       ROLE |   STARTDATE           | MAX Date | Max ID
-----|----------|------------|-----------------------|----------|--------
3432 |    12311 | Supervisor |  2016-07-12T00:00:00Z |        1 |      1
3421 |    12311 | Analyst    |  2016-07-12T00:00:00Z |        1 |      0
4321 |    12311 | Help Desk  |  2014-05-12T00:00:00Z |        0 |      0
5432 |    23432 | Manager    |  2012-11-02T00:00:00Z |        1 |      1
3452 |    23432 | Associate  |  2011-04-23T00:00:00Z |        0 |      0
7652 |    54332 | Analyst    |  2015-10-15T00:00:00Z |        1 |      1
5691 |    54332 | Assistant  |  2013-10-15T00:00:00Z |        0 |      0

I'm completely open to better approaches. Any help you can offer would be greatly appreciated.
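For reference, the two flag columns in the table above could be computed roughly as follows. This is only a sketch: the sample rows are the ones for employee 12311 from the table, and the column names max_date and max_id are made up for illustration.

```sql
-- Sample rows for one employee, taken from the table in the question.
with test (id, employee, role, startdate) as (
            select 3432, 12311, 'Supervisor', date '2016-07-12' from dual
  union all select 3421, 12311, 'Analyst',    date '2016-07-12' from dual
  union all select 4321, 12311, 'Help Desk',  date '2014-05-12' from dual
)
select id, employee, role, startdate,
  -- 1 for every row that shares the employee's latest start date
  case when startdate = max(startdate) over (partition by employee)
       then 1 else 0 end as max_date,
  -- 1 only for the single winning row: latest start date, then highest id
  case when row_number() over (partition by employee
                               order by startdate desc, id desc) = 1
       then 1 else 0 end as max_id
from test
order by employee, startdate desc, id desc;
```

With these rows, both 2016-07-12 records get max_date = 1, and only ID 3432 gets max_id = 1.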

Edit with solution:

Thanks to @Littlefoot for the help. I was able to modify my script to include the following:

   SELECT "ID", "EMPLOYEE", "ROLE", "STARTDATE",
    -- RN = 1 marks the latest role per employee; ID breaks ties on STARTDATE
    ROW_NUMBER() OVER (PARTITION BY "EMPLOYEE" ORDER BY "STARTDATE" DESC, "ID" DESC) RN
    FROM (
    -- The alias must be "EMPLOYEE" so that the outer query can reference it
    SELECT DISTINCT E.EMPLOYEE "EMPLOYEE",
    E.ID "ID",
    LR.DESCRIPTION "ROLE",
    ROLE_START_DATE "STARTDATE"
    FROM EMPLOYEES E
    JOIN ROLES R ON E.EMPLOYEE_ID = R.EMPLOYEE_ID
    JOIN LU_ROLES LR ON R.ROLE_ID = LR.ROLE_ID
    WHERE ROLE_START_DATE <= DATE '2017-12-03')
    ORDER BY 2

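I then filter the results on RN = 1.

If I need to determine the most recent role for employee 12311, how would I do that?

The one with the lowest RN? Why do you need two MAX columns when a single column does the job? For example: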

with test (id, empid, role, startdate) as
  (select 3432, 12311, 'supervisor', date '2016-07-12' from dual union
   select 3421, 12311, 'analyst'   , date '2016-07-12' from dual union
   select 4321, 12311, 'help desk' , date '2014-05-12' from dual union
   --
   select 5432, 23432, 'manager'   , date '2012-11-02' from dual union
   select 3452, 23432, 'associate' , date '2011-04-23' from dual
  )
select id, empid, role, startdate,
  row_number() over (partition by empid order by startdate desc, id desc) rn
from test;

        ID      EMPID ROLE       STARTDATE          RN
---------- ---------- ---------- ---------- ----------
      3432      12311 supervisor 2016-07-12          1
      3421      12311 analyst    2016-07-12          2
      4321      12311 help desk  2014-05-12          3
      5432      23432 manager    2012-11-02          1
      3452      23432 associate  2011-04-23          2

That query would then be the source for an outer query that filters with a WHERE clause, i.e.

-- (the same "with test ..." CTE as in the previous example goes here)
select id, empid, role, startdate
from (select id, empid, role, startdate,
        row_number() over (partition by empid order by startdate desc, id desc) rn
      from test
     )
where rn = 1;

        ID      EMPID ROLE       STARTDATE
---------- ---------- ---------- ----------
      3432      12311 supervisor 2016-07-12
      5432      23432 manager    2012-11-02
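Since the question asks specifically about start dates that are duplicated, the same ranking can be combined with an analytic count to report only the employees whose latest start date is actually tied. This is a sketch on sample data from the question; the CTE name ranked and the column same_date_cnt are made-up names.

```sql
with test (id, empid, role, startdate) as (
            select 3432, 12311, 'supervisor', date '2016-07-12' from dual
  union all select 3421, 12311, 'analyst',    date '2016-07-12' from dual
  union all select 4321, 12311, 'help desk',  date '2014-05-12' from dual
  union all select 5432, 23432, 'manager',    date '2012-11-02' from dual
  union all select 3452, 23432, 'associate',  date '2011-04-23' from dual
),
ranked as (
  select id, empid, role, startdate,
    -- rn = 1 is the latest row per employee; id breaks ties on startdate
    row_number() over (partition by empid order by startdate desc, id desc) rn,
    -- number of rows that share this employee and start date
    count(*) over (partition by empid, startdate) same_date_cnt
  from test
)
select id, empid, role, startdate, same_date_cnt
from ranked
where rn = 1
  and same_date_cnt > 1;  -- keep only employees whose latest date is duplicated
```

On this data only employee 12311 qualifies, with ID 3432 as the winning row; dropping the same_date_cnt condition gives back the plain rn = 1 result.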

You can do this in one step using the MAX aggregate with KEEP; in simplified form:

select employee,
  max(role) keep (dense_rank last order by startdate, id) as role
from employees
group by employee

This uses startdate and id to find the "latest" role; the id only matters when there is a tie on startdate.

Demo with sample data in a CTE:

with employees (ID, EMPLOYEE, ROLE, STARTDATE) as (
            select 3432, 12311, 'Supervisor', timestamp '2016-07-12 00:00:00 UTC' from dual
  union all select 3421, 12311, 'Analyst', timestamp '2016-07-12 00:00:00 UTC' from dual
  union all select 4321, 12311, 'Help Desk', timestamp '2014-05-12 00:00:00 UTC' from dual
  union all select 5432, 23432, 'Manager', timestamp '2012-11-02 00:00:00 UTC' from dual
  union all select 3452, 23432, 'Associate', timestamp '2011-04-23 00:00:00 UTC' from dual
  union all select 7652, 54332, 'Analyst', timestamp '2015-10-15 00:00:00 UTC' from dual
  union all select 5691, 54332, 'Assistant', timestamp '2013-10-15 00:00:00 UTC' from dual
)
select employee,
  max(role) keep (dense_rank last order by startdate, id) as role
from employees
group by employee
order by employee;

  EMPLOYEE ROLE      
---------- ----------
     12311 Supervisor
     23432 Manager   
     54332 Analyst
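If the ID and start date of the winning row are needed as well, the same KEEP clause can be repeated for each column. A sketch on a subset of the sample data (plain DATE literals are used here for brevity, which is an assumption about the column type):

```sql
with employees (id, employee, role, startdate) as (
            select 3432, 12311, 'Supervisor', date '2016-07-12' from dual
  union all select 3421, 12311, 'Analyst',    date '2016-07-12' from dual
  union all select 4321, 12311, 'Help Desk',  date '2014-05-12' from dual
)
select employee,
  -- every KEEP clause uses the same ordering, so all columns come from one row
  max(id)   keep (dense_rank last order by startdate, id) as id,
  max(role) keep (dense_rank last order by startdate, id) as role,
  max(startdate) as startdate
from employees
group by employee;
```

For employee 12311 this yields ID 3432, role Supervisor, and start date 2016-07-12, the same row that RN = 1 selects in the ROW_NUMBER approach.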

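You can use the same function against your joined tables, without having to calculate the ranking manually.

I would use KEEP (the query is the last code block on this page):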

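Thank you! I needed it to be more scalable, so I pulled it from a nested select statement. It appears to be working so far. I like this because it's not a simple 1, 0; it's ranked. SELECT ID, EMPLOYEE, ROLE, STARTDATE, row_number() over (partition by empid order by STARTDATE desc, ID desc) rn FROM (SELECT DISTINCT EMPLOYEE "E.EMPLOYEE" ..... I'm still trying out a few other methods to see which one is the most efficient. But this is awesome! Thanks again!

You're welcome; I'm glad if it helped. If I may make a suggestion: get rid of the bad habit of using double quotes when naming Oracle objects and columns. It only causes problems. By default, names are stored in uppercase, and you can then reference them in any case you like. If you enclose them in double quotes, however, you must always use exactly the lowercase/uppercase/mixed case that was used when the objects were created.

Thanks for the tip!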
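To illustrate the point about double quotes, here is a small sketch. The table names demo_unquoted and demo_quoted are hypothetical and exist only for this example.

```sql
-- Hypothetical tables, used only to show Oracle's identifier case rules.
create table demo_unquoted (startdate date);   -- name is stored as STARTDATE
create table demo_quoted ("StartDate" date);   -- name is stored exactly as StartDate

-- Unquoted identifiers are folded to uppercase, so all of these work:
select startdate   from demo_unquoted;
select STARTDATE   from demo_unquoted;
select "STARTDATE" from demo_unquoted;

-- A quoted mixed-case name must be quoted with the exact same case every time:
select "StartDate" from demo_quoted;   -- works
select startdate   from demo_quoted;   -- fails with ORA-00904 (invalid identifier)
```

An all-uppercase quoted alias such as "STARTDATE" behaves the same as the unquoted name, which is why the quotes in the queries above are harmless but unnecessary.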

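SELECT E.EMPLOYEE as "EMPLOYEE",
       -- E.ID is not in the GROUP BY, so it has to be aggregated as well;
       -- adding E.ID DESC to the ordering breaks ties on ROLE_START_DATE
       MAX(E.ID) KEEP (DENSE_RANK FIRST ORDER BY ROLE_START_DATE DESC, E.ID DESC) as "ID",
       MAX(LR.DESCRIPTION) KEEP (DENSE_RANK FIRST ORDER BY ROLE_START_DATE DESC, E.ID DESC) as "ROLE",
       MAX(ROLE_START_DATE) as "STARTDATE"
FROM EMPLOYEES E JOIN
     ROLES R
     ON E.EMPLOYEE_ID = R.EMPLOYEE_ID JOIN
     LU_ROLES LR
     ON R.ROLE_ID = LR.ROLE_ID
WHERE ROLE_START_DATE <= DATE '2017-12-03'
GROUP BY E.EMPLOYEE;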