SQL Server: eliminating similar cycles in a directed graph


I have the following data:

Table:

CREATE TABLE tblLoop
(
    person1 varchar(20),
    person2 varchar(20),
    ColDate date,
);

INSERT INTO tblLoop VALUES('A','B','2020-01-01'),('A','C','2020-01-01'),('A','D','2020-01-01'),
                          ('B','E','2020-01-02'),('B','F','2020-01-02'),
                          ('D','G','2020-01-03'),('D','H','2020-01-03'),
                          ('F','i','2020-01-04'),
                          ('G','J','2020-01-05'),
                          ('i','A','2020-01-06'),
                          ('J','D','2020-01-07'),
                          ('X','Y','2020-01-08'),('X','Z','2020-01-08'),
                          ('Z','X','2020-01-09'),
                          ('Y','W','2020-01-09');   

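Requirement: I need to find the persons who form a cycle. For example, the given data contains 3 cycles:

Cycle 1: A connects to B, B to F, F to i, and i back to A

Cycle 2: A connects to D, D to G, G to J, and J back to D

Cycle 3: X connects to Z, and Z back to X

Expected result: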

LoopFound
--------------------
A->B->F->i->A
A->D->G->J->D
X->Z->X
My attempt:

;WITH CTE AS 
(
      SELECT Person1, Person2, 
             CONVERT(VARCHAR(MAX), (','+ Person1+ ','+ Person2+ ',')) AS nodes, 1 AS lev, 
             (CASE WHEN Person1 = Person2 THEN 1 ELSE 0 END) AS has_cycle
      FROM tblLoop e
      UNION ALL
      SELECT cte.Person1, e.Person2,
             CONVERT(VARCHAR(MAX), (cte.nodes+ e.Person2+ ',')), lev + 1,
             (CASE WHEN cte.nodes LIKE ('%,'+ e.Person2+ ',%') THEN 1 ELSE 0 END) AS has_cycle
      FROM CTE 
      JOIN tblLoop e ON e.Person1 = cte.Person2
      WHERE cte.has_cycle = 0 
     )
SELECT *
FROM CTE
WHERE has_cycle = 1;

Note: the query above returns multiple combinations of the same cycles instead of one row per cycle.
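The extra rows appear because the anchor member starts one walk from every edge, so each cycle is reported once per edge it can be entered from, together with walks that merely lead into a cycle. One way to keep a single row per cycle without changing the table, offered here as a sketch and not as the asker's method, is to accept a cycle only when it is walked from its smallest node name. The table variable `@edges` and the column names `start_node`, `last_node`, and `closed` are hypothetical, and the sketch assumes that names contain no commas or LIKE wildcard characters.

```sql
-- Sketch only: @edges, start_node, last_node and closed are hypothetical names.
DECLARE @edges TABLE (person1 varchar(20), person2 varchar(20));
INSERT INTO @edges VALUES
    ('A','B'),('A','C'),('A','D'),('B','E'),('B','F'),
    ('D','G'),('D','H'),('F','i'),('G','J'),('i','A'),
    ('J','D'),('X','Y'),('X','Z'),('Z','X'),('Y','W');

;WITH walk AS
(
    -- Anchor: one walk per edge that does not step below its own start node.
    SELECT person1 AS start_node, person2 AS last_node,
           CONVERT(varchar(max), ',' + person1 + ',' + person2 + ',') AS nodes,
           CASE WHEN person1 = person2 THEN 1 ELSE 0 END AS closed
    FROM @edges
    WHERE person2 >= person1
    UNION ALL
    -- Recursive step: extend the walk, never visiting a node smaller than
    -- the start node and never revisiting a node other than the start node.
    SELECT w.start_node, e.person2,
           CONVERT(varchar(max), w.nodes + e.person2 + ','),
           CASE WHEN e.person2 = w.start_node THEN 1 ELSE 0 END
    FROM walk w
    JOIN @edges e ON e.person1 = w.last_node
    WHERE w.closed = 0
      AND e.person2 >= w.start_node
      AND (e.person2 = w.start_node
           OR w.nodes NOT LIKE '%,' + e.person2 + ',%')
)
-- Strip the outer commas and turn the rest into arrows.
SELECT REPLACE(SUBSTRING(nodes, 2, LEN(nodes) - 2), ',', '->') AS LoopFound
FROM walk
WHERE closed = 1;
```

Because every cycle has exactly one smallest node, each cycle can close only once. This variant reports the cycle itself, so on the sample data the second loop would come out as D->G->J->D and not as the A->D->G->J->D shown in the expected result; if the lead-in node matters, a start-node filter is the better fit.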

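Following Kevin's comment, the answer is to include a flag that marks a given node as a valid starting point, which can then be used in the query:

In this example 'A' and 'X' are the starting points, so we flag every record whose originator (person1) is 'A' or 'X':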

CREATE TABLE #tblLoop
(
    person1 varchar(20),
    person2 varchar(20),
    ColDate date,
    isRoot INT
);
INSERT INTO #tblLoop VALUES('A','B','2020-01-01',1),
                          ('A','C','2020-01-01',1),
                          ('A','D','2020-01-01',1),
                          ('B','E','2020-01-02',0),
                          ('B','F','2020-01-02',0),
                          ('D','G','2020-01-03',0),
                          ('D','H','2020-01-03',0),
                          ('F','i','2020-01-04',0),
                          ('G','J','2020-01-05',0),
                          ('i','A','2020-01-06',0),
                          ('J','D','2020-01-07',0),
                          ('X','Y','2020-01-08',1),
                          ('X','Z','2020-01-08',1),
                          ('Z','X','2020-01-09',0),
                          ('Y','W','2020-01-09',0);   
The query can then be modified as follows:

;WITH CTE AS 
(
      SELECT Person1, Person2, isRoot,
             CONVERT(VARCHAR(MAX), (','+ Person1+ ','+ Person2+ ',')) AS nodes, 1 AS lev, 
             (CASE WHEN Person1 = Person2 THEN 1 ELSE 0 END) AS has_cycle
      FROM #tblLoop e
      UNION ALL
      SELECT cte.Person1, e.Person2, cte.isRoot,
             CONVERT(VARCHAR(MAX), (cte.nodes+ e.Person2+ ',')), lev + 1,
             (CASE WHEN cte.nodes LIKE ('%,'+ e.Person2+ ',%') THEN 1 ELSE 0 END) AS has_cycle
      FROM CTE 
      JOIN #tblLoop e ON e.Person1 = cte.Person2
      WHERE cte.has_cycle = 0 
     )
SELECT *
FROM CTE
WHERE has_cycle = 1
AND isRoot = 1

Credit for the idea goes to Kevin; this is just a working implementation of it.
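The query returns the path in the comma-delimited `nodes` column (for example `,A,B,F,i,A,`), while the expected result uses arrows. A minimal sketch of the conversion, using a hypothetical variable `@nodes` in place of the column:

```sql
-- @nodes is a hypothetical stand-in for the CTE's nodes column.
DECLARE @nodes varchar(max) = ',A,B,F,i,A,';

-- Drop the leading and trailing comma, then replace the separators.
SELECT REPLACE(SUBSTRING(@nodes, 2, LEN(@nodes) - 2), ',', '->') AS LoopFound;
-- returns A->B->F->i->A
```

Applying the same expression to `nodes` in the final SELECT, in place of `SELECT *`, should give the `LoopFound` column from the question, assuming no name contains a comma.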

I tried the following two steps:

Step 1: find the start nodes of the graph

--Create table to store start nodes
IF OBJECT_ID('dbo.Temp_tblLoop', 'U') IS NOT NULL 
BEGIN
  DROP TABLE dbo.Temp_tblLoop; 
END

--Query to find start nodes.
;WITH CTE AS
(
    SELECT t.* 
    FROM tblLoop t
    WHERE person1 IN (SELECT person2 FROM tblLoop t2 WHERE t.ColDate<= t2.ColDate) OR
          person2 IN (SELECT person1 FROM tblLoop t3 WHERE t.ColDate<= t3.ColDate)
)
SELECT DISTINCT person1 INTO Temp_tblLoop
FROM CTE 
WHERE person1 NOT IN (SELECT person2 FROM CTE);

Please let me know in the comments if anything can be improved.
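Only step 1 is shown in the post. One possible second step, sketched here as an assumption and not as the author's code, is to seed the recursive query from the start nodes found in step 1 instead of from every edge. The table variables `@edges` and `@starts` and the CTE names `linked` and `walk` are hypothetical; they stand in for `tblLoop` and `Temp_tblLoop` so that the script runs on its own.

```sql
-- Sketch only: @edges and @starts stand in for tblLoop and Temp_tblLoop.
DECLARE @edges TABLE (person1 varchar(20), person2 varchar(20), ColDate date);
INSERT INTO @edges VALUES
    ('A','B','2020-01-01'),('A','C','2020-01-01'),('A','D','2020-01-01'),
    ('B','E','2020-01-02'),('B','F','2020-01-02'),
    ('D','G','2020-01-03'),('D','H','2020-01-03'),
    ('F','i','2020-01-04'),('G','J','2020-01-05'),
    ('i','A','2020-01-06'),('J','D','2020-01-07'),
    ('X','Y','2020-01-08'),('X','Z','2020-01-08'),
    ('Z','X','2020-01-09'),('Y','W','2020-01-09');

-- Step 1 (same logic as the query above): collect the start nodes.
DECLARE @starts TABLE (person1 varchar(20));
;WITH linked AS
(
    SELECT t.person1, t.person2
    FROM @edges t
    WHERE t.person1 IN (SELECT t2.person2 FROM @edges t2 WHERE t.ColDate <= t2.ColDate)
       OR t.person2 IN (SELECT t3.person1 FROM @edges t3 WHERE t.ColDate <= t3.ColDate)
)
INSERT INTO @starts (person1)
SELECT DISTINCT person1
FROM linked
WHERE person1 NOT IN (SELECT person2 FROM linked);

-- Step 2 (assumed): walk the graph only from those start nodes.
;WITH walk AS
(
    SELECT e.person1, e.person2,
           CONVERT(varchar(max), ',' + e.person1 + ',' + e.person2 + ',') AS nodes,
           CASE WHEN e.person1 = e.person2 THEN 1 ELSE 0 END AS has_cycle
    FROM @edges e
    WHERE e.person1 IN (SELECT s.person1 FROM @starts s)
    UNION ALL
    SELECT w.person1, e.person2,
           CONVERT(varchar(max), w.nodes + e.person2 + ','),
           CASE WHEN w.nodes LIKE '%,' + e.person2 + ',%' THEN 1 ELSE 0 END
    FROM walk w
    JOIN @edges e ON e.person1 = w.person2
    WHERE w.has_cycle = 0
)
SELECT REPLACE(SUBSTRING(nodes, 2, LEN(nodes) - 2), ',', '->') AS LoopFound
FROM walk
WHERE has_cycle = 1;
```

On the sample data, step 1 yields 'A' and 'X', and the walk from those two nodes should produce the three rows of the expected result (row order is not guaranteed). Whether the date-based rule in step 1 picks sensible start nodes on other data sets is an open question.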

If possible, I would suggest adding an extra flag to your table indicating that persons A and X are root persons. In the SELECT statement you can then filter for only the combinations that start from those persons.

Is there any way other than setting a flag? I tried using a query to find these start nodes: ;WITH CTE AS (SELECT t.* FROM tblLoop t WHERE person1 IN (SELECT person2 FROM tblLoop t2 WHERE t.ColDate…