How do I sort a linked list in SQL?


I have implemented a linked list as a self-referencing database table:

CREATE TABLE LinkedList(
    Id bigint NOT NULL,
    ParentId bigint NULL,
    SomeData nvarchar(50) NOT NULL)

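Id is the primary key, and ParentId is the Id of the previous node in the list. The first node has ParentId = NULL.

Now I want to select from the table, with the rows sorted in the same order in which they appear as nodes in the list.

For example, if the table contains the rows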

Id      ParentId  SomeData
24971   NULL      0
38324   24971     1
60088   60089     3
60089   38324     2
61039   61497     5
61497   60088     4
109397  109831    7
109831  61039     6

then sorting it by this criterion should give:

Id      ParentId  SomeData
24971   NULL      0
38324   24971     1
60089   38324     2
60088   60089     3
61497   60088     4
61039   61497     5
109831  61039     6
109397  109831    7

The SomeData column is only there as a control, so please don't cheat by doing ORDER BY SomeData :-)

In Oracle:

SELECT Id, ParentId, SomeData
FROM (
  SELECT ll.*, level AS lvl
  FROM LinkedList ll
  START WITH
    ParentID IS NULL
  CONNECT BY
    ParentId = PRIOR Id
)
ORDER BY
  lvl

Also, using NULL as the ParentID is not good practice, because it is not searchable through an index. Insert a surrogate root with an id of 0 or -1 and use START WITH ParentID = 0 instead.

I found a SQL Server solution, but it looks bigger and much less elegant than Quassnoi's:

WITH SortedList (Id, ParentId, SomeData, Level)
AS
(
  SELECT Id, ParentId, SomeData, 0 as Level
    FROM LinkedList
   WHERE ParentId IS NULL
  UNION ALL
  SELECT ll.Id, ll.ParentId, ll.SomeData, Level+1 as Level
    FROM LinkedList ll
   INNER JOIN SortedList as s
      ON ll.ParentId = s.Id
)

SELECT Id, ParentId, SomeData
  FROM SortedList
 ORDER BY Level

(Edit: d'oh! You found it too while I was debugging!)

In SQL Server:

;WITH cte (Id, ParentId, SomeData, [Level]) AS (
    SELECT Id, ParentId, SomeData, 0
    FROM LinkedList
    WHERE ParentId IS NULL
    UNION ALL
    SELECT ll.Id, ll.ParentId, ll.SomeData, cte.[Level] + 1
    FROM LinkedList ll
    INNER JOIN cte ON ll.ParentID = cte.ID
)
SELECT * FROM cte
ORDER BY [Level]

PostgreSQL version

Create the table, indexes, and data:

DROP TABLE IF EXISTS LinkedList;

CREATE TABLE LinkedList (
    Id BIGINT NOT NULL,
    ParentId BIGINT NULL,
    SomeData VARCHAR(50)
);

CREATE INDEX LinkedList_Id_idx on LinkedList (Id);
CREATE index LinkedList_ParentId_idx on LinkedList (ParentId);

INSERT INTO LinkedList
    (Id, ParentId, SomeData)
VALUES 
    (24971,   NULL,      0),
    (38324,   24971,     1),
    (60088,   60089,     3),
    (60089,   38324,     2),
    (61039,   61497,     5),
    (61497,   60088,     4),
    (109397,  109831,    7),
    (109831,  61039,     6);

The actual query:

WITH RECURSIVE SortedList AS (
    SELECT
        *,
        0 AS SortKey
    FROM LinkedList
    WHERE ParentId IS NULL
    UNION ALL (
        SELECT
            LinkedList.*,
            SortedList.SortKey + 1 AS SortKey
        FROM LinkedList
        INNER JOIN SortedList
            ON (LinkedList.ParentId = SortedList.Id)
    )
)
SELECT
    *
FROM SortedList
ORDER BY SortKey;

Result:

   id   | parentid | somedata | sortkey
--------+----------+----------+---------
  24971 |          | 0        |       0
  38324 |    24971 | 1        |       1
  60089 |    38324 | 2        |       2
  60088 |    60089 | 3        |       3
  61497 |    60088 | 4        |       4
  61039 |    61497 | 5        |       5
 109831 |    61039 | 6        |       6
 109397 |   109831 | 7        |       7

I also ran a benchmark:

\set N 10000

DELETE FROM LinkedList;
INSERT INTO LinkedList VALUES (1, NULL, 1);
INSERT INTO LinkedList (
    SELECT
        generate_series AS Id,
        (generate_series - 1) AS ParentId,
        generate_series AS SomeData
    FROM GENERATE_SERIES(2, :N)
);

EXPLAIN ANALYZE
WITH RECURSIVE SortedList AS (
    SELECT
        *,
        0 AS SortKey
    FROM LinkedList
    WHERE ParentId IS NULL
    UNION ALL (
        SELECT
            LinkedList.*,
            SortedList.SortKey + 1 AS SortKey
        FROM LinkedList
        INNER JOIN SortedList
            ON (LinkedList.ParentId = SortedList.Id)
    )
)
SELECT
    *
FROM SortedList
ORDER BY SortKey;

Result:

Sort  (cost=6236.12..6300.16 rows=25616 width=138) (actual time=17857.640..17858.207 rows=10000 loops=1)
  Sort Key: sortedlist.sortkey
  Sort Method: quicksort  Memory: 1166kB
  CTE sortedlist
    ->  Recursive Union  (cost=4.40..2007.10 rows=25616 width=138) (actual time=0.032..17844.139 rows=10000 loops=1)
          ->  Bitmap Heap Scan on linkedlist  (cost=4.40..42.78 rows=16 width=138) (actual time=0.031..0.032 rows=1 loops=1)
                Recheck Cond: (parentid IS NULL)
                Heap Blocks: exact=1
                ->  Bitmap Index Scan on linkedlist_parentid_idx  (cost=0.00..4.40 rows=16 width=0) (actual time=0.006..0.006 rows=2 loops=1)
                      Index Cond: (parentid IS NULL)
          ->  Hash Join  (cost=5.20..145.20 rows=2560 width=138) (actual time=0.896..1.780 rows=1 loops=10000)
                Hash Cond: (linkedlist_1.parentid = sortedlist_1.id)
                ->  Seq Scan on linkedlist linkedlist_1  (cost=0.00..96.00 rows=3200 width=134) (actual time=0.002..0.784 rows=10000 loops=10000)
                ->  Hash  (cost=3.20..3.20 rows=160 width=12) (actual time=0.001..0.001 rows=1 loops=10000)
                      Buckets: 1024  Batches: 1  Memory Usage: 9kB
                      ->  WorkTable Scan on sortedlist sortedlist_1  (cost=0.00..3.20 rows=160 width=12) (actual time=0.000..0.001 rows=1 loops=10000)
  ->  CTE Scan on sortedlist  (cost=0.00..512.32 rows=25616 width=138) (actual time=0.034..17851.344 rows=10000 loops=1)
Planning Time: 0.163 ms
Execution Time: 17858.957 ms

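So this query is very slow: about 17.9 seconds for 10,000 rows. The plan shows where the time goes: the recursive step runs 10,000 times, and every iteration does a sequential scan of linkedlist for the hash join rather than using the ParentId index.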
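If the recursive query is too slow for a given table size, one possible workaround (not something proposed in the answers here) is to fetch the rows unordered and follow the links in application code. The sketch below is a minimal illustration in Python; the function name order_linked_rows and the tuple layout (Id, ParentId, SomeData) are assumptions made for this example, and it assumes every node has at most one successor.

```python
def order_linked_rows(rows):
    """Return (Id, ParentId, SomeData) tuples in linked-list order."""
    # Index every row by the Id of its predecessor; the head row is stored under None.
    by_parent = {row[1]: row for row in rows}
    ordered = []
    node = by_parent.get(None)  # the head has ParentId = NULL
    while node is not None:
        ordered.append(node)
        node = by_parent.get(node[0])  # the row whose ParentId is the current Id
    return ordered


# Sample data from the question, in table order.
rows = [
    (24971, None, "0"),
    (38324, 24971, "1"),
    (60088, 60089, "3"),
    (60089, 38324, "2"),
    (61039, 61497, "5"),
    (61497, 60088, "4"),
    (109397, 109831, "7"),
    (109831, 61039, "6"),
]

ordered = order_linked_rows(rows)
```

Each row is visited once and each lookup is a dictionary access, so the walk is linear in the number of rows, at the cost of transferring the whole table to the client.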

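This is a very nice answer. How do I do the same thing in SQL Server 2005?

+1, but not for the anti-NULL comment: using 0 or -1 prevents you from using a foreign key to enforce integrity.

You can always insert the surrogate root and keep it in the database. If the table contains many records, a full scan for the first entry will be too slow.

@Quassnoi: You don't need to create a stored procedure for SQL Server 2005 and later. They support common table expressions (CTEs), which let you do the same thing as CONNECT BY.

Yes, a CTE is the way to do this in SQL Server. See my solution below.

About your "don't cheat" note: it would be better if you had chosen sample data that does not give the correct result when sorted on its own. That way "cheating" would not be an option.

Hmm... I think you didn't understand my intent in adding that column. I just wanted to make life easier for those who might come up with an answer. Call it a "test harness".

Actually, the name SortedList is a bit confusing: nothing is sorted there, it is just recursion... You need an ORDER BY in the final select (see my reply).

Fair enough, I added the Level column and an ORDER BY to the final select.

Can a similar approach be used in SQLite?

There is one issue with using a CTE, namely the recursion limit. By default this limit is 100, but you can raise it to 32767 by adding "option (maxrecursion 32767)" at the end of the query.

Ah, I misread the question; I thought you wanted to order by some data. If you want the rows in linked order, then [Level] is the way to go.

Care to explain why the semicolon at the start?

@Greg, because CTEs are picky; you almost always need the leading ; so I might as well add it. Put anything above this example and it will blow up.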
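On the SQLite question raised in the comments: SQLite has supported WITH RECURSIVE since version 3.8.3, so the same recursive CTE should carry over. The sketch below is one way to try it, using Python's standard sqlite3 module with an in-memory database; the column types in the CREATE TABLE statement are assumptions adapted for SQLite, not taken from the original posts.

```python
import sqlite3

# In-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE LinkedList ("
    " Id INTEGER PRIMARY KEY,"
    " ParentId INTEGER NULL,"
    " SomeData TEXT NOT NULL)"
)

# Sample data from the question.
conn.executemany(
    "INSERT INTO LinkedList (Id, ParentId, SomeData) VALUES (?, ?, ?)",
    [
        (24971, None, "0"),
        (38324, 24971, "1"),
        (60088, 60089, "3"),
        (60089, 38324, "2"),
        (61039, 61497, "5"),
        (61497, 60088, "4"),
        (109397, 109831, "7"),
        (109831, 61039, "6"),
    ],
)

# Same shape as the SQL Server and PostgreSQL queries: start at the head
# (ParentId IS NULL) and follow ParentId = Id, counting the depth as Level.
query = """
WITH RECURSIVE SortedList (Id, ParentId, SomeData, Level) AS (
    SELECT Id, ParentId, SomeData, 0
      FROM LinkedList
     WHERE ParentId IS NULL
    UNION ALL
    SELECT ll.Id, ll.ParentId, ll.SomeData, s.Level + 1
      FROM LinkedList ll
      JOIN SortedList s ON ll.ParentId = s.Id
)
SELECT Id, ParentId, SomeData
  FROM SortedList
 ORDER BY Level
"""
sorted_rows = conn.execute(query).fetchall()
```

As in the other answers, the ordering comes from the final ORDER BY Level, not from the recursion itself.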