C#: Efficient data structure for finding the intersection of two lists


c# algorithm data-structures


I have two very large lists of lists, A and B. I need to find the intersection between the elements of these lists, for example:

A[0] = { 1, 2, 3};
B[0] = {2, 3, 4};

Intersection = { 2, 3 };

My implementation:

List<int> intersection = A[0].Intersect(B[0]).ToList();


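This solution takes a long time to execute. I would like to know whether there is a better way to do this, and whether there is a more efficient data structure that would give a better running time.

Thanks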

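You should use a hash set for this, which in C# is HashSet<T>. A lookup in a hash set is O(1) on average (given a decent hash function and an array-backed table), as opposed to O(n) for a list.

With LINQ in C# this is basically "built in": Intersect uses a hash set internally, so the intersection of two lists is computed in O(n) rather than O(n^2).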

var intersection = a.Intersect(b).ToList();
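To illustrate why the hash-based approach is linear, here is a minimal sketch of roughly what a hash-set-based intersection does. It is not the actual LINQ source; the names IntersectSketch and HashIntersect are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;

static class IntersectSketch
{
    // Hypothetical helper: yields the distinct elements of `first`
    // that also occur in `second`.
    public static IEnumerable<T> HashIntersect<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        // Building the set takes O(m) expected time for m elements in `second`.
        var remaining = new HashSet<T>(second);

        foreach (T item in first)
        {
            // Remove returns true only the first time a shared value is seen,
            // so duplicates in `first` are reported once.
            if (remaining.Remove(item))
                yield return item;
        }
    }

    static void Main()
    {
        var a = new List<int> { 1, 2, 3 };
        var b = new List<int> { 2, 3, 4 };
        Console.WriteLine(string.Join(", ", HashIntersect(a, b))); // prints "2, 3"
    }
}
```

Each element of the first sequence costs one expected O(1) set operation, so the whole pass is O(n + m) on average. The result is produced lazily, so iterating it directly avoids the extra copy that a ToList() call would make.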

Code example using a HashSet:

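HashSet<string> lst1 = new HashSet<string>
     { "id1", "id2", "id3" };
HashSet<string> lst2 = new HashSet<string>
     { "id2", "id3", "id4" };
// lst1 is modified in place so that only the intersecting items remain
lst1.IntersectWith(lst2);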

PS: I used strings in the example, but you can use your own integer values.
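Since IntersectWith modifies the set it is called on, a variant that leaves the inputs untouched can be useful when the original lists are needed afterwards. The following is a small sketch with integer values; IntersectWithDemo and IntersectCopy are hypothetical names, not part of the framework.

```csharp
using System;
using System.Collections.Generic;

static class IntersectWithDemo
{
    // Hypothetical helper: returns a new set and leaves both inputs unchanged.
    public static HashSet<int> IntersectCopy(IEnumerable<int> first, IEnumerable<int> second)
    {
        var result = new HashSet<int>(first); // copy, so `first` is not modified
        result.IntersectWith(second);          // keep only values also present in `second`
        return result;
    }

    static void Main()
    {
        var a = new List<int> { 1, 2, 3 };
        var b = new List<int> { 2, 3, 4 };
        HashSet<int> common = IntersectCopy(a, b);
        Console.WriteLine(common.Count); // prints 2
        Console.WriteLine(a.Count);      // prints 3, the input list is untouched
    }
}
```

The copy costs one extra pass over the first collection. When the inputs can be discarded, calling IntersectWith directly on an existing set avoids that pass.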

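Great, thanks! Regarding .ToList(): does it add an extra O(N) of complexity, and how can I avoid it?

This is probably the best solution, but it is worth remembering that something that is O(1) can still be slower than something that is O(N). These characterizations do not specify how long a search takes; they specify how that time changes as the collection grows. A single expensive hash computation can cost as much as O(n) simple comparisons. This is an almost universally overlooked reason why hash tables so rarely live up to people's expectations relative to binary search.

Good question, thang. That would make things easier: only one of the collections needs to be sorted in order to use binary search. Pick one.

The OP's problem is entirely analogous to a foreign-key lookup in a relational database. What is left unspecified is whether A or B may contain duplicate values, and what should happen if they do.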
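The binary-search alternative mentioned in the comments could look roughly like the sketch below. It assumes integer elements and that one collection may be sorted in place; SortedIntersectSketch and IntersectWithSorted are hypothetical names used only here.

```csharp
using System;
using System.Collections.Generic;

static class SortedIntersectSketch
{
    // Hypothetical helper: `sortedB` must already be sorted in ascending order.
    public static List<int> IntersectWithSorted(IEnumerable<int> a, List<int> sortedB)
    {
        var result = new List<int>();
        foreach (int x in a)
        {
            // List<T>.BinarySearch returns a non-negative index when the value is found.
            if (sortedB.BinarySearch(x) >= 0)
                result.Add(x);
        }
        return result;
    }

    static void Main()
    {
        var a = new List<int> { 3, 1, 2 };
        var b = new List<int> { 4, 2, 3 };
        b.Sort(); // only one of the two collections needs sorting
        Console.WriteLine(string.Join(", ", IntersectWithSorted(a, b))); // prints "3, 2"
    }
}
```

This costs O(m log m) for the sort plus O(n log m) for the lookups, and it needs no hashing. Unlike the set-based versions, this sketch keeps duplicates that occur in the unsorted collection, which is one way of answering the open question about duplicate values.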
