C#: Why does the time complexity of HashSet<T>.IntersectWith depend on the equality comparer?


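I'm referring to the MSDN documentation for HashSet<T>.IntersectWith:

Remarks:

If the collection represented by the other parameter is a HashSet<T> collection with the same equality comparer as the current HashSet<T> object, this method is an O(n) operation. Otherwise, this method is an O(n + m) operation, where n is Count and m is the number of elements in other.

I'm trying to understand what role the equality comparer plays.

If other is also a HashSet, the intersection could work like this: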

T[] array = this.ToArray(); // O(1)
foreach (T item in array) // iterate through this => O(n)
    if (!other.Contains(item)) // other is a HashSet => O(1)
        this.Remove(item); // this is a HashSet => O(1)

This gives O(n) in total, as MSDN states. But as far as I understand, if other is a HashSet, it should always be O(n), regardless of which equality comparer it has.

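If other is not a HashSet, then other.Contains in the snippet above is more expensive (for example O(log m) for a SortedSet or O(m) for a List). Because the operations are nested, the costs have to be multiplied (O(n*log m) for a SortedSet or O(n*m) for a List) to get the total complexity, which is worse than the stated O(n+m). So the approach seems to be different when other is not a HashSet.

Maybe it is done like this: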

HashSet<T> intersectionSet = new HashSet<T>(this.Comparer); // O(1)
foreach (T item in other) // iterate through other => O(m)
    if (this.Contains(item)) // this is a HashSet => O(1)
        intersectionSet.Add(item); // intersectionSet is a HashSet => O(1)
this.Clear(); // O(n)
foreach (T item in intersectionSet) // O(m) in the worst case, because intersectionSet can have at most m elements
    this.Add(item); // O(1)

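So we get O(m+n), as MSDN says. Again, I can't see what role the equality comparer plays in the complexity.

Since Microsoft put a lot of thought and manpower into designing and implementing IntersectWith, I assume their version (in which the equality comparer affects the time complexity) is the best one. So I guess I have made a mistake somewhere in my reasoning. Could you point it out to me?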

You wrote that if other is also a HashSet, the intersection could work like this:

T[] array = this.ToArray(); // O(1)
foreach (T item in array) // iterate through this => O(n)
    if (!other.Contains(item)) // other is a HashSet => O(1)  A
        this.Remove(item); // this is a HashSet => O(1)       B

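This would be an incorrect implementation if the two hash sets used different equality comparers. The line I marked A uses other's equality comparer, and the line I marked B uses this's equality comparer. So the line other.Contains(item) checks the wrong thing: it checks whether other thinks it contains item. What it should check is whether this thinks other contains item.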
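To make the mismatch concrete, here is a minimal sketch, assuming a case-insensitive set intersected with a case-sensitive one. NaiveIntersect is a hypothetical helper that reproduces the loop with lines A and B; it is not part of the framework. The last two lines compare it with the built-in IntersectWith.

```csharp
using System;
using System.Collections.Generic;

public static class ComparerMismatchDemo
{
    // Hypothetical helper: the loop from the question, lines A and B included.
    public static void NaiveIntersect<T>(HashSet<T> self, HashSet<T> other)
    {
        // Copy first, because a set must not be modified while it is enumerated.
        T[] snapshot = new T[self.Count];
        self.CopyTo(snapshot);

        foreach (T item in snapshot)
            if (!other.Contains(item))   // A: decided by other's comparer
                self.Remove(item);       // B: decided by self's comparer
    }

    public static void Main()
    {
        var naive   = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "a" };
        var builtIn = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "a" };
        var other   = new HashSet<string>(StringComparer.Ordinal) { "A" };

        NaiveIntersect(naive, other);   // other says it has no "a", so "a" is removed
        builtIn.IntersectWith(other);   // this treats "A" as equal to "a", so "a" stays

        Console.WriteLine(naive.Count);    // 0
        Console.WriteLine(builtIn.Count);  // 1
    }
}
```

From the point of view of the case-insensitive set, "A" and "a" are the same element, so the intersection should keep it. Only the version that asks the question with this's comparer does.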


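But apart from the array creation (which is not O(1), and which Microsoft can avoid by using HashSet's private fields), what you came up with is pretty much what Microsoft actually does when the equality comparers match.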
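The two cases can be sketched as follows. This is an illustration under stated assumptions, not Microsoft's implementation: IntersectSketch and Intersect are hypothetical names, and the slow path here rebuilds other as a temporary set under this's comparer, which costs extra memory that the real method may avoid.

```csharp
using System;
using System.Collections.Generic;

public static class IntersectSketch
{
    // Hypothetical helper, not the framework's code.
    public static void Intersect<T>(HashSet<T> self, IEnumerable<T> other)
    {
        // Fast path candidate: other is already a hash set.
        HashSet<T> lookup = other as HashSet<T>;

        // If other is not a HashSet<T>, or it uses a different comparer,
        // its Contains cannot answer "does self think other contains item?".
        // Rebuild it under self's comparer: O(m).
        if (lookup == null || !lookup.Comparer.Equals(self.Comparer))
            lookup = new HashSet<T>(other, self.Comparer);

        // One pass over self with O(1) lookups: O(n).
        // RemoveWhere avoids the array copy from the question.
        self.RemoveWhere(item => !lookup.Contains(item));
    }
}
```

With matching comparers only the final O(n) pass runs. Otherwise the O(m) rebuild is added, giving O(n + m), which matches the two cases in the documentation.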
The statement: foreach() [O(m)] * other.Contains(item) [O(n+m)]. O(log m) applies when you have a binary tree; this is a hash.