C# Fast Set Comparison

c# data-structures

I have the following data type:

ISet<IEnumerable<Foo>>

But this is not (because "AB" is repeated here):

However, to make this work, so that the set contains no duplicates, I need to wrap my IEnumerable in some other data type:

ISet<Seq>
//where
Seq : IEnumerable<Foo>, IEquatable<Seq>

Thanks.

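O(1) essentially means you are not allowed to compare the values of the elements. If you can represent each sequence as a list of immutable objects (with a cache, so that no duplicates exist across all instances), you can achieve it, because you then only need to compare the first element. This is similar to how string interning works.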

Insert would have to search all existing instances for "current" + the next "element". Some kind of dictionary may be a reasonable approach.

Edit: I think this is just an attempt at coming up with an approach.
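The interning idea above can be sketched roughly as follows. This is a minimal illustration, not the answerer's code: the names Node<T>, Intern and FromItems are hypothetical, the cache is never cleared (so every sequence ever built stays in memory), and it is not thread-safe. It assumes C# 7 tuples and that T implements Equals and GetHashCode correctly.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names: Node<T>, Intern and FromItems are illustrative only.
public sealed class Node<T>
{
    // Cache key: (head element, identity of the already-interned tail).
    // Node<T> does not override Equals, so tails are compared by reference.
    private static readonly Dictionary<(T, Node<T>), Node<T>> cache =
        new Dictionary<(T, Node<T>), Node<T>>();

    public T Head { get; }
    public Node<T> Tail { get; }   // null marks the end of the sequence

    private Node(T head, Node<T> tail)
    {
        Head = head;
        Tail = tail;
    }

    // Returns the single shared node for head + tail, creating it on first use.
    public static Node<T> Intern(T head, Node<T> tail)
    {
        var key = (head, tail);
        if (!cache.TryGetValue(key, out Node<T> node))
        {
            node = new Node<T>(head, tail);
            cache[key] = node;
        }
        return node;
    }

    // Builds an interned sequence from back to front: one dictionary lookup per item.
    public static Node<T> FromItems(params T[] items)
    {
        Node<T> node = null;
        for (int i = items.Length - 1; i >= 0; i--)
            node = Intern(items[i], node);
        return node;
    }
}
```

With this shape, two sequences with the same items end up as the same object, so a HashSet<Node<T>> can rely on the default reference equality. The O(n) cost moves into construction, and comparison becomes a single reference check.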

I've provided an implementation of the sequence for you below. A few things to note:

This only works if the IEnumerable returns the same items every time it is enumerated, and those items do not change for as long as this object is in use.

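The hash code is cached. The first time it is requested, it is computed from a full iteration of the underlying sequence (feel free to improve it if you know a better hash code algorithm). Because it only needs to be computed once, it can effectively be considered O(1) if you use it often. Adding to the set may be slightly slower (that is when the hash is first computed), but searching or removing will be very fast.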

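The Equals method compares the hash codes first. If the hash codes differ, the objects cannot be equal (provided the hash code is implemented correctly on all the objects in the sequence and nothing is mutated). As long as your collision rate is low and you are usually comparing items that are actually unequal, the equality check will usually stop at the hash code check. If it gets past that, the sequences have to be iterated (there is no avoiding that). So Equals may average O(1), even though its worst case is still O(n).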

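using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class Foo<T> : IEnumerable<T>
{
    private IEnumerable<T> sequence;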

private int? myHashCode = null;

public Foo(IEnumerable<T> sequence)
{
    this.sequence = sequence;
}

public IEnumerator<T> GetEnumerator()
{
    return sequence.GetEnumerator();
}

IEnumerator IEnumerable.GetEnumerator()
{
    return sequence.GetEnumerator();
}

public override bool Equals(object obj)
{
    Foo<T> other = obj as Foo<T>;
    if(other == null)
        return false;

    //if the hash codes are different we don't need to bother doing a deep equals check
    //the hash code is cached, so it's fast.
    if (GetHashCode() != obj.GetHashCode())
        return false;

    return Enumerable.SequenceEqual(sequence, other.sequence);
}

public override int GetHashCode()
{
    //note that the hash code is cached, so the underlying sequence 
    //needs to not change.
    return myHashCode ?? populateHashCode();
}

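private int populateHashCode()
{
    const int somePrimeNumber = 37;
    int hash = 1;
    //unchecked: let the arithmetic wrap on overflow instead of throwing
    unchecked
    {
        foreach (T item in sequence)
        {
            //null items contribute 0 rather than throwing a NullReferenceException
            hash = (hash * somePrimeNumber) + (item == null ? 0 : item.GetHashCode());
        }
    }

    //store the result so later calls skip the iteration
    myHashCode = hash;
    return hash;
}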

}
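For comparison, the same element-wise equality can be handed to a HashSet without a wrapper type, by passing an IEqualityComparer. This is a different technique from the answer above, shown only as a sketch: SequenceComparer and Demo are hypothetical names, and nothing is cached here, so every Add or Contains rehashes the whole sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; the name SequenceComparer is illustrative only.
public sealed class SequenceComparer<T> : IEqualityComparer<IEnumerable<T>>
{
    public bool Equals(IEnumerable<T> x, IEnumerable<T> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);   // element-wise and order-sensitive, O(n)
    }

    public int GetHashCode(IEnumerable<T> sequence)
    {
        if (sequence == null) return 0;
        unchecked   // wrap on overflow instead of throwing
        {
            int hash = 1;
            foreach (T item in sequence)
                hash = hash * 37 + (item == null ? 0 : item.GetHashCode());
            return hash;   // recomputed on every call: no caching in this variant
        }
    }
}

public static class Demo
{
    public static int CountDistinct()
    {
        var set = new HashSet<IEnumerable<string>>(new SequenceComparer<string>());
        set.Add(new[] { "A", "B" });
        set.Add(new List<string> { "A", "B" });   // same items in the same order: rejected
        set.Add(new[] { "B", "A" });              // different order: kept
        return set.Count;
    }
}
```

The trade-off is that the wrapper pays for hashing once per sequence, while the comparer pays on every set operation. The wrapper is the better fit when the same sequences are looked up repeatedly.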

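No. You can also rely on Foo instances implementing equality correctly, so Foo("A").equals(Foo("A")) and !Foo("A").equals(Foo("B")). If needed, I could even keep a single shared instance for each Foo, i.e. I could make Foo("A") == Foo("A") by keeping a cache of all created instances...

Is your sequence mutable or immutable?

So the order of the IEnumerable matters for the comparison?

Yes. Think of it as a set containing unique strings, where each string is made up of characters.

Oh wow, I see what you mean. Yes, this would work... I need to rethink my memory requirements, though.

Wait, I still don't get it. I understand how I can make Foo("A") the same as Foo("A"), but how do I make new Seq{Foo("A"), Foo("B")} == new Seq{Foo("A"), Foo("B")}? Thanks

@drozzy If you notice a fairly high collision rate, you could consider using a larger hash value (i.e. a long rather than an int) as the internal hash. It won't reduce collisions in a HashSet or Dictionary, but it will make Equals return faster (on average) for unequal items.

@drozzy, the idea I had is based on Seq(A,B,C) = A + Seq(B,C) and something like Item = {Foo node; Item next;}. I'm not sure whether actually building it would have reasonable performance, but comparison would be constant time.

Thanks, I had considered a similar approach. Unless there are other options, I think this is what I'll go with.

+1. Making "not equal" very fast is probably good enough for practical purposes.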
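The wider internal hash suggested in the comments might look something like this. It is a sketch only: WideHashSeq is a hypothetical name, and the multiplier is the FNV 64-bit prime, used here simply as a convenient large odd constant. Each item still contributes only a 32-bit GetHashCode, so the extra width reduces collisions introduced by combining, not collisions between the items themselves. Unlike the answer's class, this variant copies the items up front so the stored hash cannot go stale.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of the wrapper; WideHashSeq is an illustrative name.
public sealed class WideHashSeq<T> : IEnumerable<T>
{
    private readonly T[] items;      // private copy, so later changes to the source cannot stale the hash
    private readonly long wideHash;  // 64-bit hash, computed once in the constructor

    public WideHashSeq(IEnumerable<T> sequence)
    {
        items = sequence.ToArray();
        unchecked   // wrap on overflow instead of throwing
        {
            long hash = 1;
            foreach (T item in items)
                hash = hash * 1099511628211L + (item == null ? 0 : item.GetHashCode());
            wideHash = hash;
        }
    }

    public IEnumerator<T> GetEnumerator() => ((IEnumerable<T>)items).GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => items.GetEnumerator();

    public override bool Equals(object obj)
    {
        var other = obj as WideHashSeq<T>;
        if (other == null) return false;

        // Cheap rejection: differing 64-bit hashes mean the sequences differ.
        if (wideHash != other.wideHash) return false;

        // Equal hashes are not proof of equality, so fall back to the full O(n) check.
        return items.SequenceEqual(other.items);
    }

    // HashSet and Dictionary still need an int, so fold the 64 bits down to 32.
    public override int GetHashCode() => unchecked((int)(wideHash ^ (wideHash >> 32)));
}
```

As the comment notes, the set's own bucket collisions are unchanged because GetHashCode still returns an int. The gain is that Equals rejects more unequal pairs without iterating.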