C#: Why is HashSet.ExceptWith twice as fast as iterating and checking !Contains against another collection?

c# .net performance optimization

I was just doing some optimization and was baffled by this.

My original code looked like this:

   HashSet<IExampleAble> alreadyProcessed;//a few million items
    void someSetRoutineSlower(HashSet<IExampleAble> exampleSet)
    {

        foreach (var item in exampleSet)
        {
            if (!alreadyProcessed.Contains(item))
            {
                // do Stuff
            }
        }
    }

This takes about 1.2 million ticks to process.

Then I tried the same thing with ExceptWith:

 void someSetRoutineFaster(HashSet<IExampleAble> exampleSet)
    {
        exampleSet.ExceptWith(alreadyProcessed); // doesn't this have to check each item of its collection against the other one, thus actually looping twice?
        foreach (var item in exampleSet)
        {
            // do Stuff
        }
    }

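It runs in about 0.4 to 0.7 million ticks.

What kind of optimization is going on inside ExceptWith? Doesn't it also have to check every item, just as I do in the first snippet?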

According to the reference source for the ExceptWith method in .NET Framework 4.7.2, it looks like this:

public void ExceptWith(IEnumerable<T> other) {
        if (other == null) {
            throw new ArgumentNullException("other");
        }
        Contract.EndContractBlock();

        // this is already the enpty set; return
        if (m_count == 0) {
            return;
        }

        // special case if other is this; a set minus itself is the empty set
        if (other == this) {
            Clear();
            return;
        }

        // remove every element in other from this
        foreach (T element in other) {
            Remove(element);
        }
    }

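The only explicit optimizations in the method are for the special cases where the set is empty or is being "excepted" with itself.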
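To make the loop at the end of that method concrete, here is a minimal sketch (not the framework's code) that counts how many elements ExceptWith pulls from `other`. The `ExceptWithDemo` class, the `Counting` wrapper, the use of `int` instead of `IExampleAble`, and the sample sizes are all hypothetical and chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ExceptWithDemo
{
    // Number of elements handed out by Counting() since the last reset.
    public static int Enumerated;

    // Hypothetical helper: wraps a sequence and counts every element that is consumed.
    public static IEnumerable<T> Counting<T>(IEnumerable<T> source)
    {
        foreach (T item in source)
        {
            Enumerated++;
            yield return item;
        }
    }

    // Builds a large "already processed" set and a small example set,
    // then returns what is left of the example set after ExceptWith.
    public static HashSet<int> Run(int processedCount, int exampleCount)
    {
        var alreadyProcessed = new HashSet<int>();
        for (int i = 0; i < processedCount; i++) alreadyProcessed.Add(i);

        // Half of the example items overlap with alreadyProcessed, half are new.
        var exampleSet = new HashSet<int>();
        for (int i = 0; i < exampleCount; i++)
        {
            exampleSet.Add(processedCount - exampleCount / 2 + i);
        }

        Enumerated = 0;
        // ExceptWith walks the whole of "other" and mutates exampleSet in place.
        exampleSet.ExceptWith(Counting(alreadyProcessed));
        return exampleSet;
    }
}
```

Calling `ExceptWithDemo.Run(1000, 10)` leaves `Enumerated` equal to the size of the larger set, even though the example set holds only 10 items. Unlike the Contains loop, ExceptWith also modifies `exampleSet`, which matters if the caller still needs the original contents.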

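The speed-up you are seeing probably comes from the difference between calling Contains(T) and iterating over all elements, when the number of Contains(T) calls is comparable to the set size. On the face of it, the two should perform the same: the old implementation calls Contains(T) explicitly, and the new implementation performs the same kind of search inside Remove(T). The difference is that as elements are removed, the set's internal structure becomes sparser. Statistically this leaves fewer items (slots, in the source code's notation) per bucket, so finding an element gets faster, and an element that is present is more likely to be the first item in its bucket.

This all depends on the quality of your objects' hash function. Ideally every object would be alone in its bucket, but most real hash functions, when spreading millions of elements, produce collisions (several elements in the same bucket).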
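The effect of collisions on lookup cost can be made visible by counting comparisons instead of measuring time. The following is a sketch under stated assumptions: `CountingComparer` and `CollisionDemo` are hypothetical names, the elements are plain `int` values, and the constant hash function is a deliberately bad stand-in for a poor GetHashCode.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer that counts Equals calls and lets the caller pick the hash function.
public sealed class CountingComparer : IEqualityComparer<int>
{
    private readonly Func<int, int> _hash;
    public long EqualsCalls;

    public CountingComparer(Func<int, int> hash) { _hash = hash; }

    public bool Equals(int x, int y) { EqualsCalls++; return x == y; }
    public int GetHashCode(int value) { return _hash(value); }
}

public static class CollisionDemo
{
    // Fills a set with 0..n-1, then looks every element up once
    // and returns how many Equals calls those lookups needed.
    public static long CountLookupComparisons(int n, Func<int, int> hash)
    {
        var comparer = new CountingComparer(hash);
        var set = new HashSet<int>(comparer);
        for (int i = 0; i < n; i++) set.Add(i);

        comparer.EqualsCalls = 0; // ignore comparisons made while filling the set
        for (int i = 0; i < n; i++)
        {
            if (!set.Contains(i)) throw new InvalidOperationException("element missing");
        }
        return comparer.EqualsCalls;
    }
}
```

Comparing `CollisionDemo.CountLookupComparisons(1000, x => x)` with `CollisionDemo.CountLookupComparisons(1000, x => 0)` shows the difference: when every element lands in the same bucket, each lookup has to walk a long chain, so the second call reports far more comparisons than the first.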

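@harold posted what looked like the correct answer, but he deleted it for some reason. ExceptWith() removes items from the set, so with each element removed it may get slightly faster at searching for the next one. With .Contains() the set never gets smaller, so the search time per element does not decrease.

@MatthewWatson But ExceptWith iterates over the entire other set, and that one is much larger than exampleSet. My first idea was to avoid iterating over alreadyProcessed by doing a Contains check while iterating exampleSet once. Calling ExceptWith on exampleSet is exactly what I was trying to avoid, yet it turns out to be faster.

@MatthewWatson I just tried it and it is about as fast as mine. I still don't understand why.

Can you share your performance test? Please post executable code that includes a simple benchmark. Is this a release build, x64, with no debugger attached?

As stated, the results are impossible, because alreadyProcessed is much larger. So the benchmark is wrong in some way.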
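A self-contained benchmark of the kind the commenters ask for might look like the sketch below. It is not the asker's code: `SetBenchmark` and its methods are hypothetical names, `int` stands in for `IExampleAble`, the half-overlapping test data is an assumption, and incrementing a counter replaces "do stuff". It makes no claim about which variant wins.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class SetBenchmark
{
    // Variant 1: check each example item against the big set.
    public static int CountWithContains(HashSet<int> exampleSet, HashSet<int> alreadyProcessed)
    {
        int processed = 0;
        foreach (int item in exampleSet)
        {
            if (!alreadyProcessed.Contains(item)) processed++; // stands in for "do stuff"
        }
        return processed;
    }

    // Variant 2: remove the processed items first, then iterate what is left.
    // Like the code in the question, this mutates exampleSet.
    public static int CountWithExceptWith(HashSet<int> exampleSet, HashSet<int> alreadyProcessed)
    {
        exampleSet.ExceptWith(alreadyProcessed);
        int processed = 0;
        foreach (int item in exampleSet) processed++; // stands in for "do stuff"
        return processed;
    }

    public static void Run(int processedCount, int exampleCount)
    {
        var alreadyProcessed = new HashSet<int>();
        for (int i = 0; i < processedCount; i++) alreadyProcessed.Add(i);

        // Assumed data shape: half of the example items are already processed.
        var exampleSet = new HashSet<int>();
        for (int i = 0; i < exampleCount; i++)
        {
            exampleSet.Add(processedCount - exampleCount / 2 + i);
        }

        // Warm-up pass so JIT compilation is not included in the timings.
        CountWithContains(exampleSet, alreadyProcessed);
        CountWithExceptWith(new HashSet<int>(exampleSet), alreadyProcessed);

        var copy = new HashSet<int>(exampleSet); // copied outside the timed region

        var sw = Stopwatch.StartNew();
        int a = CountWithContains(exampleSet, alreadyProcessed);
        long containsTicks = sw.ElapsedTicks;

        sw.Restart();
        int b = CountWithExceptWith(copy, alreadyProcessed);
        long exceptTicks = sw.ElapsedTicks;

        Console.WriteLine("Contains loop: {0} items, {1} ticks", a, containsTicks);
        Console.WriteLine("ExceptWith:    {0} items, {1} ticks", b, exceptTicks);
    }
}
```

Running something like `SetBenchmark.Run(2000000, 1000)` from a release build with no debugger attached would be one way to check the reported numbers. The second argument is a guess, since the question only says that alreadyProcessed holds a few million items.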