C#: find the equal or nearest smaller value in an array


c# performance


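Suppose I have an array (in reality it is 255 elements long, with values up to int.MaxValue):

int[] lows = { 0, 9, 0, 0, 5, 0, 0, 8, 4, 1, 3, 0, 0, 0, 0 };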

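From this array, I want to get the index of the value that is equal to my number or, failing that, of the nearest smaller value:

number = 7 -> index = 4
number = 2 -> index = 9
number = 8 -> index = 7
number = 9 -> index = 1

What is the fastest way to find it?
So far I have used a linear search, but that is too inefficient for my needs: even though the array is only 255 elements long, it is searched several million times.

I need something equivalent to TreeSet.floor(E) in Java. I thought about using a Dictionary, but I don't know whether it can find the first smaller-or-equal value that I need.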
Sort the array, then do a binary search to find the value.
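A minimal sketch of the sort-then-binary-search idea, under some assumptions of mine: because the question asks for the original index rather than the value, the sorted values are kept alongside the index they came from, and duplicate values report their lowest index. The names FloorSearch, BuildIndex and FloorIndex are hypothetical and do not come from the thread.

```csharp
using System;
using System.Linq;

// Hypothetical helper; names are illustrative, not from the original answers.
public static class FloorSearch
{
    // Builds two parallel arrays: the distinct values in ascending order and,
    // for each value, the lowest index at which it occurs in the source array.
    public static (int[] Keys, int[] Indices) BuildIndex(int[] data)
    {
        var pairs = data
            .Select((value, index) => (Value: value, Index: index))
            .GroupBy(p => p.Value, p => p.Index)
            .Select(g => (Key: g.Key, Index: g.Min()))
            .OrderBy(p => p.Key)
            .ToArray();
        return (pairs.Select(p => p.Key).ToArray(),
                pairs.Select(p => p.Index).ToArray());
    }

    // Returns the original index of the largest value <= number, or -1 if none.
    public static int FloorIndex(int[] keys, int[] indices, int number)
    {
        int pos = Array.BinarySearch(keys, number);
        if (pos < 0)
        {
            // Not found: ~pos is the position of the next larger key,
            // so the floor sits one slot before it.
            pos = ~pos - 1;
        }
        return pos >= 0 ? indices[pos] : -1;
    }
}
```

BuildIndex runs once per change to the data; each lookup afterwards is O(log n) instead of a linear scan. Whether that beats a linear scan over only 255 elements is worth measuring.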

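If the array is not sorted (or held in a data structure where some relationship between the members helps the search), you have to examine every member to find the right one.
The easiest solution is probably to sort it and then do a binary chop/search to find the element that matches your criteria.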

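If you still want the efficiency of an unsorted array, maintain a sorted flag somewhere alongside the array (that is, turn the whole thing into a class containing the flag and the array) to indicate that the list is sorted.
Then, whenever the array is changed, set this flag to false.
At the point where you want to search, first check the sorted flag and, if it is false, sort the array (setting the flag to true as part of that process). If the flag is true, simply bypass the sort.
That way you only sort when you need to. If the array has not changed since the last sort, there is no point in sorting it again.
You can also keep the original unsorted list if the user needs it, holding the sorted list as an additional array inside the class (another advantage of wrapping the array in a class). That way you lose nothing: the original data stays untouched for the user to get at, and you have a fast and efficient way of finding the element you want.
The object (when sorted) would then contain: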

int[] lows = { 0, 9, 0, 0, 5, 0, 0, 8, 4, 1, 3, 0, 0, 0, 0 };
int[] sortedLows = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 4, 5, 8, 9 };
bool isSorted = true;

If you then change element [0] of that object to 3, you end up with:

int[] lows = { 3, 9, 0, 0, 5, 0, 0, 8, 4, 1, 3, 0, 0, 0, 0 };
int[] sortedLows = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 4, 5, 8, 9 };
bool isSorted = false;

indicating that a sort is needed before searching sortedLows.

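Keep in mind that turning this into a class is not required. If you are worried about its performance (in particular, accessing array elements through getter methods), you can maintain the array and the flag yourself while still allowing direct access to the unsorted array. You just have to make sure that every place in the code that changes the array also sets the flag correctly.

But you should measure the performance before going down that path. The class-based approach is "safer" because the object itself controls the whole process.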
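The flag-based approach described above could look roughly like the following. This is only a sketch under my own assumptions: the class name LazySortedArray and the extra sortedIndex array (used to map a sorted position back to its original index) are hypothetical additions, and for duplicate values it is unspecified which of their indices is returned.

```csharp
using System;

// Hypothetical wrapper; class and member names are illustrative only.
public sealed class LazySortedArray
{
    private readonly int[] lows;        // original, unsorted data
    private readonly int[] sortedLows;  // sorted copy of the values
    private readonly int[] sortedIndex; // original index of each sorted value
    private bool isSorted;              // false whenever lows has changed

    public LazySortedArray(int[] data)
    {
        lows = (int[])data.Clone();
        sortedLows = new int[lows.Length];
        sortedIndex = new int[lows.Length];
    }

    // Every write through the indexer invalidates the sorted copy.
    public int this[int i]
    {
        get { return lows[i]; }
        set { lows[i] = value; isSorted = false; }
    }

    // Returns the original index of the largest value <= number, or -1 if none.
    public int FloorIndex(int number)
    {
        if (!isSorted)
        {
            Array.Copy(lows, sortedLows, lows.Length);
            for (int i = 0; i < sortedIndex.Length; i++) sortedIndex[i] = i;
            Array.Sort(sortedLows, sortedIndex); // sorts values, carries indices along
            isSorted = true;
        }

        // Binary search for the last sorted position whose value is <= number.
        int lo = 0, hi = sortedLows.Length - 1, found = -1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (sortedLows[mid] <= number) { found = mid; lo = mid + 1; }
            else { hi = mid - 1; }
        }
        return found >= 0 ? sortedIndex[found] : -1;
    }
}
```

Millions of searches between two writes pay for only one sort, which matches the usage pattern described in the question.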
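First, normalise the data:

// Requires: using System.Collections.Generic; using System.Linq;
public static Dictionary<int, int> GetNormalised(int[] data)
{
    // Map each distinct non-zero value to the lowest index at which it occurs.
    var normalised = data.Select((value, index) => new { value, index })
        .GroupBy(p => p.value, p => p.index)
        .Where(p => p.Key != 0)
        .OrderBy(p => p.Key)
        .ToDictionary(p => p.Key, p => p.Min());
    return normalised;
}

To optimise performance, cache all the results that are used.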
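To emulate the Java TreeSet in C#, use the C# classes SortedDictionary or SortedSet. To mock the floor method of the Java TreeSet, use LINQ methods to get the minimum value:

SortedSet data = new SortedSet(); data.Where(p => p … p).Take(1)

Based on the HackerRank minimum cost algorithm, the Java TreeSet implementation works fine, but the C# SortedDictionary and SortedSet versions time out.
For details, see my coding blog.
The GetViewBetween method of the C# SortedSet class can do the same thing.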
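As a rough illustration of the GetViewBetween idea, the sketch below emulates a floor lookup on a SortedSet<int>. The helper name SortedSetFloor.Floor is my own, and the element type int is an assumption based on the question. It returns the floor value rather than its index, so a separate value-to-index map would still be needed for the original problem.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; not part of the .NET library.
public static class SortedSetFloor
{
    // Returns the largest element <= value, or null when no such element exists.
    public static int? Floor(SortedSet<int> set, int value)
    {
        // Guard first: GetViewBetween throws if the lower bound exceeds the upper bound.
        if (set.Count == 0 || set.Min > value)
        {
            return null;
        }
        // The view holds every element in [set.Min, value]; its Max is the floor.
        return set.GetViewBetween(set.Min, value).Max;
    }
}
```

For example, with a set built from { 0, 9, 5, 8, 4, 1, 3 }, Floor(set, 7) yields 5. How a view performs depends on the runtime version, so this should be benchmarked against the array-based approaches before relying on it.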
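Since this has to be done a few million times, if the array holds constant data and the range of input numbers is limited, then a cache (perhaps pre-built) could also be useful: it is O(1) on a cache hit. The cache would map number -> index for every number in the range. For some ranges, a radix/counting sort style algorithm could be constructed to create the mapping.
There is a simple TreeSet in the .NET library, but it is internal. Interestingly, the public SortedSet is its base class.
Regarding 0, is it ignored? In your example, when you search for 7, you do not consider the first smaller number, which would be the first 0, and you skip all the following 0s.
I like your solution, but unfortunately it is slower.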
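The pre-built cache suggested in the first comment could be sketched as a plain lookup table. This assumes, beyond what the thread states, that the data values are non-negative, that queries fall in 0..maxNumber, and that maxNumber is small enough for the table to fit in memory. FloorCache and Build are hypothetical names.

```csharp
using System;

// Hypothetical helper; names are illustrative only.
public static class FloorCache
{
    // Builds table[n] = original index of the largest data value <= n,
    // for every n in 0..maxNumber, or -1 when no such value exists.
    public static int[] Build(int[] data, int maxNumber)
    {
        var table = new int[maxNumber + 1];
        for (int n = 0; n < table.Length; n++) table[n] = -1;

        // Mark each in-range value with its index. Walking backwards means
        // the lowest index of a duplicated value is written last and wins.
        for (int i = data.Length - 1; i >= 0; i--)
        {
            int v = data[i];
            if (v >= 0 && v <= maxNumber) table[v] = i;
        }

        // Carry the last seen index forward so gaps inherit the nearest smaller value.
        for (int n = 1; n < table.Length; n++)
        {
            if (table[n] == -1) table[n] = table[n - 1];
        }
        return table;
    }
}
```

After the one-time build, each lookup is a single array access, table[number]. The cost is memory proportional to the query range, so this is impractical if queries can really go up to int.MaxValue.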
public static int GetNearest(Dictionary<int, int> normalised, int value)
{
    var res = normalised.Where(p => p.Key <= value)
        .OrderBy(p => value - p.Key)
        .Select(p => (int?)p.Value)
        .FirstOrDefault();
    if (res == null)
    {
        throw new ArgumentOutOfRangeException("value", "Not found");
    }
    return res.Value;
}

[TestMethod]
public void GetNearestTest()
{
    var data = new[] { 0, 9, 0, 0, 5, 0, 0, 8, 4, 1, 3, 0, 0, 0, 0 };
    var normalised = Program.GetNormalised(data);

    var value = 7;
    var expected = 4;
    var actual = Program_Accessor.GetNearest(normalised, value);
    Assert.AreEqual(expected, actual);

    value = 2;
    expected = 9;
    actual = Program_Accessor.GetNearest(normalised, value);
    Assert.AreEqual(expected, actual);

    value = 8;
    expected = 7;
    actual = Program_Accessor.GetNearest(normalised, value);
    Assert.AreEqual(expected, actual);

    value = 9;
    expected = 1;
    actual = Program_Accessor.GetNearest(normalised, value);
    Assert.AreEqual(expected, actual);
}