Algorithm: a data structure for quickly checking which overlapping ranges contain a number


algorithm data-structures


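Suppose I have the following ranges of numbers:

0-500 0-100 75-127 125-157 130-198 198-200

Now suppose I need to be able to check any given number and see which ranges it falls in. What kind of data structure would most efficiently tell me that the number 100 belongs to the ranges 0-500, 0-100 and 75-127? Do I just need a binary tree keyed on the start values? In that case, could each node in the tree hold multiple objects, one for every range that begins at that start point?

Note that I only need retrieval for this particular application. I don't expect to modify the structure along the way, so retrieval speed is by far my top priority.

Thanks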

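Let R denote the number of possible ranges (for example, R = 6).

Create R hash tables, so that each hash table holds exactly one range. For your example you need to create 6 hash tables. The first hash table, R1, contains only the values from 0 to 500.

Populating the hash tables:

Every number is placed into each hash table whose range contains it. For example, a number such as 100 goes into R1, R2 and R3. If R is large, you need to create a large number of hash tables. However, the total space is bounded by the actual data stored across all the hash tables.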

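Retrieval:

For any given number, check whether it is present in each of the R hash tables. You can optimize further by choosing which hash tables to look at. For example, for some values you only need to look at 3 of the 6 hash tables.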
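The hash-table idea described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming integer inclusive ranges; the names `tables` and `ranges_containing` are made up for this example and do not come from the answer.

```python
# Ranges from the question, treated as inclusive integer ranges (an assumption).
ranges = [(0, 500), (0, 100), (75, 127), (125, 157), (130, 198), (198, 200)]

# One hash set per range, each holding every integer inside that range.
tables = [set(range(lo, hi + 1)) for lo, hi in ranges]

def ranges_containing(n):
    """Return every range whose hash set contains n (one O(1) probe per set)."""
    return [r for r, table in zip(ranges, tables) if n in table]

print(ranges_containing(100))
```

Each lookup probes all R sets, so a query costs O(R) on average regardless of how many ranges actually match, and the memory used grows with the total length of all ranges.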

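Time complexity:

Searching for a single value in a hash table takes constant time on average. So it takes average-case O(1) to check whether a number is present in one hash table, and average-case O(R) to produce the output, because we need to look at all the hash tables to produce it.

An interval tree is a very general concept, and the data stored in the nodes differs slightly from problem to problem. In your case, each node would keep a list of the input intervals that contain the interval represented by the node.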
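One common way to realise "each node keeps the input intervals that contain the node's interval" is a segment tree built over the elementary intervals between range endpoints. The sketch below is an assumption about what such a structure could look like, not necessarily what the answer had in mind; the class name `SegmentTree` and its methods are hypothetical, and inclusive integer ranges are assumed.

```python
from bisect import bisect_right

class SegmentTree:
    """Static tree: each node stores the ranges that cover its whole span."""

    def __init__(self, ranges):
        # Treat inclusive (lo, hi) as half-open [lo, hi + 1) and collect boundaries.
        self.points = sorted({p for lo, hi in ranges for p in (lo, hi + 1)})
        self.leaves = len(self.points) - 1   # leaf i spans [points[i], points[i+1])
        self.stored = {}                     # node id -> ranges covering that node
        for r in ranges:
            first = bisect_right(self.points, r[0]) - 1  # leaf where r starts
            last = bisect_right(self.points, r[1]) - 1   # leaf where r ends
            self._insert(1, 0, self.leaves - 1, first, last, r)

    def _insert(self, node, node_lo, node_hi, first, last, r):
        if first <= node_lo and node_hi <= last:
            # The range covers this node completely: store it here and stop.
            self.stored.setdefault(node, []).append(r)
            return
        mid = (node_lo + node_hi) // 2
        if first <= mid:
            self._insert(2 * node, node_lo, mid, first, last, r)
        if last > mid:
            self._insert(2 * node + 1, mid + 1, node_hi, first, last, r)

    def query(self, n):
        """Collect the ranges stored on the root-to-leaf path for n."""
        if not self.points or n < self.points[0] or n >= self.points[-1]:
            return []
        leaf = bisect_right(self.points, n) - 1
        node, node_lo, node_hi = 1, 0, self.leaves - 1
        found = []
        while True:
            found.extend(self.stored.get(node, []))
            if node_lo == node_hi:
                return found
            mid = (node_lo + node_hi) // 2
            if leaf <= mid:
                node, node_hi = 2 * node, mid
            else:
                node, node_lo = 2 * node + 1, mid + 1

ranges = [(0, 500), (0, 100), (75, 127), (125, 157), (130, 198), (198, 200)]
tree = SegmentTree(ranges)
print(sorted(tree.query(100)))
```

A query walks a single root-to-leaf path, so it costs O(log R) plus the number of ranges reported. Since the question never modifies the ranges, the tree can be built once up front.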

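Edit:

Much faster
More memory-efficient
For 1,000 inputs it is guaranteed to finish in under 100 ms in the worst case
For 10,000 input ranges (10x the required input) it still finishes in under 100 ms, with each from value duplicated 50 times, e.g. {0-}x50, {1-}x50, and so on for every from value in the range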


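Goal
Given 10,000 input ranges (from - to; to make the case worse, each from value is repeated 50 times), the program takes a target value and displays every range the target belongs to.
Example of how it works with 4 ranges:
Given the following ranges:
0 - 100
0 - 500
50 - 500
20 - 300

Target: 40
Output:
20 - 300
0 - 500
0 - 100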
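Explanation of the algorithm (using the example above):
Each distinct from value is mapped to an index, assigned in order of first appearance starting at index = 0 (index++). So in our example:
From => index
0 => 0
50 => 1
20 => 2

There is also an array of TreeSets called pointers. Each index i of this array corresponds to the TreeMap key whose value is i, so pointers[0] refers to from = 0, pointers[1] refers to from = 50, and pointers[2] refers to from = 20.
Next, each to value is added to the set belonging to its from value: look up the from value in the TreeMap ('key' => 'value'), where the value is the key's index in the pointers array, and add the to value to the TreeSet at that index.
In addition, the to values in each TreeSet of the pointers array are kept sorted in descending order (the reason is explained later).
The pointers array now looks like this:
index => TreeSet[values..]
0 => 500 | 100
1 => 500
2 => 300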
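Now we are ready to find the ranges the target belongs to.
For target = 40:

1- Search the TreeMap for the floor key of 40, that is, the greatest key less than or equal to 40. The program finds 20.
2- It goes to the index in the pointers array that corresponds to 20. To get that index, look up key 20 in the TreeMap. The index of 20 is 2.
3- It now goes to pointers[2] and finds the number 300.
4- The program checks whether 300 is less than the target. It does this because, as mentioned earlier, every TreeSet is sorted in descending order. If 300 were less than the target, there would be no need to check the remaining values in pointers[2], since they are guaranteed to be smaller still.
5- In this case 300 is greater than the target, so the range is printed with the key as from and pointers[2]{current element} as to.
6- Since pointers[2] holds only one element, the for loop exits and the key is removed from the TreeMap, so that the next time the program looks for the floor key it finds the next lower one (if it exists).
7- (Next iteration of the while loop) After key 20 has been removed, the next key found is 0, whose index according to the TreeMap is 0.
8- It goes to pointers[0] and finds that the TreeSet there holds 2 elements.
9- Start with the first element, pointers[0]{first element}. Is 500 less than 40? No, so print the range. Next element: is 100 less than 40? No, so print the range. There are no more elements, so exit the loop.
10- Remove key 0 from the TreeMap.
11- The while loop condition now checks whether a floor key exists for the target. It does not, because 0 and 20 have been removed, so the condition fails and the loop exits.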
```java
import java.util.Collections;
import java.util.TreeMap;
import java.util.TreeSet;

public class Interval {
    @SuppressWarnings("unchecked") // generic array creation for the TreeSet array below
    public static void main(String[] args) {
        int MAX_SIZE = 10000;
        int[] from = new int[MAX_SIZE];
        int[] to = new int[MAX_SIZE];

        // Generate 10,000 (from - to) ranges, repeating every "from" value 50 times
        // to make the workload heavy.
        int c = 0;
        for (int i = 0; i < MAX_SIZE; i++) {
            from[i] = c;
            to[i] = from[i] + (int) (Math.random() * 100);
            if ((i + 1) % 50 == 0) // move to the next "from" after 50 repetitions
                c++;
        }

        // Start time counting
        long time = System.currentTimeMillis();

        int target = 97;
        TreeMap<Integer, Integer> treePointer = new TreeMap<Integer, Integer>(); // will store <from, index++>

        int index = 0;
        int size = from.length;
        TreeSet<Integer>[] pointers = new TreeSet[size]; // array of tree sets holding the "to" values of every "from"

        // Insert into the tree
        for (int i = 0; i < from.length; i++) {
            if (!treePointer.containsKey(from[i])) { // if this "from" is not in the tree yet, insert it
                treePointer.put(from[i], index);
                pointers[index++] = new TreeSet<Integer>(Collections.reverseOrder()); // sort in descending order
            }

            // Index of this "from" in the pointers array
            int virtualIndex = treePointer.get(from[i]);
            // Add the "to" value to the tree set at pointers[index of "from"]
            pointers[virtualIndex].add(to[i]);
        }

        // Display part of the pointers array to understand how elements are stored
//      for(int i=0; i<10; i++){
//          for(int current : pointers[i]){
//              System.out.print(current + " ");
//          }
//          System.out.println();
//      }

        // Start checking for the ranges
        Integer currentKey = -1; // dummy number at first
        while ((currentKey = treePointer.floorKey(target)) != null) { // while there is a floor key
            // Get the index assigned to the floor key
            int virtualIndex = treePointer.get(currentKey);
            // Loop over the elements found at pointers[index of floor key]
            for (int toValue : pointers[virtualIndex]) {
                if (toValue < target) // values are in descending order, so once one is below the target, stop
                    break;
                System.out.println(currentKey + " - " + toValue); // target <= toValue, so print the range
            }
            treePointer.remove(currentKey); // remove the key so the next iteration fetches the next floor key
        }
        // Display time consumed in ms
        System.out.println("\nTotal time: " + (System.currentTimeMillis() - time) + " ms");
    }
}
```
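The same floor-key walk can also be written without removing keys, so that the index survives for later queries. The following Python sketch is an illustration under that assumption, not the answerer's code: `build_index` and `ranges_containing` are hypothetical names, a sorted list with `bisect` stands in for the TreeMap, and duplicate ranges are collapsed just as the TreeSet collapses them.

```python
from bisect import bisect_right
from collections import defaultdict

def build_index(ranges):
    """Group 'to' values by 'from' value; keep each group in descending order."""
    by_from = defaultdict(set)
    for lo, hi in ranges:
        by_from[lo].add(hi)
    starts = sorted(by_from)                                   # sorted 'from' values
    ends = [sorted(by_from[s], reverse=True) for s in starts]  # 'to' values, largest first
    return starts, ends

def ranges_containing(index, target):
    """Visit every 'from' value <= target, nearest first, without mutating the index."""
    starts, ends = index
    result = []
    for i in range(bisect_right(starts, target) - 1, -1, -1):
        for hi in ends[i]:
            if hi < target:   # descending order: every later value is smaller too
                break
            result.append((starts[i], hi))
    return result

index = build_index([(0, 100), (0, 500), (50, 500), (20, 300)])
print(ranges_containing(index, 40))   # [(20, 300), (0, 500), (0, 100)]
```

Note that the walk still visits every distinct from value that is less than or equal to the target, so a query is O(R) in the worst case even when few ranges match.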

```python
# (Inclusive) ranges
ranges = [(0,500), (0,100), (75,127), (125,157), (130,198), (198,200)]
smallest = min(r[0] for r in ranges)
largest  = max(r[1] for r in ranges)

# Create table
table = [[] for i in range(smallest, largest+1)] # List of lists
for r in ranges: # pre-compute results
    mn, mx = r
    for index in range(mn, mx+1):
        table[index - smallest].append(r)

def check(n):
    'Return list of ranges containing n'
    if smallest <= n <= largest:
        return table[n - smallest]
    else:
        return []   # Out of range

for n in [-10, 10, 75, 127, 129, 130, 158, 197, 198, 199, 500, 501]:
    print('%3i is in groups: %r' % (n, check(n)))
```

Output:

```
-10 is in groups: []
 10 is in groups: [(0, 500), (0, 100)]
 75 is in groups: [(0, 500), (0, 100), (75, 127)]
127 is in groups: [(0, 500), (75, 127), (125, 157)]
129 is in groups: [(0, 500), (125, 157)]
130 is in groups: [(0, 500), (125, 157), (130, 198)]
158 is in groups: [(0, 500), (130, 198)]
197 is in groups: [(0, 500), (130, 198)]
198 is in groups: [(0, 500), (130, 198), (198, 200)]
199 is in groups: [(0, 500), (198, 200)]
500 is in groups: [(0, 500)]
501 is in groups: []
```

```python
# (Inclusive) ranges
ranges = [(0,500), (0,100), (75,127), (125,157), (130,198), (198,200)]
limit = 1000000 # Or whatever

smallest = min(r[0] for r in ranges)
largest  = max(r[1] for r in ranges)

if (largest - smallest) * len(ranges) < limit:
    # Create table
    table = [[] for i in range(smallest, largest+1)] # List of lists
    for r in ranges:
        mn, mx = r
        for index in range(mn, mx+1):
            table[index - smallest].append(r)

    def check(n):
        'Return list of ranges containing n'
        if smallest <= n <= largest:
            return table[n - smallest]
        else:
            return []   # Out of range
else:
    # More memory-efficient method, for example
    def check(n):
        return [(mn, mx) for mn, mx in ranges if mn <= n <= mx]

for n in [-10, 10, 75, 127, 129, 130, 158, 197, 198, 199, 500, 501]:
    print('%3i is in groups: %r' % (n, check(n)))
```
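The table above stores one list per integer, yet the answer can only change at a range boundary. A possible middle ground between the full table and the linear scan, sketched here as an assumption rather than as part of the original answer, is to precompute one list per boundary and locate the right one with binary search. The names `breaks` and `answers` are hypothetical.

```python
from bisect import bisect_right

# (Inclusive) ranges, as in the code above
ranges = [(0, 500), (0, 100), (75, 127), (125, 157), (130, 198), (198, 200)]

# Points where the answer can change: every start, and one past every inclusive end.
breaks = sorted({p for mn, mx in ranges for p in (mn, mx + 1)})

# answers[i] holds the ranges containing every n with breaks[i] <= n < breaks[i+1].
answers = [[r for r in ranges if r[0] <= b <= r[1]] for b in breaks]

def check(n):
    'Return list of ranges containing n'
    i = bisect_right(breaks, n) - 1   # index of the last boundary <= n
    return answers[i] if i >= 0 else []

for n in [-10, 100, 127, 501]:
    print('%3i is in groups: %r' % (n, check(n)))
```

The table size now depends on the number of ranges rather than on the numeric span they cover, at the cost of an O(log R) binary search per lookup instead of a direct index.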