Python: query in interval tree is too slow

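I have a list of intervals and need to return the ones that overlap with an interval passed in a query. What is special is that in a typical query, around a third or even half of the intervals will overlap with the interval given in the query. Also, the ratio of the shortest interval to the longest is no more than 1:5. I implemented my own interval tree (an augmented red-black tree). I did not want to use an existing implementation because I need support for closed intervals and some special features. I tested the query speed with 6000 queries in a tree with 6000 intervals (so n=6000 and m=3000 (approx.)). It turned out that brute force is just as good as using the tree: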

Computation time - loop: 125.220461 s
Tree setup: 0.05064 s
Tree Queries: 123.167337 s

Let me use asymptotic analysis. n: number of queries; n: number of intervals; approx. n/2: number of intervals returned by a query:

Time complexity, brute force: n*n

Time complexity, tree: n*(log(n)+n/2) --> 1/2*n*n + n*log(n) --> n*n
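
Written out, the two estimates above compare as follows. This is only a sketch of the operation counts under the post's own assumptions (about n/2 intervals reported per query); it ignores the constant cost of each operation, and the base-2 logarithm is an assumption made for the numeric value.

```latex
\begin{align*}
T_{\text{brute}}(n) &= n \cdot n = n^2 \\
T_{\text{tree}}(n)  &= n \left( \log n + \tfrac{n}{2} \right)
                     = \tfrac{1}{2} n^2 + n \log n \in \Theta(n^2) \\
\frac{T_{\text{tree}}(n)}{T_{\text{brute}}(n)}
                    &= \frac{1}{2} + \frac{\log_2 n}{n} \approx 0.502
                       \quad \text{for } n = 6000
\end{align*}
```

So the count alone predicts roughly a factor of two, provided one tree step costs about as much as one step of the brute-force loop.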

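So the result is that for a large n both should be roughly the same. Still, with the constant 1/2 in front of n*n, one would expect the tree to be noticeably faster. For the results I got, I can imagine three possible reasons: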

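a) My implementation is wrong. (Should I be using BFS as below?)
b) My implementation is right, but I made things cumbersome for Python, so handling the tree takes more time than handling brute force.
c) Everything is OK. This is simply how things should behave for a large n.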
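
One way to check reason (a) is to compare the tree's answers with a plain scan. The following is a minimal sketch, assuming intervals are closed and stored as `(low, high)` tuples; the names `overlaps`, `brute_force_query` and `same_result` are hypothetical and not taken from the post.

```python
def overlaps(a_low, a_high, b_low, b_high):
    """Closed intervals [a_low, a_high] and [b_low, b_high] share a point."""
    return a_low <= b_high and b_low <= a_high


def brute_force_query(intervals, low, high):
    """Scan every interval and keep those overlapping [low, high]."""
    return [iv for iv in intervals if overlaps(iv[0], iv[1], low, high)]


def same_result(tree_result, intervals, low, high):
    """Compare a tree query result with the scan, ignoring order."""
    return sorted(tree_result) == sorted(brute_force_query(intervals, low, high))
```

Running `same_result` over a few thousand random queries would show whether the tree misses or duplicates intervals before any time is spent on tuning.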

My query function looks like this:

from collections import deque

def query(self, low, high):
    """Return all stored intervals that overlap the closed interval [low, high]."""
    result = []
    q = deque([self.root])  # start at the root node of the tree
    # bind the bound methods to local names to save attribute lookups in the loop
    append_result = result.append
    append_q = q.append
    pop_left = q.popleft
    while q:
        node = pop_left()  # look at the next node
        if node.overlap(low, high):  # some overlap?
            append_result(node.interval)
        if node.low is not None and low <= node.get_low_max():  # enqueue left node
            append_q(node.low)
        if node.high is not None and node.get_high_min() <= high:  # enqueue right node
            append_q(node.high)
    return result

Edit: the tree is built like this:

def build(self, intervals):
    """
    Function which is recursively called to build the tree.
    """
    if intervals is None:
        return None

    if len(intervals) > 2:  # intervals is always sorted in increasing order
        mid = len(intervals) // 2
        # split intervals into three parts:
        # central element (median)
        center = intervals[mid]
        # left half (<= median)
        new_low = intervals[:mid]
        # right half (>= median)
        new_high = intervals[mid + 1:]
        # compute max on the lower side (left):
        max_low = max(n.get_high() for n in new_low)
        # store min on the higher side (right):
        min_high = new_high[0].get_low()

    elif len(intervals) == 2:
        center = intervals[1]
        new_low = [intervals[0]]
        new_high = None
        max_low = intervals[0].get_high()
        min_high = None

    elif len(intervals) == 1:
        center = intervals[0]
        new_low = None
        new_high = None
        max_low = None
        min_high = None

    else:
        raise Exception('The tree is not behaving as it should...')

    return Node(center, self.build(new_low), self.build(new_high),
                max_low, min_high)

class Node:
    # the accessors called by query (overlap, get_low_max, get_high_min) are not shown in the post
    def __init__(self, interval, low, high, max_low, min_high):
        self.interval = interval  # pointer to corresponding interval object
        self.low = low  # pointer to node containing intervals to the left
        self.high = high  # pointer to node containing intervals to the right
        self.max_low = max_low  # maximum value on the left side
        self.min_high = min_high  # minimum value on the right side

All nodes in a subtree can be obtained like this:

def subtree(current):
    """Return all nodes of the subtree rooted at current, in order."""
    node_list = []
    if current.low is not None:
        node_list += subtree(current.low)
    node_list.append(current)
    if current.high is not None:
        node_list += subtree(current.high)
    return node_list

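P.S. Note that by exploiting the fact that there is so much overlap and that all intervals have comparable lengths, I managed to implement a simple method based on sorting and bisection that finishes in 80 s, but I consider this over-fitting... Funnily enough, asymptotic analysis says it should have approx. the same runtime as using the tree...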
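
The post does not show the sorting-and-bisection method, so the following is only one possible sketch of such an approach, not the author's code. It assumes closed intervals stored as `(low, high)` tuples; `build_index` and `bisect_query` are hypothetical names. Because no interval is longer than `max_len`, any interval starting before `q_low - max_len` must end before the query begins, so only a window of the sorted list has to be scanned.

```python
from bisect import bisect_left, bisect_right


def build_index(intervals):
    """Sort closed (low, high) intervals by low endpoint and precompute helpers."""
    ordered = sorted(intervals)
    lows = [low for low, _ in ordered]
    max_len = max((high - low for low, high in ordered), default=0)
    return ordered, lows, max_len


def bisect_query(index, q_low, q_high):
    """Return the intervals overlapping the closed query [q_low, q_high]."""
    ordered, lows, max_len = index
    # intervals starting before q_low - max_len end before q_low
    start = bisect_left(lows, q_low - max_len)
    # intervals starting after q_high begin after the query ends
    stop = bisect_right(lows, q_high)
    return [iv for iv in ordered[start:stop] if iv[1] >= q_low]
```

The window still contains every overlapping interval, so with a third to a half of all intervals reported per query the work per query stays linear in n, which matches the observation that the asymptotic runtime is about the same as for the tree.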

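If I understand your problem correctly, you are trying to speed up your process. If so, try creating a real tree instead of manipulating lists.

Something that looks like: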

class IntervalTreeNode:
    def __init__(self, parent, min, max):
        self.value = (min, max)
        self.parent = parent

        self.leftBranch = None
        self.rightBranch = None

    def insert(self, interval):
        ...

    def asList(self):
        """Return the list made of this node and all the subtree nodes."""
        left = []
        if self.leftBranch is not None:
            left = self.leftBranch.asList()
        right = []
        if self.rightBranch is not None:
            right = self.rightBranch.asList()
        return [self.value] + left + right
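
Then, at the start, create one IntervalTreeNode and insert all your intervals into it. This way, if you really need a list, you can build one each time you need a result, rather than building one with [x:] or [:x] at every step of your recursive iteration, because list manipulation is a costly operation in Python. You can also work directly with the nodes instead of lists, which will greatly speed up the process, since you only have to return a reference to a node rather than concatenate lists.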
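
As an illustration of working with nodes instead of concatenating lists, here is a minimal sketch of a traversal that yields values one at a time. `TreeNode` and `iter_values` are hypothetical names used only for this example; they are not part of the question's or the answer's code.

```python
class TreeNode:
    """Hypothetical minimal node with only the fields needed for traversal."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def iter_values(root):
    """Yield values in order (left, node, right) using an explicit stack."""
    stack = []
    node = root
    while stack or node is not None:
        # walk down to the leftmost unvisited node
        while node is not None:
            stack.append(node)
            node = node.left
        node = stack.pop()
        yield node.value
        node = node.right
```

A caller that really needs a list can write `list(iter_values(root))`, which builds it once instead of creating a new list at every level of the recursion.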

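I added some more code on how the tree is built (under "Edit"). I think I have already done things the way you suggest. It is hard to put all the code in one post... The build method is only used to construct a tree of nodes, with tree.root at the top. Please tell me if something is wrong.

Asymptotic analysis tells you almost nothing about actual running time :) Also, have you tried using a profiler?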
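
For the profiler suggestion, a minimal sketch with the standard-library `cProfile` and `pstats` modules could look like this. The helper `profile_call` and the stand-in function `scan_overlaps` are hypothetical; in practice the tree's own query method would be passed in instead.

```python
import cProfile
import io
import pstats


def scan_overlaps(intervals, q_low, q_high):
    """Stand-in workload: plain scan over closed (low, high) intervals."""
    return [iv for iv in intervals if iv[0] <= q_high and q_low <= iv[1]]


def profile_call(func, *args, top=10):
    """Run func(*args) under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


intervals = [(i, i + 5) for i in range(1000)]
result, report = profile_call(scan_overlaps, intervals, 100, 200)
print(report)
```

The report lists where the time goes per function, which would help separate reason (b), Python overhead in the tree code, from reason (c).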