Optimizing an algorithm to find co-occurrences between two sorted lists


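At the moment, my algorithm takes (by my estimate) more than ten hours to run to completion. It is still running right now, just so I can get a better estimate of exactly how bad it is.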

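Say I have a set of people P, each with a sorted list of occurrences of varying length, where i is an index variable. I want to build a graph G such that G[P_i][P_j] = n, where n is the weight of the edge between P_i and P_j and represents the number of times the two co-occur within some static range r.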

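My current algorithm is mindless and is implemented in Python (to be both readable and unambiguous) as follows (modified from the original for brevity):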

I have already considered modifying this algorithm so that when oB exceeds the range of the current oA, the loop simply continues to the next oA.
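The early-exit idea described here could be sketched roughly as below. This is only an illustration, not the asker's actual code: the function name `count_cooccurrences_early_exit` is hypothetical, the inputs are assumed to be plain sorted lists of numbers, and the range is assumed to be inclusive on both ends.

```python
def count_cooccurrences_early_exit(a, b, radius):
    """Count pairs (oA, oB) with |oA - oB| <= radius.

    Assumes `a` and `b` are sorted in ascending order.
    """
    count = 0
    for oA in a:
        for oB in b:
            if oB > oA + radius:
                # b is sorted, so no later oB can be in range for this oA
                break
            if oB >= oA - radius:
                count += 1
    return count


example = count_cooccurrences_early_exit([1, 10, 20], [3, 8, 12, 40], radius=5)
```

This still restarts from the beginning of `b` for every `oA`, so it only trims the tail of each inner scan.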


Given that the lists are sorted, is there a better way to do this?

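One option is to put B.occurances into a data structure that supports range queries, so that you can quickly find every oB in the range (oA - radius, oA + radius).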
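Because the lists are already sorted, one lightweight way to get a fast range query is binary search with the standard `bisect` module, without building a separate structure. The sketch below swaps in that technique. The function names are hypothetical, and the inclusive bounds are an assumption chosen to match the `>= lo` / `<= hi` tests in the loop later in this thread.

```python
from bisect import bisect_left, bisect_right


def count_in_range(sorted_b, lo, hi):
    """Number of values v in sorted_b with lo <= v <= hi."""
    return bisect_right(sorted_b, hi) - bisect_left(sorted_b, lo)


def count_cooccurrences_bisect(a, b, radius):
    """Sum, over every oA in a, the number of oB within radius of it."""
    return sum(count_in_range(b, oA - radius, oA + radius) for oA in a)


example = count_cooccurrences_bisect([1, 10, 20], [3, 8, 12, 40], radius=5)
```

Each query costs O(log len(b)), so one pair of people costs roughly O(len(a) * log len(b)) instead of O(len(a) * len(b)).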


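Another option is to index B.occurances into buckets, e.g. [0, 1), [1, 2), and so on. You can then quickly find every oB in the range (oA - radius, oA + radius) by selecting the buckets with indices (oA - radius) through (oA + radius). The buckets are only approximate, so you still need to iterate over the first and last selected buckets and check each oB in them.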
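A minimal sketch of the bucket idea follows, assuming integer occurrence values and a fixed bucket width. The names `build_buckets` and `count_in_range_bucketed` are hypothetical, and the inclusive bounds are again an assumption.

```python
from collections import defaultdict


def build_buckets(occurrences, width):
    """Group values by bucket index; bucket k holds values in [k*width, (k+1)*width)."""
    buckets = defaultdict(list)
    for o in occurrences:
        buckets[o // width].append(o)
    return buckets


def count_in_range_bucketed(buckets, width, lo, hi):
    """Count bucketed values v with lo <= v <= hi."""
    first, last = lo // width, hi // width
    count = 0
    for key in range(first, last + 1):
        items = buckets.get(key, ())
        if first < key < last:
            count += len(items)                         # interior bucket lies fully in range
        else:
            count += sum(lo <= o <= hi for o in items)  # edge buckets need a per-item check
    return count


buckets = build_buckets([3, 8, 12, 40], width=5)
example = sum(count_in_range_bucketed(buckets, 5, oA - 5, oA + 5) for oA in [1, 10, 20])
```

The buckets for each person only need to be built once and can be reused for every pair that person appears in.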

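Your idea of moving on to the next oA once you pass the upper bound is sound. In addition, if the ranges of A.occurances and B.occurances are large compared with radius, it is more efficient not to start from the beginning of B.occurances every time: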

from itertools import combinations

# Assumes `people` (objects with a sorted `occurances` list and a `common_name`)
# and `network` (presumably a networkx graph) are defined earlier.
print('>Generating combinations...', end=' ')
pairs = combinations(people, 2)
print('Done')

print('Finding co-occurrences')
radius = 5
for A, B in pairs:
    b = B.occurances
    if not b:                              # nothing to compare against
        continue
    i = 0                                  # position in b, kept across oA values
    maxi = len(b) - 1
    for oA in A.occurances:
        lo = oA - radius
        hi = oA + radius
        while b[i] >= lo and i > 0:        # while we're at or above the low end of the range
            i -= 1                         #   step back (>= also rewinds past duplicates of lo)
        while b[i] < lo and i < maxi:      # while we're below the low end of the range
            i += 1                         #   go towards the low end of the range
        if b[i] >= lo:
            while b[i] <= hi:              # while we're below the high end of the range
                try:                       #   increase edge weight
                    # network[u][v] replaces network.edge[u][v], which newer networkx lacks
                    network[A.common_name][B.common_name]['weight'] += 1
                except KeyError:           #   first co-occurrence: the edge does not exist yet
                    network.add_edge(A.common_name, B.common_name, weight=1)

                if i < maxi:               #   and go towards the high end of the range
                    i += 1
                else:
                    break
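The loop above moves `i` backwards at the start of each oA and then forwards again. A variant that avoids the rewinding keeps two indices into B's list, both of which only ever move forward. This is a sketch under the assumption that both lists are sorted ascending and radius is non-negative. The function name `count_cooccurrences_window` is hypothetical.

```python
def count_cooccurrences_window(a, b, radius):
    """Count pairs (oA, oB) with |oA - oB| <= radius using a sliding window.

    Assumes `a` and `b` are sorted ascending and radius >= 0.
    """
    count = 0
    start = 0  # index of the first b value >= oA - radius
    end = 0    # index of the first b value > oA + radius
    n = len(b)
    for oA in a:
        lo, hi = oA - radius, oA + radius
        while start < n and b[start] < lo:
            start += 1
        while end < n and b[end] <= hi:
            end += 1
        count += end - start  # every b[start:end] lies within [lo, hi]
    return count


example = count_cooccurrences_window([1, 10, 20], [3, 8, 12, 40], radius=5)
```

Because neither index ever moves backwards, one pair of people costs O(len(a) + len(b)). The returned count could then be stored once per pair, for example with `network.add_edge(A.common_name, B.common_name, weight=count)` when the count is non-zero, rather than incrementing the edge weight inside the loop.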