Python: How to efficiently count how many of a set of intervals contain each of a set of numbers

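The input parameters are a list of tuples representing intervals and a list of integers. The goal is to write a function that counts, for each integer, the number of intervals it falls in, and returns the result as an associative array (a dict). For example: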

Input intervals: [(1, 3), (5, 6), (6, 9)]
Input integers: [2, 4, 6, 8]
Output: {2: 1, 4: 0, 6: 2, 8: 1}

Another example:

Input intervals: [(3, 3), (22, 30), (17, 29), (7, 12), (12, 34), (18, 38), (30, 40), (5, 27), (19, 26), (27, 27), (1, 31), (17, 17), (22, 25), (6, 14), (5, 7), (9, 19), (24, 28), (19, 40), (9, 36), (2, 32)]
Input numbers: [16, 18, 39, 40, 27, 28, 4, 23, 15, 24, 2, 6, 32, 17, 21, 29, 31, 7, 20, 10]
Output: {2: 2, 4: 2, 6: 5, 7: 6, 10: 7, 15: 6, 16: 6, 17: 8, 18: 8, 20: 9, 21: 9, 23: 11, 24: 12, 27: 11, 28: 9, 29: 8, 31: 7, 32: 6, 39: 2, 40: 2}

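How can I write a function that does this efficiently? I already have an O(nm) implementation, where n is the number of intervals and m is the number of integers, but I am looking for something more efficient.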

What I have so far:

def intervals_per_number(numbers, intervals):
    result_map = {i: 0 for i in numbers}
    for i in result_map.keys():
        for k in intervals:
            if k[0] <= i <= k[1]:
                result_map[i] += 1
    return result_map

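You can pre-sort the integers and then use bisect.bisect_left on the lower bound of each interval. Sorting costs O(M*log(M)) and each bisection costs O(log(M)), so overall you have O(max(M, N)*log(M)).

import bisect
from collections import defaultdict

result = defaultdict(int)
integers = sorted(integers)
for low, high in intervals:
    # Position of the first integer that is >= the interval's lower bound.
    index = bisect.bisect_left(integers, low)
    # Assumed completion of the truncated loop: count every integer up to the upper bound.
    while index < len(integers) and integers[index] <= high:
        result[integers[index]] += 1
        index += 1

First, build one list of pairs: (x, -1) for each interval start, (q, 0) for each integer, and (x, 1) for each interval end. Next, sort the list.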

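Now walk through the list, maintaining a running sum of the second elements of the pairs. When you see a pair whose second element is 0, record the running sum (negated) for that integer.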

This runs in O((N+M)log(N+M)) time in the worst case (and in practice, I would guess it is close to linear if the queries and intervals are mostly sorted, thanks to timsort).

For example:

Input intervals: [(1, 3), (5, 6), (6, 9)]
Input integers: [2, 4, 6, 8]

Unified list (sorted):
[(1,-1), (2,0), (3,1), (4,0), (5,-1), (6, -1), (6,0), (6,1), (8,0), (9,1)]

Running sum:
[-1    , -1,    0,     0,      -1,    -2,     -2,      -1,    -1,   0]

Values for integers:
2: 1, 4: 0, 6: 2, 8: 1
Example code:

def query(qs, intervals):
    xs = [(q, 0) for q in qs] + [(x, -1) for x, _ in intervals] + [(x, 1) for _, x in intervals]
    S, r = 0, dict()
    for v, s in sorted(xs):
        if s == 0:
            r[v] = S
        S -= s
    return r

intervals = [(3, 3), (22, 30), (17, 29), (7, 12), (12, 34), (18, 38), (30, 40), (5, 27), (19, 26), (27, 27), (1, 31), (17, 17), (22, 25), (6, 14), (5, 7), (9, 19), (24, 28), (19, 40), (9, 36), (2, 32)]
queries = [16, 18, 39, 40, 27, 28, 4, 23, 15, 24, 2, 6, 32, 17, 21, 29, 31, 7, 20, 10]
print(query(queries, intervals))
Output:

{2: 2, 4: 2, 6: 5, 7: 6, 10: 7, 15: 6, 16: 6, 17: 8, 18: 8, 20: 9, 21: 9, 23: 11, 24: 12, 27: 11, 28: 9, 29: 8, 31: 7, 32: 6, 39: 2, 40: 2}

Depending on the use case and context, something simple may be good enough:

from collections import Counter
from itertools import chain

counts = Counter(chain.from_iterable(range(f, t+1) for f,t in input_intervals))
result = {k:counts[k] for k in input_numbers}

This is O(n*k+m), where n is the number of intervals, k is the average size of an interval, and m is the number of integers.
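
If the endpoints are bounded non-negative integers, the dependence on the interval size k can be avoided with a difference array followed by a prefix sum. This is a different technique from the Counter one-liner, shown only as a rough sketch: the function name count_with_prefix_sums and the assumption that every interval satisfies 0 <= start <= end are illustrative and not part of the answer.

```python
from itertools import accumulate


def count_with_prefix_sums(numbers, intervals):
    # Assumption: all endpoints are non-negative integers with start <= end.
    if not intervals:
        return {q: 0 for q in numbers}
    top = max(end for _, end in intervals)
    diff = [0] * (top + 2)
    for start, end in intervals:
        diff[start] += 1    # coverage begins at start
        diff[end + 1] -= 1  # and stops just after end (closed interval)
    # covered[x] is the number of intervals that contain x.
    covered = list(accumulate(diff))
    return {q: covered[q] if 0 <= q <= top else 0 for q in numbers}


print(count_with_prefix_sums([2, 4, 6, 8], [(1, 3), (5, 6), (6, 9)]))
# {2: 1, 4: 0, 6: 2, 8: 1}
```

The cost is O(n + R + m), where R is the largest endpoint, so this only pays off when R is not much larger than the amount of input.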

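Only if the lists are sorted: adapt a binary search. For such a very small sample it won't help much, but for lengths of 8 or more it should do better (8, because it then needs at most 3 lookups; for larger lengths, into the millions, it only gets exponentially better).

@usr2564301 How do you get exponentially better than a quadratic solution? That doesn't sound right.

@HeapOverflow: (oh) yes, quadratic versus better than linear. ("Exponential" is just a factor of a power of 2, not n. But still, very, very good.)

@usr2564301 Hmm, that still doesn't sound right. Isn't it O(n log n) instead of O(n^2), so better by a factor of O(n/log(n))? That is, better by less than a linear factor?

@HeapOverflow No, the intervals and integers are not always sorted already.

It looks like your complexity analysis forgot the while loop.

@HeapOverflow That's true, but it depends on the distribution of the intervals. If they are mostly non-overlapping (or the amount of overlap is constant), then this adds constant time.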
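
The binary-search idea raised in the comments can be pushed a little further so that no inner loop is needed at all: sort the interval starts and the interval ends separately, then for each integer subtract the number of intervals that have already ended from the number that have already begun. This is a sketch of a different technique from the answers in this thread, not the commenters' own code; the function name count_with_bisect is hypothetical, and it assumes closed intervals with start <= end.

```python
from bisect import bisect_left, bisect_right


def count_with_bisect(numbers, intervals):
    # Assumption: closed intervals with start <= end.
    starts = sorted(start for start, _ in intervals)
    ends = sorted(end for _, end in intervals)
    result = {}
    for q in numbers:
        begun = bisect_right(starts, q)  # intervals with start <= q
        finished = bisect_left(ends, q)  # intervals with end < q
        result[q] = begun - finished     # intervals that contain q
    return result


print(count_with_bisect([2, 4, 6, 8], [(1, 3), (5, 6), (6, 9)]))
# {2: 1, 4: 0, 6: 2, 8: 1}
```

Sorting costs O(N log N) and each lookup costs O(log N), so the total is O((N + M) log N) regardless of how much the intervals overlap, and the integers themselves do not need to be sorted.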