How can I quickly find the first element of a Python list that matches a condition?

python optimization

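I have two subroutines in a large program that search a list for the first item matching a condition and return its index. For some reason, they account for a large share of the program's execution time, and I would like to know how to optimize them.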

def next_unignored(start_index, ignore, length):
    """Takes a start index, a set of indices to ignore, 
       and the length of the list and finds the first index
       not in the set ignore."""
    for i in xrange(start_index+1, length):
        if i not in ignore:
            return i

def last_unignored(start_index, ignore, lower_limit):
    """The same as the other, just backwards."""
    for i in xrange(start_index-1, lower_limit-1, -1):
        if i not in ignore:
            return i

Does anyone have tips for optimizing these routines, or others that follow a similar pattern?

Example input:

from sample import next_unignored, last_unignored

start_index = 0
ignore = set([0,1,2,3,5,6,8])
length = 100
print next_unignored(start_index, ignore, length) # => 4

start_index = 8
ignore = set([0,1,2,3,5,6,8])
lower_limit = 0
print last_unignored(start_index, ignore, lower_limit) # => 7

Using a set difference instead of a loop:

>>> start,length = 0,100
>>> ignore = set([0,1,2,3,5,6,8])
>>> my_list = set(range(start,start+length))
>>> my_list2 = my_list - ignore
>>>
>>> print max(my_list2) # highest number not in ignore list
99
>>> print min(my_list2)  #lowest value not in ignore list
4

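If the ignore set is sparse, your method may be faster because it short-circuits, but if the ignore set is dense, this should speed things up considerably.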
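The session above takes the minimum and maximum over the whole range, whereas the question's functions search from a start index. Below is a minimal Python 3 sketch of the same set-difference idea restricted to the relevant indices. The names next_unignored_set and last_unignored_set are made up for illustration, and returning None when nothing is found is an assumption that mirrors the implicit behaviour of the original loops.

```python
def next_unignored_set(start_index, ignore, length):
    """First index after start_index that is not in ignore, or None."""
    # Build the candidate indices, then drop the ignored ones in one step.
    candidates = set(range(start_index + 1, length)) - ignore
    # default=None avoids a ValueError when every candidate is ignored.
    return min(candidates, default=None)


def last_unignored_set(start_index, ignore, lower_limit):
    """Last index before start_index (down to lower_limit) not in ignore, or None."""
    candidates = set(range(lower_limit, start_index)) - ignore
    return max(candidates, default=None)


ignore = {0, 1, 2, 3, 5, 6, 8}
print(next_unignored_set(0, ignore, 100))  # 4
print(last_unignored_set(8, ignore, 0))    # 7
```

Each call builds a set proportional to the size of the range, so this is likely to pay off only in the dense case described above; measuring on real data is the safest way to decide.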

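If the list you are looking through is sorted, you can rewrite the function to break the problem down, as in a binary search. For example, if you want to find the next instance of a value in a list of some length, you could handle it like this: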

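# Fragment: value, start_index, ignore, and length come from the caller.
if value > length // 2:
    # Target is in the upper half: start the scan from the midpoint.
    start_index = length // 2
else:
    # Target is in the lower half: keep start_index and halve the upper bound.
    length = length // 2
next_unignored(start_index, ignore, length)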
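Because list indices are always in order, one way to apply the binary-search idea is to precompute a sorted list of the indices that are not ignored and look positions up with the standard bisect module. This is a sketch of a swapped-in technique, not this answer's own code. It assumes ignore does not change between calls (otherwise the list has to be rebuilt), and the helper names build_allowed, next_unignored_bisect, and last_unignored_bisect are hypothetical.

```python
from bisect import bisect_left, bisect_right


def build_allowed(ignore, length):
    """Sorted list of every index in range(length) that is not ignored."""
    return [i for i in range(length) if i not in ignore]


def next_unignored_bisect(start_index, allowed):
    """Smallest allowed index strictly greater than start_index, or None."""
    pos = bisect_right(allowed, start_index)
    return allowed[pos] if pos < len(allowed) else None


def last_unignored_bisect(start_index, allowed, lower_limit=0):
    """Largest allowed index strictly less than start_index, or None."""
    pos = bisect_left(allowed, start_index)
    if pos == 0 or allowed[pos - 1] < lower_limit:
        return None
    return allowed[pos - 1]


allowed = build_allowed({0, 1, 2, 3, 5, 6, 8}, 100)
print(next_unignored_bisect(0, allowed))  # 4
print(last_unignored_bisect(8, allowed))  # 7
```

Each lookup takes logarithmic time after a one-time linear build, so this only helps when many lookups share the same ignore set.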

Why all the set operations? Just count up or down from the start until you find the first element that is not in ignore:
def next_unignored(start_index, ignore, length):
    while start_index < length:
        if start_index not in ignore:
            return start_index
        start_index += 1

def last_unignored(start_index, ignore, lower_limit):
    """The same as the other, just backwards."""
    while start_index >= lower_limit:
        if start_index not in ignore:
            return start_index
        start_index -= 1
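The same scan can also be written with the built-in next() and a generator expression, which stops at the first match. This is a minimal Python 3 sketch that keeps the question's convention of starting one position past start_index. The name first_unignored is made up for illustration, and whether it beats the explicit loop would need to be measured.

```python
def first_unignored(start_index, ignore, length):
    """First index after start_index that is not in ignore, or None."""
    candidates = (i for i in range(start_index + 1, length) if i not in ignore)
    # The second argument to next() is returned when the generator is empty.
    return next(candidates, None)


print(first_unignored(0, {0, 1, 2, 3, 5, 6, 8}, 100))  # 4
print(first_unignored(0, {0, 1, 2}, 3))                # None
```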

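If all the indices are in ignore, these will return None.

What is ignore? Is it a list, a set, or something else?

Wait, so you want hints, but you don't want anyone else to get hints?

A link to the actual problem would probably help you get an answer... and this is only partial code... please post a small runnable example, complete with sample input and expected output.

"(I don't want to name it, so that others can't google it and get hints to the answer)" is the exact opposite of the spirit of this site.

By default, a function returns None if it reaches its end without a return statement. That means the return None does nothing at all. Not that removing it will give a big speedup, but it saves a line of code.

I've changed the question to better reflect that. My intent was to make the question more general and to separate it from the program I'm working on.

I see your point, gnibbler. Could I just swap start and length instead, or is there more that needs changing?

Maybe x[~np.in1d(x, ignore)][-1] to get the last element that is not in the ignore list.

NumPy is usually the solution to all my performance problems, but according to my tests it is actually much slower here.

See my comment about speed. NumPy generally pays off on very large datasets... on smaller datasets it is often slower.

This is n*m complexity... a simple set intersection (or difference) should be n... for large sets and lists that can be a very significant speedup (since both lists consist of unique elements, you effectively have sets anyway), although if the ignore set is sparse, his short-circuiting may help.

ignore is not a set of elements to ignore, but a set of indices to ignore. The indices of a list are always sorted.

I'm pretty sure these functions are the same as the ones in the question, except that the for loop is implemented as a while loop with a counter in your version.

Yes, they are. But there is no numpy and no extra iterator creation, just addition and subtraction.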