Python: efficient search through list items

I have a list lst (containing 10K items) and a query item q, and I want to know whether any item in lst ends with q.

As a timing reference, which I normalize to 1, I use this statement:

x = q in lst

I tried these:

import re

# obvious endswith method
y = [k for k in lst if k.endswith(q)]
# find method (find() returns -1 on a miss, which is truthy, so compare explicitly)
z = [k for k in lst if k.find(q, len(k)-len(q)) != -1]
# regex
v = [k for k in lst if re.search(q + '$', k)]
# regex without list comprehension
w = re.search(q + '~', '~'.join(lst) + '~')
I timed each of these relative to the x reference.

So I think I can use the regex with the joined list, unless there is a better implementation.


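In the real world, I am trying to optimize a block of code that is hit many times during execution, and I found that the list comprehension using the .endswith method is a bottleneck.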

I don't think regex is the way to go. Even if I assign joined = '~'.join(lst) + '~' outside the loop, q + '~' in joined still beats re.search(q + '~', joined) (0.00093 s vs 0.0034 s).
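As a rough sketch of the joined-string idea above: the string is built once, and each query is a single substring test. The helper names build_joined and any_endswith are made up for this illustration, and the sketch assumes the separator never occurs inside an item or inside the query; if it does, the test can report false matches.

```python
def build_joined(items, sep="~"):
    """Join items so that every item is immediately followed by sep."""
    # Assumption: sep occurs in no item; otherwise matches can be spurious.
    if any(sep in item for item in items):
        raise ValueError("separator occurs inside an item")
    # An empty list yields an empty string, so no query can match it.
    return sep.join(items) + sep if items else ""


def any_endswith(joined, q, sep="~"):
    """Return True if some item in the joined string ends with q."""
    # Assumption: q does not contain sep.
    return q + sep in joined


lst = ["spam", "eggs", "crab"]
joined = build_joined(lst)  # built once, outside the hot loop

for q in ("ab", "gs", "sp"):
    print(q, any_endswith(joined, q))
```

Here "ab" and "gs" are suffixes of an item, while "sp" is only a prefix, so it is not reported as a match.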

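However, if you don't already have the joined string, an approach that doesn't need it may be faster. A generator can be useful because it only produces values as you need them, so once you find an item that ends with the query you can stop instead of checking the rest of the list.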
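To make the early-exit behaviour concrete, here is a small sketch that counts how many endswith checks each variant performs. The function checks_until_answer and the test data are invented for this illustration and are not part of the original benchmark.

```python
def checks_until_answer(items, q, lazy):
    """Return (found, number_of_endswith_calls) for one any() evaluation."""
    calls = 0

    def ends_with(item):
        nonlocal calls
        calls += 1
        return item.endswith(q)

    if lazy:
        # Generator expression: any() stops at the first True value.
        found = any(ends_with(k) for k in items)
    else:
        # List comprehension: the whole list is built before any() runs.
        found = any([ends_with(k) for k in items])
    return found, calls


lst = ["xxab"] + ["zzzz"] * 9999  # the only match is the very first item

print(checks_until_answer(lst, "ab", lazy=True))
print(checks_until_answer(lst, "ab", lazy=False))
print(checks_until_answer(lst, "qq", lazy=True))
```

The generator only saves work when a match exists; with no match at all, both variants check every item.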

This was the fastest for me:
any(k for k in lst if k.endswith(q))

My code:

import timeit

# Setup shared by every timed statement: 10,000 random four-letter strings.
setup = '''
import string
import random
import re

lst = []
for i in range(10000):
    lst.append(''.join(random.choice(string.ascii_letters) for _ in range(4)))

q = 'ab'
'''

# Each measurement is the best of 7 repeats of 500 executions.
print("reference: ")
print(round(min(timeit.Timer("q in lst", setup=setup).repeat(7, 500)), 5))
# 0.05435

print("\nreference with joined string: ")
print(round(min(timeit.Timer("q+'~' in '~'.join(lst) + '~'", setup=setup).repeat(7, 500)), 5))
# 0.05462

print("\nendswith, with list approach: ")
print(round(min(timeit.Timer("any([k for k in lst if k.endswith(q)])", setup=setup).repeat(7, 500)), 5))
# 0.62998

# find() returns -1 (truthy) on a miss, so this filter is not a correct
# suffix test; it is timed here exactly as posted in the question.
print("\nfind method: ")
print(round(min(timeit.Timer("[k for k in lst if k.find(q, len(k)-len(q))]", setup=setup).repeat(7, 500)), 5))
# 1.22274

print("\nregex: ")
print(round(min(timeit.Timer("[k for k in lst if re.search(q + '$', k)]", setup=setup).repeat(7, 500)), 5))
# 3.73494

print("\nregex without list comprehension: ")
print(round(min(timeit.Timer("re.search(q + '~', '~'.join(lst) + '~')", setup=setup).repeat(7, 500)), 5))
# 0.05435

print("\nendswith, with generator approach: ")
print(round(min(timeit.Timer("any((k for k in lst if k.endswith(q)))", setup=setup).repeat(7, 500)), 5))
# 0.02052

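Do you only want to find out whether any item in lst ends with q, or do you need a list of the items that end with q?
I only want to find out whether there is such an item: true/false.
The '~'.join(lst) in the regex search can be assigned outside the loop, which makes the regex search 3x faster when it is used inside a loop.
Very nice. I somehow forgot the obvious q + '~' in joined. After comparing the results against the generator in my code, I found it is the best approach. Thanks :)
any() with a generator is a nice idea, but my loop usually has no hit, so it is not as effective as q + '~' in joined.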
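The no-hit case described in the last comment can be checked with a sketch like the following. It is only an illustration: the data, the function names and the repeat counts are assumptions, and the absolute timings will differ by machine and Python version.

```python
import random
import string
import timeit


def with_generator(items, q):
    """Suffix test via a generator; must scan every item when nothing matches."""
    return any(k.endswith(q) for k in items)


def with_joined(joined, q, sep="~"):
    """Suffix test via one substring search on a pre-joined string."""
    return q + sep in joined


def compare(items, q, number=100, repeat=3):
    """Return the best time in seconds for each approach."""
    joined = "~".join(items) + "~"  # built once, outside the timed code
    return {
        "generator": min(timeit.repeat(lambda: with_generator(items, q),
                                       number=number, repeat=repeat)),
        "joined": min(timeit.repeat(lambda: with_joined(joined, q),
                                    number=number, repeat=repeat)),
    }


random.seed(0)  # fixed seed so the data is repeatable
lst = ["".join(random.choice(string.ascii_lowercase) for _ in range(4))
       for _ in range(10000)]
q = "AB"  # uppercase, so it cannot match any of the lowercase items

for name, seconds in compare(lst, q).items():
    print(name, round(seconds, 5))
```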