Why is string.count() faster than a loop in Python?

On LeetCode there is a problem that asks whether an unordered sequence of the moves "U", "D", "L", "R" forms a circle. My submission is as follows:
def judgeCircle(moves):
    l=r=u=d=0
    for i in moves:
        if i == 'L':
            l+=1
        if i == 'D':
            d+=1
        if i == 'R':
            r+=1
        if i == 'U':
            u+=1
    return ((l-r)==0) and ((u-d)==0)
The judge reported that this took 239 ms.
And here is another, one-line solution:
def judgeCircle(moves):
    # The outer parentheses let the expression continue onto the next line.
    return ((moves.count('R')==moves.count('L')) and
            (moves.count('U')==moves.count('D')))
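It takes only 39 ms.
I understand that less code is usually better, but I thought the second version would loop 4 times. Am I misunderstanding something?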
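Thanks

Both code samples have O(n) algorithmic complexity, but you shouldn't be fooled by big O, because it only shows the trend. The execution time of an O(n) algorithm can be expressed as C*n, where C is a constant that depends on many factors.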
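To make the C*n point concrete, here is a minimal sketch that estimates the per-element cost of both approaches at two input sizes. The function names `count_loop` and `count_builtin` are made up for this illustration, and the absolute numbers will differ from machine to machine. If both approaches really are linear, the per-element time should stay roughly stable as n grows, while differing between the two approaches.

```python
from timeit import timeit


def count_loop(moves):
    # One pass over the input; every comparison and increment
    # is executed by the interpreter.
    l = r = u = d = 0
    for c in moves:
        if c == 'L':
            l += 1
        elif c == 'R':
            r += 1
        elif c == 'U':
            u += 1
        elif c == 'D':
            d += 1
    return l == r and u == d


def count_builtin(moves):
    # Up to four passes over the input, each one inside str.count().
    return (moves.count('L') == moves.count('R')
            and moves.count('U') == moves.count('D'))


for n in (1_000, 10_000):
    moves = 'UDLR' * (n // 4)
    t_loop = timeit(lambda: count_loop(moves), number=20)
    t_builtin = timeit(lambda: count_builtin(moves), number=20)
    # Dividing by n * number gives a rough estimate of the constant C.
    print(n, t_loop / (n * 20), t_builtin / (n * 20))
```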
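For the .count() code, you run 4 loops inside the string_count() C function, but C functions are fast. It also uses some fairly advanced algorithms. Only the string search is executed here, which keeps interpreter overhead to a minimum.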
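In the pure Python code you have only one loop, but each iteration has to execute much more low-level code, because Python is an interpreted language*. For example, you are creating a new unicode or string object for every iteration of the loop, and creating objects is a very expensive operation. Since integer objects are immutable, they have to be recreated for each counter.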
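One way to get a feel for the per-iteration overhead is to look at the bytecode that the interpreter has to execute for the loop. The sketch below uses the standard `dis` module on a cut-down version of the loop (the function name `count_l` is made up for this illustration). The exact instructions and their number depend on the Python version, so no particular output is claimed here.

```python
import dis


def count_l(moves):
    # Cut-down version of the question's loop: count only 'L'.
    l = 0
    for i in moves:
        if i == 'L':
            l += 1
    return l


# Each of these instructions is dispatched by the interpreter;
# the ones inside the loop run once per character of the input.
instructions = list(dis.get_instructions(count_l))
for ins in instructions:
    print(ins.opname, ins.argrepr)

print('total instructions:', len(instructions))
```

By contrast, `moves.count('L')` is a single method call from the interpreter's point of view, and the scanning itself happens in compiled code.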
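*Assuming you are using the CPython interpreter, which is pretty much the default.

Here is some timeit code that shows the speed of various approaches, using perfect data in which all 4 keys have equal counts and random data with roughly equal numbers of each key.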
#!/usr/bin/env python3
''' Test speeds of various algorithms that check
if a sequence of U, D, L, R moves make a closed circle.
See https://stackoverflow.com/q/46568696/4014959
Written by PM 2Ring 2017.10.05
'''
from timeit import Timer
from random import seed, choice, shuffle
from collections import Counter, defaultdict

def judge_JH0(moves):
    l = r = u = d = 0
    for i in moves:
        if i == 'L':
            l += 1
        if i == 'D':
            d += 1
        if i == 'R':
            r += 1
        if i == 'U':
            u += 1
    return ((l-r) == 0) and ((u-d) == 0)

def judge_JH1(moves):
    l = r = u = d = 0
    for i in moves:
        if i == 'L':
            l += 1
        elif i == 'D':
            d += 1
        elif i == 'R':
            r += 1
        elif i == 'U':
            u += 1
    return (l == r) and (u == d)

def judge_count(moves):
    return ((moves.count('R') == moves.count('L')) and
        (moves.count('U') == moves.count('D')))

def judge_counter(moves):
    d = Counter(moves)
    return (d['R'] == d['L']) and (d['U'] == d['D'])

def judge_dict(moves):
    d = {}
    for c in moves:
        d[c] = d.get(c, 0) + 1
    return ((d.get('R', 0) == d.get('L', 0)) and
        (d.get('U', 0) == d.get('D', 0)))

def judge_defdict(moves):
    d = defaultdict(int)
    for c in moves:
        d[c] += 1
    return (d['R'] == d['L']) and (d['U'] == d['D'])

# All the functions
funcs = (
    judge_JH0,
    judge_JH1,
    judge_count,
    judge_counter,
    judge_dict,
    judge_defdict,
)

def verify(data):
    print('Verifying...')
    for func in funcs:
        name = func.__name__
        result = func(data)
        print('{:20} : {}'.format(name, result))
    print()

def time_test(data, loops=100):
    timings = []
    for func in funcs:
        t = Timer(lambda: func(data))
        result = sorted(t.repeat(3, loops))
        timings.append((result, func.__name__))
    timings.sort()
    for result, name in timings:
        print('{:20} : {}'.format(name, result))
    print()

# Make some data
keys = 'DLRU'
seed(42)
size = 100
perfect_data = list(keys * size)
shuffle(perfect_data)
print('Perfect')
verify(perfect_data)
random_data = [choice(keys) for _ in range(4 * size)]
print('Random data stats:')
for k in keys:
    print(k, random_data.count(k))
print()
verify(random_data)
loops = 1000
print('Testing perfect_data')
time_test(perfect_data, loops=loops)
print('Testing random_data')
time_test(random_data, loops=loops)
Typical output
Perfect
Verifying...
judge_JH0 : True
judge_JH1 : True
judge_count : True
judge_counter : True
judge_dict : True
judge_defdict : True
Random data stats:
D 89
L 100
R 101
U 110
Verifying...
judge_JH0 : False
judge_JH1 : False
judge_count : False
judge_counter : False
judge_dict : False
judge_defdict : False
Testing perfect_data
judge_counter : [0.11746118000155548, 0.11771785900054965, 0.12218693499744404]
judge_count : [0.12314812499971595, 0.12353860199800692, 0.12495016200409736]
judge_defdict : [0.20643479600403225, 0.2069275510002626, 0.20834802299941657]
judge_JH1 : [0.25801684000180103, 0.2689959089984768, 0.27642749399819877]
judge_JH0 : [0.36819701099739177, 0.37400564400013536, 0.40291943999909563]
judge_dict : [0.3991459790049703, 0.4004156189985224, 0.4040740730051766]
Testing random_data
judge_count : [0.061543637995782774, 0.06157537500257604, 0.06704995800100733]
judge_counter : [0.11995147699781228, 0.12068584300141083, 0.1207217440023669]
judge_defdict : [0.2096717179956613, 0.21544414199888706, 0.220649760995002]
judge_JH1 : [0.261116588000732, 0.26281095200101845, 0.2706491360004293]
judge_JH0 : [0.38465088899829425, 0.38476935599464923, 0.3921787180006504]
judge_dict : [0.40892754300148226, 0.4094729179996648, 0.4135226650032564]
These timings were obtained on an old 2GHz 32-bit machine running Python 3.6.0 on Linux.

Here are a couple more functions.
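In the timings above, judge_count takes roughly half as long on the random data as on the perfect data. A plausible explanation is that `and` short-circuits: when the 'R' and 'L' counts differ, the 'U' and 'D' counts are never computed. The sketch below illustrates this with a hypothetical helper class, `CountingStr`, which is not part of the original code and only exists to record how often `.count()` is called.

```python
class CountingStr(str):
    """A str subclass that records calls to .count() (illustration only)."""

    def __new__(cls, value):
        obj = super().__new__(cls, value)
        obj.calls = 0
        return obj

    def count(self, *args):
        self.calls += 1
        return super().count(*args)


def judge_count(moves):
    return ((moves.count('R') == moves.count('L')) and
            (moves.count('U') == moves.count('D')))


balanced = CountingStr('UDLR')    # R == L, so the U/D test must also run
unbalanced = CountingStr('RRUD')  # R != L, so `and` stops after the first test

judge_count(balanced)
judge_count(unbalanced)

print(balanced.calls)    # 4
print(unbalanced.calls)  # 2
```

With unbalanced input only two of the four scans are executed, which matches the halving seen in the random-data timings.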
from itertools import groupby

def judge_defdictlist(moves):
    d = defaultdict(list)
    for c in moves:
        d[c].append(c)
    return (len(d['R']) == len(d['L'])) and (len(d['U']) == len(d['D']))

# Sort to groups in alphabetical order: DLRU
# Note: this assumes all four keys occur in moves.
def judge_sort(moves):
    counts = [sum(1 for _ in g) for k, g in groupby(sorted(moves))]
    return (counts[0] == counts[3]) and (counts[1] == counts[2])

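judge_defdictlist is slower than judge_defdict but faster than judge_JH1, and of course it uses more RAM than judge_defdict.
judge_sort is slower than judge_JH0 but faster than judge_dict.

The .count method can loop at C speed, which is faster than an explicit Python loop. But you can speed up the Python loop slightly by using elif, because once a match is found there is no need to run further tests on the given i.

It loops 4 times at C speed (for CPython). In addition, the second version will skip the (moves.count('U')==moves.count('D')) part if the first part is false, because `and` short-circuits.

Depending on the input string and the number of things to count, collections.Counter could be even faster.

@Chris_Rands Yes, Counter can be faster than .count, although it can't save time by short-circuiting when the data is unbalanced. See my answer for some stats.

In fact, since those str constants are immutable, the interpreter reuses the same objects in CPython. There are lots of optimizations like that.

Thanks, guys! I think I understand it better now.

This is a really impressive experiment! I learned a lot, thank you very much!

So this also means that the timings on LeetCode are not very accurate. I should experiment more in the future to get a better understanding of how it works. Tnx!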