Unexpected performance of Luhn's algorithm in Python


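I have implemented two Python versions of the checksum part of the Luhn algorithm, manual_loop and builtin_loop. The two snippets are almost identical; they differ only in how the second sum is computed. Unlike the usual implementation, my approach first computes the plain sum of the digits and then updates that sum with a correction factor.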
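The function bodies are not shown in the question, so the following is only a hypothetical reconstruction, based on the description above and on the NumPy one-liner quoted in the comments. The names manual_loop and builtin_loop come from the timings; the DIFFERENCES table and the exact loop shapes are assumptions.

```python
# Hypothetical reconstruction: the original function bodies are not shown.
# DIFFERENCES[d] is the correction to add when digit d is "doubled" in Luhn:
# doubling d contributes 2*d (or 2*d - 9 when d >= 5), so the correction
# relative to the plain digit is d for d < 5 and d - 9 for d >= 5.
DIFFERENCES = [0, 1, 2, 3, 4, -4, -3, -2, -1, 0]


def manual_loop(xs):
    """Luhn checksum where the correction sum is an explicit for loop."""
    total = sum(xs)
    # Every second digit, counted from the second-to-last one, is doubled.
    for x in xs[len(xs) % 2::2]:
        total += DIFFERENCES[x]
    return total % 10


def builtin_loop(xs):
    """Same checksum, with the correction sum passed to the built-in sum."""
    total = sum(xs)
    total += sum(DIFFERENCES[x] for x in xs[len(xs) % 2::2])
    return total % 10
```

Under these assumptions a valid number gives 0: for the well-known test number 79927398713, both functions return 0 when called on its list of digits.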

I then timed the code in an IPython terminal using timeit on three sample inputs:

from random import randint
randoms5 = [randint(0, 9) for _ in range(10**5)]
randoms6 = [randint(0, 9) for _ in range(10**6)]
randoms7 = [randint(0, 9) for _ in range(10**7)]
The results are as follows:

>>> %timeit manual_loop(randoms5)
2.69 ms ± 17 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit manual_loop(randoms6)
29.3 ms ± 535 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
>>> %timeit manual_loop(randoms7)
311 ms ± 1.52 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

>>> %timeit builtin_loop(randoms5)
3.31 ms ± 146 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit builtin_loop(randoms6)
34.8 ms ± 1.31 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
>>> %timeit builtin_loop(randoms7)
337 ms ± 5.69 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
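What gives? I expected Python's built-in sum to perform better than a loop I write myself, especially for lists of this size.

Note: I have left out other variants, such as doing both sums manually, or swapping which of the two sums uses the built-in sum, because their timings were much worse.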


Edit: As requested, I re-ran the tests with a fixed seed.

import random
random.seed(42)
randoms5 = [random.randint(0, 9) for _ in range(10**5)]
randoms6 = [random.randint(0, 9) for _ in range(10**6)]
randoms7 = [random.randint(0, 9) for _ in range(10**7)]
With the fixed seed, the result for the smallest input is closer to what I expected, but the larger inputs still look odd:

>>> %timeit manual_loop(randoms5)
3.36 ms ± 346 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

>>> %timeit builtin_loop(randoms5)
3.23 ms ± 76.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

>>> %timeit manual_loop(randoms6)
31.2 ms ± 890 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

>>> %timeit builtin_loop(randoms6)
35.4 ms ± 2.27 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

>>> %timeit manual_loop(randoms7)
311 ms ± 5.43 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

>>> %timeit builtin_loop(randoms7)
341 ms ± 8.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

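Your sum call doesn't avoid an interpreted loop. sum itself isn't interpreted, but the generator expression it loops over is. A generator expression has higher overhead than a regular Python loop, because of all the work involved in repeatedly suspending and resuming the generator's stack frame.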
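A minimal sketch of how one might check this claim outside IPython, using the standard timeit module. The function names, the DIFFERENCES table and the time_variants helper are illustrative assumptions, not code from the question. The third variant, which feeds sum through map with a C-implemented callable, is included only as a possible way to keep the per-item work out of Python bytecode; whether it is actually faster depends on the interpreter version and the machine.

```python
import timeit
from random import Random

# Assumed correction table, matching the NumPy snippet quoted in the comments.
DIFFERENCES = [0, 1, 2, 3, 4, -4, -3, -2, -1, 0]


def correction_loop(xs):
    """Plain for loop: one interpreted loop, no generator frame."""
    total = 0
    for x in xs:
        total += DIFFERENCES[x]
    return total


def correction_genexpr(xs):
    """sum over a generator expression: the generator body is still
    interpreted, and its frame is suspended and resumed for every item."""
    return sum(DIFFERENCES[x] for x in xs)


def correction_map(xs):
    """sum over map with a C-implemented callable (list.__getitem__)."""
    return sum(map(DIFFERENCES.__getitem__, xs))


def time_variants(n=10**5, number=10, seed=42):
    """Return seconds per call for each variant on n random digits."""
    rng = Random(seed)
    xs = [rng.randint(0, 9) for _ in range(n)]
    variants = (correction_loop, correction_genexpr, correction_map)
    return {
        f.__name__: timeit.timeit(lambda: f(xs), number=number) / number
        for f in variants
    }
```

Calling time_variants() returns a dictionary of per-call timings that can be compared directly; no particular ordering of the three is guaranteed.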

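Run it with a fixed seed so that I can reproduce it. I'd like to see the exact data.

@MadPhysicist Sure, I'll edit in the results with a fixed seed. Another note: on my machine, vectorizing this algorithm with numpy cut the time down to 0.12 ms. I made xs of type np.array: differences = np.array([0, 1, 2, 3, 4, -4, -3, -2, -1, 0]); total = np.sum(xs) + np.sum(differences[xs[len(xs) % 2::2]]); return total % 10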