Python calculation error with large numbers

python python-2.7

I recently tried to compute Catalan numbers in Python, using two different approaches:

import numpy as np

dp = np.zeros(160)
dp[0] = 1
for i in range(1, 100):
    for j in range(i):
        dp[i] += dp[j] * dp[i - j - 1]

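and a second one based on the factorial formula.

According to the formulas, both should give the same answer, but on my computer they give different results when n is 31 or larger.

For example, given n=31, the first implementation produces

14544636039226908

while the second produces

14544636039226909

The larger n is, the larger the difference.

So what is the cause? And how should I change the two implementations so that they give the same (correct) result?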

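I'm pretty sure this is just a division problem, like 5/2 giving 2. Also, changing it to float won't fix it, because a float only holds a limited number of significant digits (about 15 to 17 for a 64-bit double).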

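You could try computing the whole numerator first, then the whole denominator, and then dividing one by the other. Doing it that way also lets you check with % whether there is a remainder. I don't remember exactly how Catalan numbers work, but this should solve the problem for you. I hope it helps.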
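As a rough sketch of that suggestion (the function name catalan_checked is made up for illustration, and the factorial formula C(n) = (2n)! / (n! (n+1)!) is assumed to be the one the question refers to), the numerator and denominator can be built as exact integers and divided once, with the remainder checked:

```python
import math

def catalan_checked(n):
    """Hypothetical helper: compute C(n) from one exact integer division."""
    numerator = math.factorial(2 * n)
    denominator = math.factorial(n) * math.factorial(n + 1)
    # divmod returns the integer quotient and the remainder in one step.
    quotient, remainder = divmod(numerator, denominator)
    # A non-zero remainder would mean the formula was applied incorrectly.
    assert remainder == 0, "denominator does not divide numerator"
    return quotient

print(catalan_checked(31))
```

Because everything stays in Python integers, no rounding can occur at any step.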

Use integer division (/ in Python 2.x; // in Python 3.x).

If you use integer arithmetic, Python can handle integers of arbitrary length.

I tested this in Python 3.4:

>>> import math
>>> n = 31
>>> math.factorial(n * 2) // math.factorial(n) // math.factorial(n + 1)
14544636039226909

It works for any length:

>>> n = 3000
>>> math.factorial(n * 2) // math.factorial(n) // math.factorial(n + 1)
519462652919542881721365123011179975310102937604940266719385892606880110765316718891395071497514229126429925976055679251223128074749037
835401036449153787085998615080079472024510673995437465556202913988662201476481724554419588352460788248600870845757882846138810676725538
563107883030181266599172195406194674262178494218158106628185084640318133660880669410879631422165901582338980573378926964500556169385404
736100270128669761789892432503454091737948987203916800528049625631943853069946630768308689117691085645832918187925556506072761147675438
429882843604702193420613753732662694259398583327509305925877958076192508779774600671550059625449220766972323426048569573870742646138682
330665271970741737026351041002094725570021658043868050133870464978010862336227347394228402203592519509440711956260056901367528427111161
296369965071015622062369906953928825160542499316029260848901981705520546040735573456838161278143205046287274001985209051501791057064860
777924614712880895844889661062906810651227996795699200705689167041491295132678905362506739442596941049468768934515387686685216725429630
569388433843181310525905915079353425197760576036382793301451923253554632457764696533239230792374371551049829770586784317601794822668699
762524880276131689250405042237665587324829345738473826128110671929192283799781962486065016982222602138402014572024398921586637930463872
133232259555872008143437104541075975585105539708870387267774173630656199269799668692949281254988538412342931876350743005256155083395855
293674222742887729441736406441460871100319788599494948199980318713167545334283812660431840713561226653525108082181718879207846399491603
046897066186692086000900551598963656721594748873629207464689206076706897152859647808013130407215834207952366890322422542440601278699142
2249907274578524259056058561900439043252745600
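To see why the choice of operator matters, here is a small comparison, assuming Python 3, where / is always true division and returns a float. The variable names exact and approx are only for illustration:

```python
import math

n = 31
# Floor division keeps every intermediate value an exact Python int.
exact = math.factorial(2 * n) // math.factorial(n) // math.factorial(n + 1)
# True division (Python 3) converts to float and rounds to about 16 digits.
approx = math.factorial(2 * n) / math.factorial(n) / math.factorial(n + 1)

print(type(exact).__name__, exact)
print(type(approx).__name__, approx)
print(int(approx) == exact)  # False: the float cannot hold this odd value
```

In Python 2.x, / between two integers already floors, so the same expression written with / stays exact there.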

The implementation using math.factorial is also orders of magnitude faster than the iterative version:

p1='''
dp = [1] + [0] * n
for i in range(1,1+n):
    for j in range(i):
        dp[i] += dp[j] * dp[i - j - 1]
'''
p2='''
math.factorial(n * 2) // math.factorial(n) // math.factorial(n + 1)
'''

# benchmarks:

import timeit

>>> timeit.timeit(p1, 'n=300', number=1000)
14.639895505999448

>>> timeit.timeit(p2, 'import math; n=300', number=1000)
0.06054379299166612

>>> timeit.timeit(p1, 'n=3000', number=10)
207.88161920005223

>>> timeit.timeit(p2, 'import math; n=3000', number=10)
0.042887639498803765

Avoid numpy, avoid floating point, and simply let Python work with its native integers:

dp = [0] * 160
dp[0] = 1
for i in range(1, 100):
    for j in range(i):
        dp[i] += dp[j] * dp[i - j - 1]

You will get the desired result:

>>> dp[31]
14544636039226909

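The reason is that numpy.zeros uses float as its default element data type. You didn't show how you retrieve the resulting number in the first version, but I assume you cast it to int; otherwise you would have seen the result as 1.45446360392e+16 or something similar: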

def your_version(n):
    dp = np.zeros(n + 1)
    dp[0] = 1
    for i in range(1, n + 1):
        for j in range(i):
            dp[i] += dp[j] * dp[i - j - 1]
    return int(dp[n])
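The rounding can be reproduced without numpy, since Python's own float is the same 64-bit format as numpy's default float64. A minimal sketch, using the correct value for n=31 quoted in the question:

```python
exact = 14544636039226909        # C(31) as an exact Python int
as_float = float(exact)          # nearest value a 64-bit float can hold

print(exact > 2 ** 53)           # True: past the point where every integer is representable
print(int(as_float) == exact)    # False: an odd integer this large has no float representation
print(int(as_float) - exact)     # the rounding error introduced by the conversion
```

Above 2**53 a 64-bit float can only represent even integers, so an odd result such as this one is necessarily off by one after the round trip.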

If you tell Numpy to use integers, the rounding error disappears: with np.zeros(n+1, dtype=np.uint64), the result is correct.
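One caveat with a fixed-width dtype is that it is only exact while the values fit in 64 bits. The following pure-Python sketch (the helper names catalan_table and largest_n_that_fits are invented for this example) uses the same recurrence with exact integers to find the largest n whose Catalan number still fits under a given maximum:

```python
def catalan_table(limit):
    """Catalan numbers C(0)..C(limit) via the recurrence, as exact ints."""
    dp = [0] * (limit + 1)
    dp[0] = 1
    for i in range(1, limit + 1):
        for j in range(i):
            dp[i] += dp[j] * dp[i - j - 1]
    return dp

def largest_n_that_fits(max_value, limit=64):
    """Largest n (up to limit) with C(n) <= max_value."""
    table = catalan_table(limit)
    return max(n for n, c in enumerate(table) if c <= max_value)

print(largest_n_that_fits(2 ** 63 - 1))  # bound for a signed 64-bit integer
print(largest_n_that_fits(2 ** 64 - 1))  # bound for an unsigned 64-bit integer
```

Beyond those bounds a 64-bit integer array would overflow, whereas Python's own integers simply keep growing.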

By the way, @dlask is right to suggest that you avoid Numpy. Memoized versions of the same formulas are both faster than the Numpy version:

# The recurrent formula, memoized
In [9]: %timeit catalan.catalan(31)
The slowest run took 7.65 times longer than the fastest. This could mean that an intermediate result is being cached
1000000 loops, best of 3: 249 ns per loop

# The factorial formula, memoized
In [10]: %timeit catalan.cat_direct(31)
The slowest run took 10.21 times longer than the fastest. This could mean that an intermediate result is being cached
1000000 loops, best of 3: 1.19 µs per loop

# The recurrent formula with Numpy
In [11]: %timeit catalan.your_version(31)
1000 loops, best of 3: 259 µs per loop

# The factorial version without memoization
In [12]: %timeit catalan.your_other_version(31)
100000 loops, best of 3: 6.98 µs per loop
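The catalan module used in these timings is not shown, so the following is only a guess at what memoized versions might look like. It reuses the names catalan and cat_direct from the timings and assumes Python 3.2 or later for functools.lru_cache:

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def catalan(n):
    """Recurrence C(n) = sum of C(j) * C(n-j-1), with results cached."""
    if n == 0:
        return 1
    return sum(catalan(j) * catalan(n - j - 1) for j in range(n))

@lru_cache(maxsize=None)
def cat_direct(n):
    """Factorial formula with exact integer division, with results cached."""
    return math.factorial(2 * n) // (math.factorial(n) * math.factorial(n + 1))

print(catalan(31), cat_direct(31))
```

With a cache in place, repeated calls such as those made by %timeit mostly measure a dictionary lookup, which would be consistent with the warning about cached intermediate results in the output above.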

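Which one is the correct answer? The first or the second?

Converting a float to an integer truncates it, so whenever the result is slightly less than an integer (which is allowed!), a small error is introduced.

The second one is correct. I strongly recommend reading about how floating-point numbers are represented internally; it is both interesting and helpful.

@RobFoley The OP said that the factorial version gives the correct answer; see the comments.

Changing the numpy array to an integer (int64) type has the same effect:

dp = np.zeros(32, dtype=np.int64)

(You also have to reduce the range to avoid overflow for larger values of n.)

@DanielRenshaw Be careful: there is a difference between the 64-bit integers in your example and Python's unbounded integers.

This method is very slow compared to the one using math.factorial. See my answer.

@mescalinum I confirm that your solution is much faster. It might be even faster if you change the second division into a multiplication: math.factorial(n*2) / (math.factorial(n) * math.factorial(n+1)). It would be faster still if the factorial of n were not computed twice. Anyway, my main intention was to show that numpy is completely useless for this type of calculation.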
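A sketch of that last suggestion, relying on the identity (n+1)! = n! * (n+1) so that the factorial of n is computed only once. The function name is invented, and // is used so that the code stays exact in Python 3 as well:

```python
import math

def catalan_single_factorial(n):
    """Hypothetical variant: C(n) with n! computed a single time."""
    fact_n = math.factorial(n)
    # (n + 1)! equals n! * (n + 1), so the denominator is n! * n! * (n + 1).
    return math.factorial(2 * n) // (fact_n * fact_n * (n + 1))

print(catalan_single_factorial(31))
```

Whether this is measurably faster would need to be checked with timeit; the saving is one factorial call per evaluation.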