Statistics: combinations in Python

python statistics

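I need to compute the number of combinations (nCr) in Python, but I cannot find a function for it in the math, numpy, or stat libraries. Something like a function of this type:

comb = calculate_combinations(n, r)

I need the number of possible combinations, not the actual combinations, so itertools.combinations does not interest me.

Finally, I want to avoid using factorials, because the numbers I will be computing combinations for can get very large, and the factorials would be enormous.

This seems like a really easy question to answer, but I am drowning in questions about generating all the actual combinations, which is not what I want.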

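See scipy.special.comb (scipy.misc.comb in older versions of scipy). When exact is False, it uses the gammaln function to obtain good precision without taking much time. In the exact case it returns an arbitrary-precision integer, which might take a long time to compute.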
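The gammaln idea mentioned above can be sketched with the standard library alone. This is only an illustration of the technique, not scipy's implementation: `comb_approx` is a hypothetical name, and `math.lgamma` stands in for scipy's gammaln. The exact reference value comes from `math.comb`, which assumes Python 3.8 or later.

```python
import math

def comb_approx(n, k):
    """Approximate C(n, k) as a float via log-gamma (hypothetical helper)."""
    if k < 0 or k > n:
        return 0.0
    # log C(n, k) = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    log_c = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_c)

approx = comb_approx(128, 20)
exact = math.comb(128, 20)            # exact integer, Python 3.8+
rel_err = abs(approx - exact) / exact
print(approx, exact, rel_err)
```

The result is a float, so it is fast but only accurate to roughly the precision of a double; for very large arguments `math.exp` overflows, at which point working with the logarithm directly is the usual workaround.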

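A quick search on Google Code turns up an implementation (it uses the following formula):

If you want exact results and speed, try gmpy: gmpy.comb should do exactly what you ask for, and it's pretty fast (of course, as gmpy's original author, I am biased ;-).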

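Why not write it yourself? It's a one-liner or thereabouts:

from functools import reduce  # reduce is no longer a builtin in Python 3
from operator import mul    # or mul=lambda x,y:x*y
from fractions import Fraction

def nCk(n,k): 
  return int( reduce(mul, (Fraction(n-i, i+1) for i in range(k)), 1) )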

Test: printing Pascal's triangle:

>>> for n in range(17):
...     print(' '.join('%5d'%nCk(n,k) for k in range(n+1)).center(100))
...     
                                                   1                                                
                                                1     1                                             
                                             1     2     1                                          
                                          1     3     3     1                                       
                                       1     4     6     4     1                                    
                                    1     5    10    10     5     1                                 
                                 1     6    15    20    15     6     1                              
                              1     7    21    35    35    21     7     1                           
                           1     8    28    56    70    56    28     8     1                        
                        1     9    36    84   126   126    84    36     9     1                     
                     1    10    45   120   210   252   210   120    45    10     1                  
                  1    11    55   165   330   462   462   330   165    55    11     1               
               1    12    66   220   495   792   924   792   495   220    66    12     1            
            1    13    78   286   715  1287  1716  1716  1287   715   286    78    13     1         
         1    14    91   364  1001  2002  3003  3432  3003  2002  1001   364    91    14     1      
      1    15   105   455  1365  3003  5005  6435  6435  5005  3003  1365   455   105    15     1   
    1    16   120   560  1820  4368  8008 11440 12870 11440  8008  4368  1820   560   120    16     1
>>>

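PS. Edited to replace

int(round(reduce(mul, (float(n-i)/(i+1) for i in range(k)), 1)))

with

int(reduce(mul, (Fraction(n-i, i+1) for i in range(k)), 1))

so it won't err for big n/k.

This one was originally written in C++, so it can be backported to C++ for a finite-precision integer (e.g. __int64). The advantages are that (1) it involves only integer operations, and (2) it avoids bloating the integer value by performing successive pairs of multiplication and division. I've tested the result against Nas Banov's Pascal triangle, and it gets the correct answer: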

def choose(n,r):
  """Computes n! / (r! (n-r)!) exactly. Returns a Python int."""
  assert n >= 0
  assert 0 <= r <= n

  c = 1
  # Multiply and divide in successive pairs; c equals C(n, i) after step i.
  for (num,denom) in zip(range(n,n-r,-1), range(1,r+1,1)):
    c = (c * num) // denom
  return c

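To avoid multiplication overflow as much as possible, we evaluate in the following strict order, from left to right:

n / 1 * (n-1) / 2 * (n-2) / 3 * ... * (n-r+1) / r

We can show that integer arithmetic performed in this order is exact (i.e. there is no roundoff error).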
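One way to see why this order stays exact: after the division by i, the running value equals C(n, i), which is an integer, so each floor division leaves no remainder. The following sketch checks that claim step by step. The helper name `partial_products` is an assumption made for this illustration, and `math.comb` (Python 3.8+) is used only as a reference.

```python
import math

def partial_products(n, r):
    """Return the running values of n/1 * (n-1)/2 * ... * (n-r+1)/r."""
    values = []
    c = 1
    for i in range(1, r + 1):
        c = c * (n - i + 1)   # multiply by the next numerator term
        assert c % i == 0     # the division below leaves no remainder
        c //= i               # now c == C(n, i)
        values.append(c)
    return values

steps = partial_products(30, 7)
print(steps)
print(steps == [math.comb(30, i) for i in range(1, 8)])
```

Because every intermediate value is itself a binomial coefficient, the largest number ever held is at most i times the largest C(n, i), which is what makes the scheme attractive for fixed-width integers.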

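A literal translation of the mathematical definition is quite adequate in a lot of cases (remembering that Python will automatically use big-number arithmetic):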
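As a minimal sketch of what such a literal translation might look like (the function name is borrowed from the question, and the range check is an added assumption, not part of the answer):

```python
from math import factorial

def calculate_combinations(n, r):
    """n! / (r! (n-r)!) computed with exact integer arithmetic."""
    if r < 0 or r > n:
        return 0
    # Two floor divisions avoid building the product r! * (n-r)! first.
    return factorial(n) // factorial(r) // factorial(n - r)

print(calculate_combinations(10, 5))  # 252
```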

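For some inputs I tested (e.g. n=1000, r=500) this was more than 10 times faster than the one-liner reduce suggested in another (currently highest-voted) answer. On the other hand, it is outperformed by the snippet provided by @J.F.Sebastian.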

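If you want an exact result, use sympy.binomial (the fastest entry in the timings below). It seems to be the fastest method, hands down.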

x = 1000000
y = 234050

%timeit scipy.misc.comb(x, y, exact=True)
1 loops, best of 3: 1min 27s per loop

%timeit gmpy.comb(x, y)
1 loops, best of 3: 1.97 s per loop

%timeit int(sympy.binomial(x, y))
100000 loops, best of 3: 5.06 µs per loop

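Using dynamic programming, the time complexity is Θ(n*m) and the space complexity is Θ(m) (the version below keeps a full row of n+1 entries, so as written it uses Θ(n) space):

def binomial(n, k):
    """ (int, int) -> int

             | c(n-1, k-1) + c(n-1, k), if 0 < k < n
    c(n,k) = | 1                      , if k = 0 or k = n

    >>> binomial(9, 2)
    36
    """
    c = [0] * (n + 1)
    c[0] = 1
    for i in range(1, n + 1):
        c[i] = 1
        j = i - 1
        while j > 0:
            c[j] += c[j - 1]   # right to left, so c[j-1] is still the previous row
            j -= 1
    return c[k]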
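The listing above keeps a full row of n+1 entries. To actually reach the stated bounds, one option is to keep only the first k+1 entries of each row of Pascal's triangle. The sketch below is a variant written for illustration (the name `binomial_dp` and the range check are assumptions), with the answer's m corresponding to k here.

```python
def binomial_dp(n, k):
    """C(n, k) via Pascal's rule, keeping only k + 1 entries of the row."""
    if k < 0 or k > n:
        return 0
    row = [0] * (k + 1)
    row[0] = 1                       # C(i, 0) == 1 for every i
    for i in range(1, n + 1):
        # Walk right to left so row[j - 1] still holds the previous row.
        for j in range(min(i, k), 0, -1):
            row[j] += row[j - 1]
    return row[k]

print(binomial_dp(9, 2))  # 36
```

Only additions are used, so no intermediate value ever exceeds the final entries of the row being built, at the cost of Θ(n*k) additions.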

This is probably as fast as you can do it in pure Python for reasonably large inputs:

def choose(n, k):
    if k == n: return 1
    if k > n: return 0
    d, q = max(k, n-k), min(k, n-k)
    num =  1
    for i in range(d+1, n+1): num *= i      # n! / d!
    denom = 1
    for j in range(1, q+1): denom *= j      # q!
    return num // denom                     # floor division keeps the result an int

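The direct formula produces big integers when n is bigger than 20 (21! already exceeds 64 bits).

So, yet another response:

from functools import reduce
from operator import mul
from math import factorial

reduce(mul, range(n-r+1, n+1), 1) // factorial(r)

It is short, accurate, and efficient, because it stays in exact integer arithmetic and never forms the full n!.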

Compared with scipy.special.comb, it is more accurate and faster:

 >>> from functools import reduce
 >>> from operator import mul
 >>> from math import factorial
 >>> from scipy.special import comb
 >>> nCr = lambda n,r: reduce(mul, range(n-r+1, n+1), 1) // factorial(r)
 >>> comb(128,20)
 1.1965669823265365e+23
 >>> nCr(128,20)
 119656698232656998274400  # accurate, no loss
 >>> from timeit import timeit
 >>> timeit(lambda: comb(128, 20))
 8.231969118118286
 >>> timeit(lambda: nCr(128, 20))
 3.885951042175293

It's pretty easy with sympy:

import sympy

comb = sympy.binomial(n, r)

Using only the standard library distributed with Python:
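One way this could look, as a sketch under the assumption that `functools.reduce` and `operator.mul` are the intended standard-library tools (the function name `ncr` and the range check are illustrative choices):

```python
import operator
from functools import reduce

def ncr(n, r):
    """C(n, r) using only functools and operator from the standard library."""
    if r < 0 or r > n:
        return 0
    r = min(r, n - r)                                   # C(n, r) == C(n, n - r)
    numer = reduce(operator.mul, range(n, n - r, -1), 1)
    denom = reduce(operator.mul, range(1, r + 1), 1)
    return numer // denom

print(ncr(4, 2))  # 6
```

Using the smaller of r and n - r keeps both products as short as possible, which matters when r is close to n.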

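If your program has an upper bound on n (say n <= N) and needs to compute nCr repeatedly (preferably >> N times), using lru_cache can greatly improve performance: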
from functools import lru_cache

@lru_cache(maxsize=None)
def nCr(n, r):
    return 1 if r == 0 or r == n else nCr(n - 1, r - 1) + nCr(n - 1, r)

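Constructing the cache (which happens implicitly) takes up to O(N^2) time. Any subsequent call to nCr returns in O(1).

You can write 2 simple functions that actually turn out to be about 5-8 times faster than using scipy.special.comb. In fact, you don't need to import any extra packages, and the function is quite easy to read. The trick is to use memoization to store previously computed values, and then apply the definition of nCr.

If we compare times: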
from scipy.special import comb
%timeit comb(100,48)
>>> 100000 loops, best of 3: 6.78 µs per loop

%timeit ncr(100,48)
>>> 1000000 loops, best of 3: 1.39 µs per loop

This is @killerT2333's code using the built-in memoization decorator:
from functools import lru_cache

@lru_cache()
def factorial(n):
    """
    Calculate the factorial of an input using memoization
    :param n: int
    :rtype value: int
    """
    return 1 if n in (1, 0) else n * factorial(n-1)

@lru_cache()
def ncr(n, k):
    """
    Choose k elements from a set of n elements,
    n must be greater than or equal to k.
    :param n: int
    :param k: int
    :rtype: int
    """
    return factorial(n) // (factorial(k) * factorial(n - k))  # floor division keeps the result an int

print(ncr(6, 3))

This function is heavily optimized:
def nCk(n,k):
    m=0
    if k==0:
        m=1
    if k==1:
        m=n
    if k>=2:
        num,dem,op1,op2=1,1,k,n
        while(op1>=1):
            num*=op2
            dem*=op1
            op1-=1
            op2-=1
        m=num//dem
    return m

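Starting with Python 3.8, the standard library includes the math.comb function to compute the binomial coefficient:
math.comb(n, k)
which is the number of ways to choose k items from n items without repetition, n! / (k! (n - k)!).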
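A short usage sketch of math.comb (Python 3.8 or later), showing the exact integer result and how out-of-range arguments behave:

```python
import math

print(math.comb(10, 5))       # 252
print(math.comb(5, 7))        # 0, because k > n

# The result is an exact Python int, however large.
big = math.comb(1000, 500)
print(big.bit_length())

# Negative arguments are rejected rather than silently returning 0.
try:
    math.comb(5, -1)
except ValueError as exc:
    print("ValueError:", exc)
```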
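This is an efficient algorithm:

p = 1
for i = 1.....r

   p = p * ( n - i + 1 ) / i

print(p)

For example, nCr(30,7)
= fact(30) / (fact(7) * fact(23))
= (30 * 29 * 28 * 27 * 26 * 25 * 24) / (1 * 2 * 3 * 4 * 5 * 6 * 7)
So just running the loop from 1 to r gives the result.

In Python:
n, r = 5, 2
p = 1
for i in range(1, r + 1):
    # p equals C(n, i) after each step, so the floor division is exact
    p = p * (n - i + 1) // i
print(p)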
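Indeed, gmpy2.comb() is 10 times faster than choose() from my answer for the code: for k, n in itertools.combinations(range(1000), 2): f(n, k), where f() is either gmpy2.comb() or choose() on Python 3.

Since you're the author of the package, I'll let you fix the broken link so that it points to the right place.…

@SeldomNeedy, the link to code.google.com is the right place (though the site is in archive mode now). Of course, from there it's easy to find the GitHub location and the PyPI location, since it links to both! -)

@AlexMartelli Sorry for the confusion. The page shows a 404 if JavaScript has been (selectively) disabled. I guess that's to discourage rogue AIs from incorporating archived Google Code project sources so easily?

+1 for suggesting writing something simple, for using reduce, and for the cool demo with pascal tria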
import itertools

def nCk(n, k):
    return len(list(itertools.combinations(range(n), k)))

# create a memoization dictionary
memo = {}
def factorial(n):
    """
    Calculate the factorial of an input using memoization
    :param n: int
    :rtype value: int
    """
    if n in [1,0]:
        return 1
    if n in memo:
        return memo[n]
    value = n*factorial(n-1)
    memo[n] = value
    return value

def ncr(n, k):
    """
    Choose k elements from a set of n elements - n must be larger than or equal to k
    :param n: int
    :param k: int
    :rtype: int
    """
    return factorial(n)//(factorial(k)*factorial(n-k))  # floor division keeps the result an int

import math
math.comb(10, 5) # 252
