Converting a Python function with boolean indexing inside a loop to Cython


Below is a function written in pure Python that I would like to Cythonize:

import numpy as np

def do_stuff(M_i, C_i):
    return M_i.dot(C_i).dot(M_i)

def foo(M, C):
    '''
    M : np.array
        N x J matrix
    C : np.array
        J x J matrix
    '''

    N = M.shape[0]

    tot = 0

    for i in range(N):
        nonmiss = ~np.isnan(M[i, :])
        M_i = M[i, nonmiss] # select non empty components of M
        C_i = C[nonmiss, :] # select corresponding components of C
        C_i = C_i[:, nonmiss] # select corresponding components of C

        tot = tot + do_stuff(M_i, C_i)

    return tot

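Assume I already know how to Cythonize the function do_stuff. The actual do_stuff function I am interested in is more complicated than the one above, but I thought I would provide an example. Besides matrix multiplication, the real do_stuff function also computes determinants and takes inverses.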

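My main problem is creating the M_i subvector and the C_i submatrix. I am not sure whether I can do the same boolean indexing in Cython, and if I can, I do not know how. But I can start with the little Cython I do know: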

def foo_c(double[:, ::1] M, double[:, ::1] C):

    cdef int N = M.shape[0]
    cdef double tot = 0
    ...

    for i in range(N):
        ...
        tot = tot + do_stuff_c(M_i, C_i)

    return tot

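You probably will not gain much speed here, because boolean indexing in NumPy is implemented in C anyway and should be fairly fast. The main thing you want to avoid is creating unnecessary intermediates (these involve memory allocation, so they can be slow).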
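One intermediate can be dropped without leaving NumPy at all. The following is a minimal sketch, assuming the same M and C as in the question; the name foo_ix is hypothetical. It uses np.ix_ to turn the boolean mask into row and column index arrays, so the sub-block of C is extracted in a single indexing step without first materialising C[nonmiss, :].

```python
import numpy as np

def do_stuff(M_i, C_i):
    # Same toy quadratic form as in the question
    return M_i.dot(C_i).dot(M_i)

def foo_ix(M, C):
    """Hypothetical variant of foo that selects the sub-block of C in one step."""
    tot = 0.0
    for i in range(M.shape[0]):
        nonmiss = ~np.isnan(M[i, :])
        M_i = M[i, nonmiss]
        # np.ix_ converts the boolean mask into an open mesh of integer indices,
        # so rows and columns are selected together
        C_i = C[np.ix_(nonmiss, nonmiss)]
        tot += do_stuff(M_i, C_i)
    return tot
```

Whether this is measurably faster than the two-step indexing depends on J and on how expensive do_stuff is, so it is worth timing on real data before going further.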

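What you want to do is create temporary arrays for M_i and C_i of the largest possible size (i.e. J or JxJ). As you iterate through isnan(M_i), you keep track of how many values you have actually stored. Then at the end you trim M_i and C_i to only the part you have used: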

Untested code:

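from libc.math cimport isnan  # C-level isnan returns an int, so test it with `not`, never `~`

for i in range(N):
    filled_count_j = 0
    # Scratch buffers of the maximum possible size (J and J x J)
    M_i = np.empty((M.shape[1],))
    C_i = np.empty((M.shape[1], M.shape[1]))
    for j in range(M.shape[1]):
        if not isnan(M[i, j]):
            M_i[filled_count_j] = M[i, j]

            filled_count_k = 0
            for k in range(M.shape[1]):
                if not isnan(M[i, k]):
                    C_i[filled_count_j, filled_count_k] = C[j, k]
                    filled_count_k += 1
            filled_count_j += 1
    # Trim both buffers to the part that was actually filled
    M_i = M_i[:filled_count_j]
    C_i = C_i[:filled_count_j, :filled_count_j]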
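To check the fill-and-trim logic before typing anything in Cython, the same loop can be run as plain Python. This is only a sketch: the name foo_fill_and_trim is hypothetical, the quadratic form stands in for the real do_stuff, and the scratch buffers are allocated once outside the loop rather than once per row.

```python
import math
import numpy as np

def foo_fill_and_trim(M, C):
    """Hypothetical pure-Python rendering of the fill-and-trim loop."""
    N, J = M.shape
    # Allocate the scratch buffers once, at the maximum possible size
    M_buf = np.empty(J)
    C_buf = np.empty((J, J))
    tot = 0.0
    for i in range(N):
        n = 0  # number of non-NaN entries found so far in row i
        for j in range(J):
            if math.isnan(M[i, j]):
                continue
            M_buf[n] = M[i, j]
            m = 0  # column position inside the compacted block
            for k in range(J):
                if not math.isnan(M[i, k]):
                    C_buf[n, m] = C[j, k]
                    m += 1
            n += 1
        # Trim to the filled part; these slices are views, not copies
        M_i = M_buf[:n]
        C_i = C_buf[:n, :n]
        tot += M_i.dot(C_i).dot(M_i)  # stand-in for do_stuff
    return tot
```

One caveat for the Cython port: the trimmed view C_buf[:n, :n] is generally not C-contiguous when n < J, so a do_stuff_c that declares its argument as double[:, ::1] would likely need either a strided double[:, :] signature or a contiguous copy.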

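If you are prepared to work with plain arrays, Cythonizing all of the code may get you some speedup (perhaps 1.2x-5x). But it may not be worth the time it takes and the loss of NumPy's convenience. That depends on the project's priorities and on how heavy the do_stuff() function is.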