Computing conditional probabilities from a joint PMF in numpy is too slow. Ideas? (python, numpy)


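I have a joint probability mass function stored as an array with shape, for example, (1,2,3,4,5,6), and I want to compute the probability table conditioned on the values of some of its dimensions (that is, derive the CPT) for decision-making purposes.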
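To make the goal concrete, here is a minimal sketch of conditioning a small joint PMF on one variable. The 2x3 table and the variable names `A` and `B` are made up for illustration. Dividing by the slice's total is the textbook definition of a conditional distribution; the code later in this thread only selects the slice and does not normalise, so treat the normalisation step as an assumption about what the final CPT should look like.

```python
import numpy as np

# Hypothetical joint PMF over two variables:
# A lives on axis 0 (2 values), B lives on axis 1 (3 values).
joint = np.array([[0.10, 0.20, 0.10],
                  [0.30, 0.10, 0.20]])

b = 1                            # condition on B = 1
slice_b = joint[:, b]            # P(A, B=1), not yet normalised
cpt = slice_b / slice_b.sum()    # P(A | B=1) = P(A, B=1) / P(B=1)
```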

The code I have come up with so far takes as input a dictionary "vdict" of the form {'variable_1': value_1, 'variable_2': value_2, ...}.

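So, what I currently do is:

I translate each variable to its corresponding dimension in the CPT.

I swap the 0th axis with the axis found in the previous step.

I replace the whole 0th axis with just the desired value.

I put the dimension back on its original axis.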

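Now, the problem is that in order to do step 2, I have to (a.) compute a sub-array and (b.) put it in a list and convert it back into an array, so that I end up with my new array.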
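The steps described above can be sketched roughly as follows. This is not the asker's original code; the function name `condition_by_copy` and the test array are hypothetical, and the sketch only illustrates why the list-and-rebuild step allocates a new array on every call.

```python
import numpy as np

def condition_by_copy(d, dim, val):
    """Keep only index `val` along axis `dim`, leaving a length-1 axis there."""
    d = d.swapaxes(0, dim)    # move the conditioned axis to position 0 (a view)
    d = np.array([d[val]])    # (a.) take the sub-array, (b.) wrap it in a list
                              # and rebuild an array: this allocates a copy
    d = d.swapaxes(0, dim)    # move the length-1 axis back to position `dim`
    return d

# Hypothetical joint PMF with three variables.
joint = np.arange(24, dtype=float).reshape(2, 3, 4)
joint /= joint.sum()

out = condition_by_copy(joint, dim=1, val=2)
```

The result keeps the original axis order, with axis 1 reduced to length 1, but it no longer shares memory with `joint`.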

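The trouble is that steps (a.) and (b.) mean I create new objects instead of just using references to the old ones. If d is very large (which happens to me) and the methods that use d are called many times (which also happens to me), the whole thing becomes very slow.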

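So, has anybody come up with an idea that would make this code more elegant and run faster? Perhaps something that would let me compute the conditional in place.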

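Note: I have to keep the original axis order (or at least know for certain how to update the variable-to-dimension dictionaries when an axis is removed). I would rather not use custom dtypes.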

OK, after playing around a bit with numpy's in-place array manipulations, I found the answer myself.

Change the last 3 lines in the loop to:

    d = conditionalize(d, dim, val)

where conditionalize is defined as:

    import numpy as np

    def conditionalize(arr, dim, val):
        arr = arr.swapaxes(dim, 0)          # bring the conditioned axis to the front
        shape = arr.shape[1:]               # shape of the sub-array once that axis is omitted
        count = int(np.prod(shape))         # number of elements in that sub-array
        arr = arr.reshape(-1)               # flatten (a view when possible, otherwise a copy)
        arr = arr[val*count:(val+1)*count]  # keep only the elements where the axis equals val
        arr = arr.reshape((1,) + shape)     # restore the sub-array shape with a length-1 leading axis
        arr = arr.swapaxes(0, dim)          # move the axis back to its original position

        return arr

This cut the execution time of my program from 15 minutes to 6 seconds. A huge gain.

I hope this helps someone who runs into the same problem.
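As a possible alternative to the answer above (not the method it describes), basic slicing with a length-1 slice selects the same sub-array while keeping the axis order, and numpy returns a view rather than a copy for basic slices. The function name `conditionalize_view` and the test array are hypothetical, and the sketch assumes `0 <= val < arr.shape[dim]`.

```python
import numpy as np

def conditionalize_view(arr, dim, val):
    """Keep only index `val` along axis `dim` without copying any data.

    Assumes 0 <= val < arr.shape[dim]; a negative `val` would need
    to be normalised first, since slice(-1, 0) is empty.
    """
    index = [slice(None)] * arr.ndim    # start with ':' on every axis
    index[dim] = slice(val, val + 1)    # a length-1 slice keeps the axis in place
    return arr[tuple(index)]            # basic slicing returns a view

# Hypothetical joint PMF with three variables.
joint = np.arange(24, dtype=float).reshape(2, 3, 4)
joint /= joint.sum()

view = conditionalize_view(joint, dim=1, val=2)
```

Because the result is a view, writing to it also modifies the original array, so copy it first if the joint table must stay untouched.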
