Logarithm of a 2D array in Python


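I have a list of 2D arrays called matrices. Each matrix in it has dimensions 1000 x 1000 and consists of positive values. Now I want to take the log of all values in all matrices (except 0).
How can I do this easily in Python?
I have the following code that does what I want, but knowing Python, this can probably be made much simpler: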

import numpy as np

newMatrices = []
for matrix in matrices:
    newMatrix = []
    for row in matrix:
        newRow = []
        for value in row:
            if value > 0:
                newRow.append(np.log(value))
            else:
                # leave zeros untouched
                newRow.append(value)
        newMatrix.append(newRow)
    newMatrices.append(newMatrix)

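You can convert it to a numpy array and use numpy.log to compute the values. For 0 values, the result will be -Inf. After that, you can convert it back to a list and replace -Inf with 0.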
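A minimal sketch of those steps, assuming one matrix is given as nested lists. The small 2 x 2 input and the names arr, logged and result are made up for illustration; np.errstate is only used to hide the warning that log(0) would otherwise emit.

```python
import numpy as np

# Hypothetical small input standing in for one 1000 x 1000 matrix.
matrix = [[1.0, 0.0], [np.e, 10.0]]

arr = np.asarray(matrix, dtype=float)

# log(0) evaluates to -inf and emits a divide-by-zero warning; hide it here.
with np.errstate(divide="ignore"):
    logged = np.log(arr)

logged[np.isneginf(logged)] = 0   # put the zeros back
result = logged.tolist()          # back to nested lists
```

For a list of matrices, the same steps can be applied to each matrix in turn.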

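Or you can use where in numpy. For example:

res = np.where(arr != 0, np.log2(arr), 0)

Zero elements come out as 0 in the result. Note that log2 is still evaluated on them first, so numpy may emit a divide-by-zero warning.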
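If the warning is a concern, one possible variant (not taken from the answer above) is to pass the condition to the ufunc itself through its where= and out= arguments, so that log2 is never evaluated on the zeros. The array below is a made-up toy input.

```python
import numpy as np

# Hypothetical toy input containing a zero.
arr = np.array([[0.0, 1.0], [2.0, 8.0]])

# Start from zeros; np.log2 only writes the positions where the condition holds,
# so the zero entry keeps the value already stored in `out`.
out = np.zeros_like(arr)
np.log2(arr, out=out, where=arr != 0)
```

Pre-filling out matters here: positions skipped by where= are left untouched, so without the zeros they would hold arbitrary values.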

As @R.yan mentioned, you can try something like this:

import numpy as np

# Convert the nested lists to a single float array and take the log in one call.
newArray = np.asarray(matrices, dtype=float)
with np.errstate(divide="ignore"):  # log(0) gives -inf and would otherwise warn
    logVal = np.log(newArray)
logVal[np.isneginf(logVal)] = 0  # replace -inf (from zeros) with 0

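Another option is to use numpy:

import numpy as np

arr = np.random.rand(1000, 1000)  # placeholder data; use your own matrix here
np.log.at(arr, np.nonzero(arr))  # take the log in place, only at the nonzero positions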

It is as simple as:

import numpy as np
newMatrices = [np.where(matrix != 0, np.log(matrix), 0) for matrix in matrices]

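There is no need to worry about rows and columns; numpy handles that. And when a comprehension is readable enough, there is no need to iterate over the matrices explicitly in a for loop.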
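To make the comprehension concrete, here is a small sketch on made-up 2 x 2 matrices (the toy data and the name new_matrices are assumptions for illustration). Since np.log runs on every element before np.where selects, np.errstate is used to hide the warning for log(0).

```python
import numpy as np

# Hypothetical toy input: two 2 x 2 matrices instead of 1000 x 1000 ones.
matrices = [
    np.array([[1.0, 0.0], [4.0, 9.0]]),
    np.array([[0.0, 2.0], [3.0, 5.0]]),
]

# np.log is evaluated on every element before np.where picks values,
# so log(0) is still computed; errstate only hides the resulting warning.
with np.errstate(divide="ignore"):
    new_matrices = [np.where(m != 0, np.log(m), 0) for m in matrices]

# Each result keeps the shape of its input matrix.
shapes = [m.shape for m in new_matrices]
```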

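Edit: I just noticed that OP has log, not log2.

While @Amadan's answer is certainly correct (and much shorter and more elegant), in your case it may not be the most efficient one (depending on the input, of course), because np.where() will generate an integer index for each matching value. A more efficient approach is to generate a boolean mask. This has two advantages: (1) it is usually more memory-efficient, and (2) the [] operator is usually faster on a mask than on integer lists.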
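As a rough illustration of the memory point only, the sketch below compares the size of a boolean mask with the size of the equivalent integer index arrays returned by np.nonzero. The random all-positive data and the variable names are assumptions; it does not measure speed and says nothing about what np.where() does internally.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, arbitrary choice
arr = rng.random((1000, 1000)) + 0.5    # hypothetical all-positive data

mask = arr > 0                 # one boolean (1 byte) per element
indices = np.nonzero(arr > 0)  # one integer per matching element, per axis

mask_bytes = mask.nbytes
index_bytes = sum(ix.nbytes for ix in indices)

# Both forms select the same elements, in the same order.
same = np.array_equal(arr[mask], arr[indices])
```

When most elements match, as here, the integer indices take several times the memory of the mask; when only a few elements match, the comparison can go the other way.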

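To illustrate this, I re-implemented the np.where()-based and the mask-based solutions on a toy input (but with the correct size). I also included a solution based on np.log.at(), which is also quite inefficient.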

import numpy as np


def log_matrices_where(matrices):
    return [np.where(matrix > 0, np.log(matrix), 0) for matrix in matrices]


def log_matrices_mask(matrices):
    arr = np.array(matrices, dtype=float)
    mask = arr > 0
    arr[mask] = np.log(arr[mask])
    arr[~mask] = 0  # if the values are always positive this is not needed
    return [x for x in arr]


def log_matrices_at(matrices):
    arr = np.array(matrices, dtype=float)
    # compute the mask first: after the in-place log, values in (0, 1) are negative
    mask = arr > 0
    np.log.at(arr, mask)
    arr[~mask] = 0  # if the values are always positive this is not needed
    return [x for x in arr]


N = 1000
matrices = [
    np.arange((N * N)).reshape((N, N)) - N
    for _ in range(2)]

(I ran some sanity checks to make sure all versions do the same thing.)
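The checks themselves are not shown, so the following is only one way they might look. It redefines the three functions so that it runs on its own, uses a much smaller N than the benchmark, and computes the mask before the in-place log in the np.log.at() variant; the names small, where_vs_mask and mask_vs_at are made up for this sketch.

```python
import numpy as np


def log_matrices_where(matrices):
    return [np.where(matrix > 0, np.log(matrix), 0) for matrix in matrices]


def log_matrices_mask(matrices):
    arr = np.array(matrices, dtype=float)
    mask = arr > 0
    arr[mask] = np.log(arr[mask])
    arr[~mask] = 0
    return [x for x in arr]


def log_matrices_at(matrices):
    arr = np.array(matrices, dtype=float)
    mask = arr > 0            # mask taken before the in-place log
    np.log.at(arr, mask)
    arr[~mask] = 0
    return [x for x in arr]


# Small stand-in for the benchmark input; it contains negatives and a zero.
N = 4
small = [np.arange(N * N).reshape((N, N)) - N for _ in range(2)]

# The where-based version evaluates log on non-positive values too,
# so hide the warnings it would emit.
with np.errstate(divide="ignore", invalid="ignore"):
    res_where = log_matrices_where(small)
res_mask = log_matrices_mask(small)
res_at = log_matrices_at(small)

where_vs_mask = all(np.allclose(a, b) for a, b in zip(res_where, res_mask))
mask_vs_at = all(np.allclose(a, b) for a, b in zip(res_mask, res_at))
```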

And the timings on my machine:

%timeit log_matrices_where(matrices)
# 33.8 ms ± 1.13 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit log_matrices_mask(matrices)
# 11.9 ms ± 97 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit log_matrices_at(matrices)
# 153 ms ± 831 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Edit: additionally included the np.log.at() solution and a note about zeroing out the values for which log is undefined.

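"Positive values" means there are no zeros. Did you mean "non-negative values"?

Nice use of numpy as well :-) +1

This may be quite inefficient, since np.where() works with integers internally. As I detail in my answer, the boolean mask approach is usually much faster.

I was going more for programmer comfort than for speed (for many purposes the difference between 11 ms and 33 ms is not very important). If we are guaranteed that there are no negative numbers and want the zeros left alone, arr[~mask] = 0 is not needed.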