Set all elements positioned below a 0 in a matrix to 0 (Python)

python python-3.x numpy

Here is a matrix:

matrix = [[1, 1, 1, 0], 
          [0, 5, 0, 1], 
          [2, 1, 3, 10]]

I want to change every element that sits below a 0 (in the same column) to 0.

The resulting matrix would be:

matrix = [[1, 1, 1, 0], 
          [0, 5, 0, 0], 
          [0, 1, 0, 0]]

I have tried this, but what it returns is empty:

import numpy as np
def transform(matrix):
    newmatrix = np.asarray(matrix)
    i = 0
    j = 0
    for j in range(0, len(matrix[0]) - 1):
        while i

Here is a simple (although not optimized) algorithm:
import numpy as np
from numba import jit

m = np.array([[1, 1, 1, 0], 
              [0, 5, 0, 1], 
              [2, 1, 3, 10]])

@jit(nopython=True)
def zeroer(m):
    a, b = m.shape
    for j in range(b):
        for i in range(a):
            if m[i, j] == 0:
                m[i:, j] = 0
                break
    return m

zeroer(m)

# [[1 1 1 0]
#  [0 5 0 0]
#  [0 1 0 0]]
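
Note that zeroer modifies the array it is given. If the original data has to survive, the same loop can be run on a copy. The following is only a minimal sketch without numba; the name zero_below_first_zero is made up for this illustration and does not come from the answer above.

```python
import numpy as np

def zero_below_first_zero(matrix):
    """Return a copy of `matrix` where each column is zeroed from its first 0 downwards."""
    out = np.array(matrix)  # np.array copies, so the caller's data is left untouched
    n_rows, n_cols = out.shape
    for j in range(n_cols):
        for i in range(n_rows):
            if out[i, j] == 0:
                out[i:, j] = 0  # zero the rest of this column
                break
    return out

m = np.array([[1, 1, 1, 0],
              [0, 5, 0, 1],
              [2, 1, 3, 10]])
result = zero_below_first_zero(m)
print(result)
# [[1 1 1 0]
#  [0 5 0 0]
#  [0 1 0 0]]
```

Here m itself still contains the original values afterwards.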

Method 1 (Original)
Then, with your example data, I get the following mat:
In [195]: matrix = [[1, 1, 1, 0], 
     ...:           [0, 5, 0, 1], 
     ...:           [2, 1, 3, 10]]

In [196]: transform(matrix)
Out[196]: 
array([[1, 1, 1, 0],
       [0, 5, 0, 0],
       [0, 1, 0, 0]])
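
For reference, the transform call in the session above can be reproduced with the masking statement that the timing session on this page lists as transform1. This is a sketch of that definition with the mask pulled out into a named variable (hit_zero is a name chosen here, not one from the answer):

```python
import numpy as np

def transform(matrix):
    # np.asarray does not copy if `matrix` is already an ndarray,
    # so in that case the input is modified in place.
    mat = np.asarray(matrix)
    # True at every position that is a 0 or lies below a 0 in its column
    hit_zero = np.logical_not(np.not_equal(mat, 0).cumprod(axis=0))
    mat[hit_zero] = 0
    return mat

matrix = [[1, 1, 1, 0],
          [0, 5, 0, 1],
          [2, 1, 3, 10]]
result = transform(matrix)
print(result)
```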

Method 2 (further optimized)
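
A sketch of this method, following the transform2 definition from the timing session on this page. The only change assumed here is the built-in bool in place of np.bool, because that alias is no longer available in recent NumPy releases:

```python
import numpy as np

def transform2(matrix):
    mat = np.asarray(matrix)
    # Boolean running product down each column: True until the first 0, False from there on
    keep = (mat != 0).cumprod(axis=0, dtype=bool)
    mat *= keep  # multiplying by False zeroes the element, by True keeps it
    return mat

matrix = [[1, 1, 1, 0],
          [0, 5, 0, 1],
          [2, 1, 3, 10]]
result = transform2(matrix)
print(result)
```

Compared with Method 1, this skips the logical NOT and the boolean indexing and does a single in-place multiplication instead.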
Method 3 (even more optimized)
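Explanation
Let's look at the main statement (in Method 1), mat[np.logical_not(np.not_equal(mat, 0).cumprod(axis=0))] = 0.
We can split it into several "elementary" operations:
Create a boolean mask that is False (numeric 0) where the elements of mat are 0 and True (numeric 1) where they are non-zero: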
mask1 = np.not_equal(mat, 0)


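Using the fact that False is numerically 0, apply the cumprod() function to mask1 along the columns, mask2 = mask1.cumprod(axis=0). Since 1*1 == 1 while 0*0 and 0*1 are 0, all elements of this "mask" will be either 0 or 1. Because of the "cumulative nature" of the product along columns (hence axis=0), they will be 0 only at the positions where mask1 is zero and below (!) those positions.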
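Now we want to set to 0 those elements of mat that correspond to a 0 in mask2. To do this, we create a boolean mask that is True where mask2 is 0 and False elsewhere. This is easily achieved by applying a logical (or binary) NOT to mask2: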
mask3 = np.logical_not(mask2)

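Using the "logical" NOT here creates a boolean array, so we avoid an explicit type conversion.
Finally, we use boolean indexing to select the elements of mat that need to be set to 0, and set them to 0: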
mat[mask3] = 0
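
Put together, the steps can be traced on the example matrix from the question. This is a small sketch that only uses the names from the explanation above (mat, mask1, mask2, mask3):

```python
import numpy as np

mat = np.array([[1, 1, 1, 0],
                [0, 5, 0, 1],
                [2, 1, 3, 10]])

mask1 = np.not_equal(mat, 0)    # True where the element is non-zero
mask2 = mask1.cumprod(axis=0)   # 1 until the first 0 in a column, 0 from there down
mask3 = np.logical_not(mask2)   # True exactly where mat has to become 0
mat[mask3] = 0

print(mask2)
# [[1 1 1 0]
#  [0 1 0 0]
#  [0 1 0 0]]
print(mat)
# [[1 1 1 0]
#  [0 5 0 0]
#  [0 1 0 0]]
```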



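Optional optimization
If you think about it, we can drop steps 3 and 4 (building mask3 and indexing with it) by doing the following instead:
mask2 = mask1.cumprod(axis=0, dtype=bool)  # convert the result to boolean type
mat *= mask2  # combined steps 3 & 4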

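For a complete implementation, see the "Method 2" section above.
Performance
There are also several other answers that use numpy.ufunc.accumulate(). Fundamentally, all of these methods revolve around the idea that 0 is a "special" value, in that 0*anything == 0 or, in the case of @DSM's answer, that False == 0.

A variation on the cumprod approach is to use a cumulative minimum (or maximum). I slightly prefer this because, if you like, you can use it to avoid any arithmetic beyond comparisons, although it is hard to get excited about that: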
In [37]:  m
Out[37]: 
array([[ 1,  1,  1,  0],
       [ 0,  5,  0,  1],
       [ 2,  1,  3, 10]])

In [38]: m * np.minimum.accumulate(m != 0)
Out[38]: 
array([[1, 1, 1, 0],
       [0, 5, 0, 0],
       [0, 1, 0, 0]])

In [39]: np.where(np.minimum.accumulate(m != 0), m, 0)
Out[39]: 
array([[1, 1, 1, 0],
       [0, 5, 0, 0],
       [0, 1, 0, 0]])
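
Because this variant only compares against 0, it also handles negative entries, and the direction can be switched with the axis argument of accumulate. The following is a sketch under those assumptions; the function name zero_after_first_zero and the test matrix with negative values are made up for this illustration.

```python
import numpy as np

def zero_after_first_zero(m, axis=0):
    """Return a new array where everything after the first 0 along `axis` is set to 0."""
    m = np.asarray(m)
    # Running minimum of a boolean array: True until the first False, then False
    keep = np.minimum.accumulate(m != 0, axis=axis)
    return np.where(keep, m, 0)  # builds a new array, the input stays unchanged

m = np.array([[1, -1, 1, 0],
              [0, 5, 0, 1],
              [2, 1, -3, 10]])

by_column = zero_after_first_zero(m)          # zeros propagate downwards
by_row = zero_after_first_zero(m, axis=1)     # zeros propagate to the right
print(by_column)
print(by_row)
```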

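A more optimized version of @AGNGazer's solution, using np.logical_and.accumulate and the implicit boolean conversion of integers (which avoids a lot of multiplication).
Timings: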
%timeit transform2(m) # AGN's solution
The slowest run took 44.73 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 9.93 µs per loop

%timeit transform(m)
The slowest run took 9.00 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 7.99 µs per loop

m = np.random.randint(0,5,(100,100))

%timeit transform(m)
The slowest run took 6.03 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 43.9 µs per loop

%timeit transform2(m)
The slowest run took 4.09 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 50.4 µs per loop

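That looks like a speedup of about 15%.

"I want to change all the elements below 0 to 0 (on the same row)." But all the elements are above 0, so why should the matrix change? Should it be: matrix = [[1, 1, 1, 0], [0, 5, 0, -1], [-2, 1, -3, -10]]?
Actually, I need to multiply by 0 all the numbers that are below a 0 in a column.
@FritzFABO I have edited your question to what I think it should be. Could you take a look?
Why do you return print()? It is always None.
@kmario23 Not really: the binary NOT is applied to not_equal().cumprod().
@kmario23 I rewrote the main statement in a more explicit way.
Or, more verbosely, put a pair of parentheses around it: mat[~((mat != 0).cumprod(axis=0).astype(bool))] = 0, to avoid confusion. Nice solution, by the way :)
@FritzFABO Basically ~ is the binary NOT, but when the argument is of boolean type it can be used in place of the logical NOT. (mat != 0) is the same as not_equal().
@FritzFABO I added 1) a detailed explanation and 2) further simplified/optimized algorithms.
I added a timing comparison section to my answer. It shows that, as posted, your method is slower than all the other methods except my Method 1, which is the slowest of the numpy-based methods.
m * np.minimum.accumulate(m, dtype=bool) also works.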
In [1]: import sys
    ...: import numpy as np
    ...: 

In [2]: print(sys.version)
    ...: 
3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:14:59) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]

In [3]: print(np.__version__)
    ...: 
1.12.1

In [4]: # Method 1 (Original)
    ...: def transform1(matrix):
    ...:     mat = np.asarray(matrix)
    ...:     mat[np.logical_not(np.not_equal(mat, 0).cumprod(axis=0))] = 0
    ...:     return mat
    ...: 

In [5]: # Method 2:
    ...: def transform2(matrix):
    ...:     mat = np.asarray(matrix)
    ...:     mat *= (mat != 0).cumprod(axis=0, dtype=bool)
    ...:     return mat
    ...: 

In [6]: # @DSM method:
    ...: def transform_DSM(matrix):
    ...:     mat = np.asarray(matrix)
    ...:     mat *= np.minimum.accumulate(mat != 0)
    ...:     return mat
    ...: 

In [7]: # @DanielF method:
    ...: def transform_DanielF(matrix):
    ...:     mat = np.asarray(matrix)
    ...:     mat[~np.logical_and.accumulate(mat, axis = 0)] = 0
    ...:     return mat
    ...: 

In [8]: # Optimized @DanielF method:
    ...: def transform_DanielF_optimized(matrix):
    ...:     mat = np.asarray(matrix)
    ...:     mat *= np.logical_and.accumulate(mat, dtype=bool)
    ...:     return mat
    ...: 

In [9]: matrix = np.random.randint(0, 20000, (20000, 20000))

In [10]: %timeit -n1 transform1(matrix)
22.1 s ± 241 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [11]: %timeit -n1 transform2(matrix)
9.29 s ± 185 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [12]: %timeit -n1 transform3(matrix)
9.23 s ± 180 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [13]: %timeit -n1 transform_DSM(matrix)
9.24 s ± 195 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [14]: %timeit -n1 transform_DanielF(matrix)
10.3 s ± 219 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [15]: %timeit -n1 transform_DanielF_optimized(matrix)
9.27 s ± 187 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
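
When reproducing such a comparison, it may be worth keeping in mind that np.asarray does not copy an existing ndarray, so functions that modify mat in place see an already-transformed matrix on every run after the first. The sketch below is one way to sidestep that, not the setup used for the numbers above: each variant works on a copy (so the copy cost is part of every measurement), the variants are first checked against each other, and the standard timeit module is used instead of %timeit. The matrix size, value range, seed, and lower-case function names are assumptions chosen so the script runs quickly.

```python
import timeit
import numpy as np

def transform1(mat):
    mat = mat.copy()
    mat[np.logical_not(np.not_equal(mat, 0).cumprod(axis=0))] = 0
    return mat

def transform2(mat):
    mat = mat.copy()
    mat *= (mat != 0).cumprod(axis=0, dtype=bool)
    return mat

def transform_dsm(mat):
    # The multiplication already produces a new array, so no explicit copy is needed
    return mat * np.minimum.accumulate(mat != 0, axis=0)

def transform_danielf(mat):
    mat = mat.copy()
    mat[~np.logical_and.accumulate(mat != 0, axis=0)] = 0
    return mat

rng = np.random.default_rng(0)                  # fixed seed, chosen arbitrarily
matrix = rng.integers(0, 20, size=(200, 200))   # much smaller than the 20000 x 20000 case
original = matrix.copy()

variants = [transform1, transform2, transform_dsm, transform_danielf]
results = [f(matrix) for f in variants]
all_agree = all(np.array_equal(results[0], r) for r in results[1:])
print("all variants agree:", all_agree)

runs = 20
for f in variants:
    seconds = timeit.timeit(lambda: f(matrix), number=runs)
    print(f"{f.__name__}: {seconds / runs * 1e3:.3f} ms per call")
```

The absolute numbers depend on the machine and the NumPy version, so only the relative ordering on your own setup is meaningful.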

def transform(matrix):
    mat = np.asarray(matrix)
    mat[~np.logical_and.accumulate(mat, axis = 0)] = 0
    return mat

transform(m)
Out:
array([[1, 1, 1, 0],
       [0, 5, 0, 0],
       [0, 1, 0, 0]])
