Python: creating a new array with timesteps and multiple features, e.g. for an LSTM

python arrays performance numpy

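Hello, I'm using numpy to create a new array with timesteps and multiple features for an LSTM.

I have looked at a number of approaches using strides and reshaping, but haven't managed to find an efficient solution.

Here is a function that solves a toy problem. However, I have 30,000 samples, each with 100 features.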

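    import numpy as np

    def make_timesteps(a, timesteps):
        # For every row j, collect rows j, j-1, ..., j-(timesteps-1),
        # wrapping around at the start of the array.
        array = []
        for j in np.arange(len(a)):
            unit = []
            for i in range(timesteps):
                # np.roll shifts the rows down by i, so index j picks a[j - i]
                unit.append(np.roll(a, i, axis=0)[j])
            array.append(unit)
        return np.array(array)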

    inArr = np.array([[1,2],[3,4],[5,6]])
    inArr.shape => (3,2)
    outArr = make_timesteps(inArr, 2)
    outArr.shape => (3,2,2)
    => correct

Is there a more efficient way of doing this (there must be!!)? Can someone please help?

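One trick is to take the last `L-1` rows of the array and prepend them to the start of the array. After that, it is a simple case of using the very efficient NumPy strides (`np.lib.stride_tricks.as_strided`). For anyone wondering about the cost of this trick: as the timing tests later on show, it is next to nothing.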
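To make the padding step concrete, here is a minimal sketch on the toy array from the question. The window length `L = 3` is chosen only for illustration, and the variable name `padded` is not from the answer.

```python
import numpy as np

inArr = np.array([[1, 2], [3, 4], [5, 6]])
L = 3  # window length (number of timesteps); illustrative value, needs L >= 2

# Take the last L-1 rows and stack them in front of the original rows.
padded = np.vstack((inArr[-(L - 1):], inArr))
print(padded)
# [[3 4]
#  [5 6]
#  [1 2]
#  [3 4]
#  [5 6]]

# The padded array has L-1 extra rows, so it holds exactly
# len(inArr) windows of length L.
print(padded.shape[0] - L + 1)  # 3
```

With the wrap-around rows physically present, every original row has `L-1` predecessors in memory, which is what allows the result to be expressed as a plain strided view.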

Code that uses this trick and supports both forward and backward striding would look something like this:

Backward striding:

import numpy as np

def strided_axis0_backward(inArr, L = 2):
    # INPUTS :
    # inArr : Input 2D array
    # L : Number of rows per subarray (window length along axis 0), L >= 2

    # Prepend the last L-1 rows to the start. It just helps in keeping a view output.
    a = np.vstack(( inArr[-L+1:], inArr ))

    # Store shape and strides info
    m,n = a.shape
    s0,s1 = a.strides

    # Length of 3D output array along its axis=0
    nd0 = m - L + 1

    # Start at the first original row and step backwards (-s0) inside each window
    strided = np.lib.stride_tricks.as_strided
    return strided(a[L-1:], shape=(nd0,L,n), strides=(s0,-s0,s1))

Forward striding:

def strided_axis0_forward(inArr, L = 2):
    # INPUTS :
    # inArr : Input 2D array
    # L : Number of rows per subarray (window length along axis 0)

    # Append the first L-1 rows to the end. It just helps in keeping a view output.
    a = np.vstack(( inArr , inArr[:L-1] ))

    # Store shape and strides info
    m,n = a.shape
    s0,s1 = a.strides

    # Length of 3D output array along its axis=0
    nd0 = m - L + 1

    # Start at the first row and step forwards (+s0) inside each window
    strided = np.lib.stride_tricks.as_strided
    return strided(a, shape=(nd0,L,n), strides=(s0,s0,s1))

Sample run:

In [42]: inArr
Out[42]: 
array([[1, 2],
       [3, 4],
       [5, 6]])

In [43]: strided_axis0_backward(inArr, 2)
Out[43]: 
array([[[1, 2],
        [5, 6]],

       [[3, 4],
        [1, 2]],

       [[5, 6],
        [3, 4]]])

In [44]: strided_axis0_forward(inArr, 2)
Out[44]: 
array([[[1, 2],
        [3, 4]],

       [[3, 4],
        [5, 6]],

       [[5, 6],
        [1, 2]]])
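Beyond the visual check above, the strided results can be compared against a plain loop. The sketch below is one way to do that, not part of the original answer: `loop_windows` and `strided_windows` are hypothetical helper names, the strided helper condenses the same padding and strides into a single function, and it assumes `2 <= L <= len(a)`. It also passes `writeable=False` to `as_strided`, since the windows overlap in memory and writing through such a view is unsafe.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def loop_windows(a, L, step):
    # Reference: out[j, k] = a[(j + step*k) % n]; step=-1 is backward, +1 is forward.
    n = len(a)
    return np.array([[a[(j + step * k) % n] for k in range(L)] for j in range(n)])

def strided_windows(a, L, step):
    # Same padding and strides as the two functions above; assumes 2 <= L <= len(a).
    n, f = a.shape
    if step == -1:
        p = np.vstack((a[-L + 1:], a))   # last L-1 rows in front
        start = p[L - 1:]                # begin at the first original row
    else:
        p = np.vstack((a, a[:L - 1]))    # first L-1 rows at the end
        start = p
    s0, s1 = p.strides
    return as_strided(start, shape=(n, L, f),
                      strides=(s0, step * s0, s1), writeable=False)

rng = np.random.default_rng(0)
a = rng.integers(0, 9, size=(7, 3))
results = {
    (L, step): np.array_equal(strided_windows(a, L, step), loop_windows(a, L, step))
    for L in (2, 3, 7)
    for step in (-1, 1)
}
print(all(results.values()))  # True
```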

Runtime test:

In [53]: inArr = np.random.randint(0,9,(1000,10))

In [54]: %timeit make_timesteps(inArr, 2)
    ...: %timeit strided_axis0_forward(inArr, 2)
    ...: %timeit strided_axis0_backward(inArr, 2)
    ...: 
10 loops, best of 3: 33.9 ms per loop
100000 loops, best of 3: 12.1 µs per loop
100000 loops, best of 3: 12.2 µs per loop

In [55]: %timeit make_timesteps(inArr, 10)
    ...: %timeit strided_axis0_forward(inArr, 10)
    ...: %timeit strided_axis0_backward(inArr, 10)
    ...: 
1 loops, best of 3: 152 ms per loop
100000 loops, best of 3: 12 µs per loop
100000 loops, best of 3: 12.1 µs per loop

In [56]: 152000/12.1  # Speedup figure
Out[56]: 12561.98347107438

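Even as we increase the length of the subarrays in the output, the timings for the `strided_axis0` functions stay the same. That shows the huge benefit of strides, and of course the massive speedup over the original loopy version.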
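One way to see why the timings barely depend on `L` is that `as_strided` only builds a new view onto the padded array; no window data is copied until the result is materialized. A small sketch, with array sizes and variable names chosen purely for illustration:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

inArr = np.arange(20).reshape(10, 2)
L = 4

# The only real copy: 10 original rows plus L-1 wrap-around rows.
padded = np.vstack((inArr, inArr[:L - 1]))
s0, s1 = padded.strides

# Forward windows as a read-only view onto `padded`.
windows = as_strided(padded, shape=(len(inArr), L, 2),
                     strides=(s0, s0, s1), writeable=False)

print(padded.size)                            # 26 elements actually stored
print(windows.size)                           # 80 elements addressable via the view
print(np.may_share_memory(windows, padded))   # True

# Copying materializes all 80 elements into fresh memory.
materialized = windows.copy()
print(np.may_share_memory(materialized, padded))  # False
```

Note that downstream code which needs a contiguous array will pay for that copy at that point, so the microsecond timings measure view creation only.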

As promised at the start, here are the timings for the stacking cost with `np.vstack`:
In [417]: inArr = np.random.randint(0,9,(1000,10))

In [418]: L = 10

In [419]: %timeit np.vstack(( inArr[-L+1:], inArr ))
100000 loops, best of 3: 5.41 µs per loop

The timings support stacking as a pretty efficient idea.
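As a possible alternative that was not part of the original answer: NumPy 1.20 and later provide `np.lib.stride_tricks.sliding_window_view`, which computes the strides itself and returns a read-only view by default. The sketch below combines it with the same wrap-around padding; the function names `windows_forward` and `windows_backward` are made up for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # requires NumPy >= 1.20

def windows_forward(a, L):
    # Append the first L-1 rows so the last windows can wrap around.
    padded = np.concatenate((a, a[:L - 1]))
    # sliding_window_view puts the window axis last: (rows, features, L).
    # Transpose to (rows, L, features) to match the layout used above.
    return sliding_window_view(padded, L, axis=0).transpose(0, 2, 1)

def windows_backward(a, L):
    # Prepend the last L-1 rows, then reverse the order inside each window.
    padded = np.concatenate((a[len(a) - (L - 1):], a))
    return sliding_window_view(padded, L, axis=0).transpose(0, 2, 1)[:, ::-1]

inArr = np.array([[1, 2], [3, 4], [5, 6]])
print(windows_forward(inArr, 2).tolist())
# [[[1, 2], [3, 4]], [[3, 4], [5, 6]], [[5, 6], [1, 2]]]
print(windows_backward(inArr, 2).tolist())
# [[[1, 2], [5, 6]], [[3, 4], [1, 2]], [[5, 6], [3, 4]]]
```

Both functions still pay for one padding copy, like the `np.vstack` step timed above; the windows themselves remain views.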
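Thanks a lot, this is really helpful. I had seen this before, but didn't understand it until your example and links! To get the same order, I appended the first row to the end and then applied `np.flip` on axis 1. I've edited the question to show my final code.

@nickyzee I don't think I understand why you need the flip. Is your `make_timesteps` correct? I ask because my code was written to produce the same result as `make_timesteps`. With your flip suggestion, my code produces a different result from `make_timesteps`. Could you clarify?

Interesting: I need the flip to get the same output as my original. A visual inspection of the output confirms this too. Not sure why (numpy@latest and py3.5). My original returns array([[[1,2],[3,4]],[[3,4],[5,6]],[[5,6],[7,8]],[[7,8],[1,2]]])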

@nickyzee Updated with both forward and backward striding versions. Check those out!
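The flip discussion in the comments can be checked numerically. The sketch below uses plain index arithmetic (fancy indexing, which copies) as a stand-in for the strided functions; the helper `windows` is hypothetical. It suggests that flipping the forward windows along axis 1 does not reproduce the backward windows directly, but does reproduce them shifted by `L-1` rows.

```python
import numpy as np

def windows(a, L, step):
    # out[j, k] = a[(j + step*k) % n]; step=+1 is forward, step=-1 is backward.
    n = len(a)
    idx = (np.arange(n)[:, None] + step * np.arange(L)) % n
    return a[idx]

inArr = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
L = 2
forward = windows(inArr, L, +1)
backward = windows(inArr, L, -1)

flipped = np.flip(forward, axis=1)
print(np.array_equal(flipped, backward))                             # False
print(np.array_equal(flipped, np.roll(backward, -(L - 1), axis=0)))  # True
```

For this four-row input, `forward` is the same array as the one quoted in the comment above, which is consistent with the forward version being the one that matches that output without any flip.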