Python: Create a data history from a 2D NumPy array?

python arrays numpy

Suppose I have a 2D NumPy array of shape n x m (where n is large and m >= 1). Each column represents an attribute. An example with n=5, m=3 is given below:

[[1,2,3],
[4,5,6],
[7,8,9],
[10,11,12],
[13,14,15]]

I want to build a history with history_steps=p (1…

You can use dstack + reshape:
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9],
              [10, 11, 12],
              [13, 14, 15]])

# Use `dstack` to stack two shifted copies of `a` along a third axis (one with
# the last row removed, the other with the first row removed), then use
# `reshape` to flatten the second and third dimensions into one.
np.dstack([a[:-1], a[1:]]).reshape(a.shape[0]-1, -1)

#array([[ 1,  4,  2,  5,  3,  6],
#       [ 4,  7,  5,  8,  6,  9],
#       [ 7, 10,  8, 11,  9, 12],
#       [10, 13, 11, 14, 12, 15]])

To generalize to an arbitrary p, use a list comprehension to build the list of shifted arrays, then apply dstack + reshape:
n, m = a.shape
p = 3
np.dstack([a[i:(n-p+i+1)] for i in range(p)]).reshape(n-p+1, -1)

#array([[ 1,  4,  7,  2,  5,  8,  3,  6,  9],
#       [ 4,  7, 10,  5,  8, 11,  6,  9, 12],
#       [ 7, 10, 13,  8, 11, 14,  9, 12, 15]])
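
The dstack + reshape recipe can be wrapped in a small helper so that p becomes a parameter. This is only a sketch: the name `make_history` and the bounds check on p are assumptions for illustration, not part of the original answer.

```python
import numpy as np


def make_history(a, p):
    """Return an (n-p+1, m*p) array in which row i holds rows i..i+p-1 of `a`.

    Hypothetical helper: within each output row the values are grouped by
    column, so the p consecutive values of attribute 0 come first, then the
    p values of attribute 1, and so on.
    """
    a = np.asarray(a)
    n, m = a.shape
    if not 1 <= p <= n:
        raise ValueError("p must satisfy 1 <= p <= number of rows")
    # p shifted copies of `a`, each with n-p+1 rows
    shifted = [a[i:n - p + i + 1] for i in range(p)]
    # dstack gives shape (n-p+1, m, p); reshape flattens the last two axes
    return np.dstack(shifted).reshape(n - p + 1, -1)


a = np.arange(1, 16).reshape(5, 3)
print(make_history(a, 3))
```

With the 5 x 3 example array and p=3 this prints the same three rows as the output shown above.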

Here is a NumPy-based approach that relies on np.lib.stride_tricks.as_strided.
Sample run:
In [27]: a
Out[27]: 
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12],
       [13, 14, 15]])

In [28]: strided_axis0(a, L=2)
Out[28]: 
array([[ 1,  4,  2,  5,  3,  6],
       [ 4,  7,  5,  8,  6,  9],
       [ 7, 10,  8, 11,  9, 12],
       [10, 13, 11, 14, 12, 15]])
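
The definition of strided_axis0 did not survive on this page, so the following is a reconstruction sketched from the sample run and from the mention of np.lib.stride_tricks.as_strided in the comments. It is an assumption about how such a function could look, not the answerer's original code; the contiguity step and the bounds check are additions for safety.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def strided_axis0(a, L=2):
    """Stack L consecutive rows of `a` into each output row (assumed behavior)."""
    a = np.ascontiguousarray(a)
    n, m = a.shape
    if not 1 <= L <= n:
        raise ValueError("L must satisfy 1 <= L <= number of rows")
    s0, s1 = a.strides
    n_windows = n - L + 1
    # Read-only 3D view with windows[i, j, k] == a[i + k, j]; no data is copied here.
    windows = as_strided(a, shape=(n_windows, m, L),
                         strides=(s0, s1, s0), writeable=False)
    # The overlapping view cannot be flattened in place, so reshape returns a copy.
    return windows.reshape(n_windows, -1)


a = np.arange(1, 16).reshape(5, 3)
print(strided_axis0(a, L=2))
```

as_strided does no bounds checking, which is why the shape and strides are derived from the input rather than passed in by the caller. In NumPy 1.20 and later, np.lib.stride_tricks.sliding_window_view(a, L, axis=0) produces the same (n-L+1, m, L) windows with those checks built in.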

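Comments:

Almost every pandas function has an equivalent in numpy, because pandas uses numpy extensively under the hood. Why not read the numpy documentation to find out? (Note that in most cases replacing pd.function with np.function works!)

Yes, I agree. But what about doing it without splitting the data into columns and buffering?

Honestly, I don't fully understand what you are trying to do. What is the logic behind your desired output…

@Julien: The columns represent different attributes, and the rows represent the values of those attributes at a particular timestamp. What I want is to train a machine learning model on sequences of the attributes. I know I could use time series methods, or possibly an RNN, but I don't know them well.

How does p come into play here? What if I want p=3?

Updated with a method that handles the shifts.

Looks good now. Thanks a lot!

This is new. I never knew something like this existed in numpy.

@GKS Yup, np.lib.stride_tricks.as_strided is probably the most esoteric and most efficient thing in NumPy. Used it three times in the past 24 hours to answer questions :)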