Python numpy: quickly creating a recarray with different column types

python numpy

I'm trying to create a recarray, with column names and mixed variable types, from a series of numpy arrays.

The following works, but it is slow:

    import numpy as np
    a = np.array([1,2,3,4], dtype=np.int64)
    b = np.array([6,6,6,6], dtype=np.int64)
    c = np.array([-1.,-2.,-1.,-1.], dtype=np.float32)
    # build one tuple per row, then convert to a structured array
    d = np.array(list(zip(a,b,c)), dtype=[('a',np.int64),('b',np.int64),('c',np.float32)])
    # view the structured array as a recarray (no copy)
    d = d.view(np.recarray)

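I thought there should be a way to do this with np.stack((a,b,c), axis=-1), which is faster than the list(zip()) approach. However, there doesn't seem to be a simple way to stack while preserving the column types. One existing example seems to show how to do it, but it is rather clunky, and I'm hoping there is a better way.

Thanks for your help.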

np.rec.fromarrays is probably what you want:

>>> np.rec.fromarrays([a, b, c], names=['a', 'b', 'c'])
rec.array([(1, 6, -1.), (2, 6, -2.), (3, 6, -1.), (4, 6, -1.)],
          dtype=[('a', '<i8'), ('b', '<i8'), ('c', '<f4')])
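As a self-contained version of the call above, here is a minimal sketch that rebuilds the three arrays from the question and checks that each column keeps its own type. The variable names col_a, col_c and field_types are only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4], dtype=np.int64)
b = np.array([6, 6, 6, 6], dtype=np.int64)
c = np.array([-1., -2., -1., -1.], dtype=np.float32)

# Each input array becomes one named field and keeps its own dtype.
rec = np.rec.fromarrays([a, b, c], names=['a', 'b', 'c'])

# Fields can be read as attributes or by name.
col_a = rec.a
col_c = rec['c']

# Per-field dtypes, in field order: int64, int64, float32.
field_types = [rec.dtype[name] for name in rec.dtype.names]
```

No Python-level tuples are built per row here, which is why this tends to be faster than the list(zip()) approach on long arrays.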

Here is the field-by-field approach I mentioned in my comment:
In [308]:     a = np.array([1,2,3,4], dtype=np.int64)
     ...:     b = np.array([6,6,6,6], dtype=np.int64)
     ...:     c = np.array([-1.,-2.,-1.,-1.], dtype=np.float32)
     ...:     dt = np.dtype([('a',np.int64),('b',np.int64),('c',np.float32)])

(I had to correct a typo in the c you pasted: a comma was missing after -2.)

The only way to use stack is to create recarrays first:
In [315]: [np.rec.fromarrays((i,j,k), dtype=dt) for i,j,k in zip(a,b,c)]
Out[315]: 
[rec.array((1, 6, -1.),
           dtype=[('a', '<i8'), ('b', '<i8'), ('c', '<f4')]),
 rec.array((2, 6, -2.),
           dtype=[('a', '<i8'), ('b', '<i8'), ('c', '<f4')]),
 rec.array((3, 6, -1.),
           dtype=[('a', '<i8'), ('b', '<i8'), ('c', '<f4')]),
 rec.array((4, 6, -1.),
           dtype=[('a', '<i8'), ('b', '<i8'), ('c', '<f4')])]
In [316]: np.stack(_)
Out[316]: 
array([(1, 6, -1.), (2, 6, -2.), (3, 6, -1.), (4, 6, -1.)],
      dtype=(numpy.record, [('a', '<i8'), ('b', '<i8'), ('c', '<f4')]))
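To illustrate why np.stack cannot be applied to a, b and c directly, the following sketch (an illustration, not part of the original answer) stacks the plain arrays and shows that they are promoted to one common dtype. It then joins two pieces that already share a structured dtype, which is the case stacking and concatenation are designed for. The names stacked, full and joined are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4], dtype=np.int64)
b = np.array([6, 6, 6, 6], dtype=np.int64)
c = np.array([-1., -2., -1., -1.], dtype=np.float32)
dt = np.dtype([('a', np.int64), ('b', np.int64), ('c', np.float32)])

# Stacking plain arrays adds a dimension and promotes everything to one dtype:
# the result has shape (4, 3) and dtype float64, so the column types are lost.
stacked = np.stack((a, b, c), axis=-1)

# Arrays that already share the same structured dtype can be joined,
# and the fields survive.
full = np.rec.fromarrays([a, b, c], dtype=dt)
joined = np.concatenate([full[:2], full[2:]])
```

Fields are part of the dtype, not an axis of the array, so stacking along an axis never produces them.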

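list(zip(...)) creates a list of tuples, which is the standard input for a structured array. The alternative is to allocate the array and then copy the values field by field. This is what most of the recfunctions code does (including rec.fromarrays).

np.stack is a form of concatenate, and only works with arrays that have matching dtype. Don't confuse recarray fields with dimensions.

Thanks! Using a for loop is actually quite fast (~0.05 s on six 1D arrays of ~50k values each), and faster than the method I originally posted (~0.8 s on the same dataset).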
The list-of-tuples approach:
In [312]: np.array(list(zip(a,b,c)), dtype=dt)
Out[312]: 
array([(1, 6, -1.), (2, 6, -2.), (3, 6, -1.), (4, 6, -1.)],
      dtype=[('a', '<i8'), ('b', '<i8'), ('c', '<f4')])

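The allocate-and-copy pattern, as it appears in code such as rec.fromarrays (an excerpt, not runnable on its own):
_array = recarray(shape, descr)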
# populate the record array (makes a copy)
for i in range(len(arrayList)):
    _array[_names[i]] = arrayList[i]
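A runnable version of the same field-by-field idea might look like the sketch below. It assumes the three arrays and the dtype from the question; the names columns, out and reference are illustrative, and np.empty is one possible way to allocate the result.

```python
import numpy as np

a = np.array([1, 2, 3, 4], dtype=np.int64)
b = np.array([6, 6, 6, 6], dtype=np.int64)
c = np.array([-1., -2., -1., -1.], dtype=np.float32)
dt = np.dtype([('a', np.int64), ('b', np.int64), ('c', np.float32)])

columns = {'a': a, 'b': b, 'c': c}

# Allocate the structured array once, then copy one whole column per assignment.
out = np.empty(a.shape, dtype=dt)
for name, values in columns.items():
    out[name] = values

# View as a recarray to get attribute access (no extra copy).
rec = out.view(np.recarray)

# For comparison: the slower list-of-tuples construction gives the same rows.
reference = np.array(list(zip(a, b, c)), dtype=dt)
```

The loop runs once per field rather than once per row, so its cost in Python stays small even when the arrays are long.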
