Python: numpy.concatenate on record arrays fails when the arrays have strings of different lengths


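Concatenation fails when trying to concatenate record arrays that have a string-dtype field whose length differs between the arrays.

As you can see in the example below, concatenate works if 'f1' has the same length in both arrays, and fails if it does not.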

In [1]: import numpy as np

In [2]: a = np.core.records.fromarrays( ([1,2], ["one","two"]) )

In [3]: b = np.core.records.fromarrays( ([3,4,5], ["three","four","three"]) )

In [4]: c = np.core.records.fromarrays( ([6], ["six"]) )

In [5]: np.concatenate( (a,c) )
Out[5]: 
array([(1, 'one'), (2, 'two'), (6, 'six')], 
      dtype=[('f0', '<i8'), ('f1', '|S3')])

In [6]: np.concatenate( (a,b) )
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)

/u/jegannas/<ipython console> in <module>()

TypeError: expected a readable buffer object

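Is this a bug in concatenate when concatenating records, or is it the expected behavior? The only way I have found to work around it is the following: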

In [10]: np.concatenate( (a.astype(b.dtype), b) )
Out[10]: 
array([(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four'), (5, 'three')], 
      dtype=[('f0', '<i8'), ('f1', '|S5')])
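
The astype workaround can be generalized to any number of arrays. What follows is only a minimal sketch, not part of the original post: the helper name concat_widest is hypothetical, and it assumes that all arrays share the same field names and that corresponding fields differ only in width. It uses np.rec.fromarrays, the shorter alias of np.core.records.fromarrays. On Python 3 the inferred string field is unicode (for example '<U3') rather than '|S3'.

```python
import numpy as np


def concat_widest(arrays):
    """Hypothetical helper: cast every array to the widest dtype seen
    for each field, then concatenate.

    Assumes identical field names and fields that differ only in width.
    """
    names = arrays[0].dtype.names
    fields = []
    for name in names:
        # Among all arrays, keep the field dtype with the largest itemsize.
        widest = max((arr.dtype[name] for arr in arrays),
                     key=lambda dt: dt.itemsize)
        fields.append((name, widest))
    common = np.dtype(fields)
    # Once every array has the same dtype, np.concatenate has nothing to reconcile.
    return np.concatenate([arr.astype(common) for arr in arrays])


a = np.rec.fromarrays(([1, 2], ["one", "two"]))
b = np.rec.fromarrays(([3, 4, 5], ["three", "four", "three"]))
out = concat_widest([a, b])
print(out)
```

Choosing by itemsize is a deliberate simplification; it would not be a safe rule for fields of different kinds (say, a float field in one array and an integer field in another).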

When you do not specify a dtype, np.rec.fromarrays (aka np.core.records.fromarrays) tries to guess the dtype for you. Thus:

In [4]: a = np.core.records.fromarrays( ([1,2], ["one","two"]) )

In [5]: a
Out[5]: 
rec.array([(1, 'one'), (2, 'two')], 
      dtype=[('f0', '<i4'), ('f1', '|S3')])
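
To see the consequence of this guessing, one can compare the dtypes inferred for the two arrays from the question. This is a small illustrative sketch, assuming Python 3, where the inferred string field is unicode ('U') rather than bytes ('S'); the variable names are taken from the question.

```python
import numpy as np

a = np.rec.fromarrays(([1, 2], ["one", "two"]))
b = np.rec.fromarrays(([3, 4, 5], ["three", "four", "three"]))

# The string field is sized to the longest string in each array,
# so the two record dtypes are not equal.
print(a.dtype["f1"], b.dtype["f1"])
same_dtype = a.dtype == b.dtype
print(same_dtype)  # False
```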

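If a and b are instead created with an explicit dtype whose string field is wide enough for every value (the output below shows '|S8'), then concatenation works as desired: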
In [11]: np.concatenate( (a,b))
Out[11]: 
array([(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four'), (5, 'three')], 
      dtype=[('f0', '<i4'), ('f1', '|S8')])
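
A sketch of that approach, under stated assumptions: the width 8 is an arbitrary choice that happens to fit the sample strings, the name common is introduced here for illustration, and 'U8' is used instead of '|S8' because the inputs are Python 3 str values.

```python
import numpy as np

# One shared dtype, declared up front; the string field holds up to 8 characters.
common = np.dtype([("f0", "<i4"), ("f1", "U8")])

a = np.rec.fromarrays(([1, 2], ["one", "two"]), dtype=common)
b = np.rec.fromarrays(([3, 4, 5], ["three", "four", "three"]), dtype=common)

# Both arrays now have identical dtypes, so concatenate needs no conversion.
joined = np.concatenate((a, b))
print(joined)
```

Strings longer than the declared width would be truncated, so this only fits cases where an upper bound is known in advance.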


Would numpy.lib.recfunctions.merge_arrays work for you? recfunctions is a little-known subpackage that is not widely advertised; it is a bit clunky, but can sometimes be useful.

Posting a complete answer. As Pierre GM suggested, the module:
import numpy.lib.recfunctions

provides a solution. However, the function that does what is needed is:
numpy.lib.recfunctions.stack_arrays((a,b), autoconvert=True, usemask=False)

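(usemask=False is only there to avoid creating a masked array that you probably won't use. The important part is autoconvert=True, which forces the conversion of a's dtype from "|S3" to "|S5".)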
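
For completeness, here is a self-contained sketch of the same call applied to the arrays from the question. The alias rfn and the variable name stacked are choices made here, and on Python 3 the string field comes out as unicode rather than '|S5'.

```python
import numpy as np
import numpy.lib.recfunctions as rfn  # must be imported explicitly

a = np.rec.fromarrays(([1, 2], ["one", "two"]))
b = np.rec.fromarrays(([3, 4, 5], ["three", "four", "three"]))

# autoconvert=True widens mismatched field dtypes instead of raising;
# usemask=False returns a plain (unmasked) array.
stacked = rfn.stack_arrays((a, b), autoconvert=True, usemask=False)
print(stacked)
print(stacked.dtype)
```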
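Which numpy version is this? I don't see it here.

@SenthilBabu You have to import it explicitly: import numpy.lib.recfunctions. (Other than that, it has existed at least since version 1.6.)

I don't think I need merge_arrays. It might only be useful when I have undefined values; otherwise merge_arrays is exactly the same as numpy.core.records.fromarrays. However, there is also stack_arrays, which has the autoconvert keyword: numpy.lib.recfunctions.stack_arrays((a,b), autoconvert=True, usemask=False)
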
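I know. That is exactly my question. All of these individual arrays are generated by a submodule (over which I have no control). I just need to concatenate all of them.

Rather than keeping track of the maximum size for each array, I suggest you choose in advance a number n large enough to accommodate all the strings. This differs from the idea you raised in your question. However, if such a number n cannot be known in advance, you can use the 'object' dtype instead. I have also edited my post to demonstrate this.
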
In [35]: a = np.core.records.fromarrays( ([1,2], ["one","two"]), dtype = [('f0', '<i4'), ('f1', 'object')])

In [36]: b = np.core.records.fromarrays( ([3,4,5], ["three","four","three"]), dtype = [('f0', '<i4'), ('f1', 'object')])

In [37]: np.concatenate( (a,b))
Out[37]: 
array([(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four'), (5, 'three')], 
      dtype=[('f0', '<i4'), ('f1', '|O4')])
