Python: converting a NumPy matrix of strings to a shared string array causes a type mismatch (python, numpy, multiprocessing, ctypes, shared-memory)


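I'm experimenting with multiprocessing and shared memory, but I'm having trouble creating a shared array. The example below illustrates my problem:

Following another answer (which uses a matrix full of floats; slightly different, but the same principle), I want to convert a NumPy matrix of strings into a shared memory space that processes can use. I have the following: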

from ctypes import c_wchar_p
import numpy as np
from multiprocessing.sharedctypes import Array

input_array = np.array([['Red', 'Green', 'Blue', 'Yellow'],
                        ['Purple', 'Orange', 'Cyan', 'Pink']]).T

shared_memory = Array(c_wchar_p, input_array.size, lock=False) # Equivalent to just using a RawArray
np_wrapper = np.frombuffer(shared_memory, dtype='<U1').reshape(input_array.shape)
np.copyto(np_wrapper, input_array)
print(np_wrapper)

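Things I have tried to fix the problem:

I tried changing the dtype passed to the function.

The following may help in understanding what this string array contains: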
In [643]: input_array = np.array([['Red', 'Green', 'Blue', 'Yellow'],
     ...:                         ['Purple', 'Orange', 'Cyan', 'Pink']]).T
In [644]: input_array.size
Out[644]: 8
In [645]: input_array.itemsize
Out[645]: 24
In [646]: input_array.nbytes
Out[646]: 192

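Because it is a transpose, the shape and strides differ from those of the input array, but the strings are still stored in their original order: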
In [647]: input_array.__array_interface__
Out[647]: 
{'data': (139792902236880, False),
 'strides': (24, 96),
 'descr': [('', '<U6')],
 'typestr': '<U6',
 'shape': (4, 2),
 'version': 3}
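
As a rough sketch of why the c_wchar_p buffer from the question cannot be wrapped directly, the check below compares the bytes the string data occupies with the bytes an array of pointers provides. The names needed_bytes and pointer_bytes are illustrative only, and the pointer size depends on the platform (8 bytes on a typical 64-bit build).

```python
import ctypes
import numpy as np

input_array = np.array([['Red', 'Green', 'Blue', 'Yellow'],
                        ['Purple', 'Orange', 'Cyan', 'Pink']]).T

# Bytes the string data actually occupies: 8 items x 24 bytes per '<U6' item.
needed_bytes = input_array.size * input_array.itemsize

# Bytes provided by a ctypes array holding one c_wchar_p (a pointer) per item.
pointer_bytes = input_array.size * ctypes.sizeof(ctypes.c_wchar_p)

print(needed_bytes, pointer_bytes)
```

The two numbers differ, so a buffer allocated as one pointer per string is not the same size as the packed string data that NumPy expects.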

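Comments:

The strings are stored as U6 (24-byte) items, all packed into the array's data buffer. They do not reference strings elsewhere in memory, as they would in a list (or an object-dtype array). Check input_array.itemsize.

Yes, I have found the solution; please give me a second to post it (I am literally typing it as we speak).

Preface

Before going into the details of my solution, I want to preface my answer with some useful information. Python's built-in memoryview() is very helpful for getting the full picture. For example, with the data type of input_array specified as dtype='S6' (because there are fewer bytes to examine):

print(bytes(memoryview(input_array)))

this yields:

b'Red\x00\x00\x00PurpleGreen\x00OrangeBlue\x00\x00Cyan\x00\x00YellowPink\x00\x00'

From this output we can see that each entry in input_array is 6 bytes long and that the entries sit in one contiguous block of memory. This tells us that the NumPy array is not just 8 pointers to strings in memory.

Going back to the case where dtype is not specified, @hpaulj also provides more useful insight. input_array then has the type <U6, which reads as:

<  -- Little-Endian (b/c I am on an Intel-based system)
U  -- Unicode string (NumPy stores 4 bytes per character)
6  -- 6 characters, so 24 bytes per entry in the array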
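
The fixed-width layout described above can be checked with a small sketch. It uses ndarray.tobytes() rather than memoryview() to obtain the bytes in row-major order, and the names raw, width, and records are illustrative, not part of the original answer.

```python
import numpy as np

input_array = np.array([['Red', 'Green', 'Blue', 'Yellow'],
                        ['Purple', 'Orange', 'Cyan', 'Pink']], dtype='S6').T

raw = input_array.tobytes()   # copy of the items in C (row-major) order
width = input_array.itemsize  # 6 bytes per 'S6' item

# Split the flat byte string into fixed-width records, one per array item.
records = [raw[i:i + width] for i in range(0, len(raw), width)]
print(records)

# The default Unicode dtype uses 4 bytes per character instead of 1.
unicode_itemsize = np.dtype('<U6').itemsize
print(unicode_itemsize)
```

Shorter strings are padded with null bytes up to the item width, which is why every record has the same length.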

from ctypes import c_char
import numpy as np
from multiprocessing.sharedctypes import Array

input_array = np.array([['Red', 'Green', 'Blue', 'Yellow'],
                        ['Purple', 'Orange', 'Cyan', 'Pink']]).T

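# Size the buffer in bytes: 8 items x 24 bytes per '<U6' item = 192 bytes
shared_memory = Array(c_char, input_array.size * input_array.itemsize, lock=False)
# Reinterpret the raw bytes with the source dtype, then restore the 2-D shape
np_wrapper = np.frombuffer(shared_memory, dtype=input_array.dtype).reshape(input_array.shape)
# Copy the strings into the shared buffer through the NumPy view
np.copyto(np_wrapper, input_array)

print(shared_memory[:])  # the raw bytes held in the shared buffer
print(np_wrapper)        # the same data viewed as a (4, 2) string array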
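
The same steps can be wrapped in a small helper, sketched below under the assumption that a RawArray of c_char is acceptable (the question notes that Array with lock=False is equivalent). The function name to_shared is hypothetical, and the sketch only checks, within a single process, that a second NumPy view over the buffer sees writes made through the first; handing the buffer to worker processes is not shown.

```python
from ctypes import c_char
import numpy as np
from multiprocessing.sharedctypes import RawArray


def to_shared(array):
    """Hypothetical helper: copy `array` into a RawArray, return (buffer, view)."""
    shared = RawArray(c_char, array.nbytes)  # one c_char per byte of data
    view = np.frombuffer(shared, dtype=array.dtype).reshape(array.shape)
    np.copyto(view, array)                   # handles the transposed strides
    return shared, view


input_array = np.array([['Red', 'Green', 'Blue', 'Yellow'],
                        ['Purple', 'Orange', 'Cyan', 'Pink']]).T

shared, view = to_shared(input_array)

# A write through one view is visible through another view of the same buffer.
view[0, 0] = 'Black'
second_view = np.frombuffer(shared, dtype=input_array.dtype).reshape(input_array.shape)
print(second_view)
```

Sizing the buffer from array.nbytes keeps the helper independent of the string width, so it would work equally for other fixed-width dtypes.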