Python: Assigning a data type with array.dtype = <data type> gives ambiguous results for a NumPy array


python numpy ipython jupyter-notebook


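I'm new to programming and to NumPy. While reading tutorials and experimenting in a Jupyter notebook, I thought of converting the data type of a NumPy array as follows: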

import numpy as np
c = np.random.rand(4)*10   # four random float64 values in [0, 10)
print(c)
#Output1: [ 0.12757225  5.48992242  7.63139022  2.92746857]
c.dtype = int              # reinterprets the existing bytes; converts nothing
print(c)
#Output2: [4593764294844833304 4617867121563982285 4620278199966380988 4613774491979221856]

I know the correct way to change it is:

c = c.astype(int)

But I'd like to understand the reason behind those strange numbers in Output2. What are they, and what do they represent?

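Floats and integers (numpy.float64s and numpy.int64s) are represented differently in memory. The value 42 stored in each of these types corresponds to a different bit pattern in memory.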
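To make the bit-pattern claim concrete, here is a minimal sketch that prints the 64 bits storing 42 as an integer and as a float. The helper name bit_pattern is made up for this illustration. It uses ndarray.view to read the stored bytes as an unsigned 64-bit integer without converting them.

```python
import numpy as np

def bit_pattern(value, dtype):
    """Hex string of the 64 bits that store `value` as `dtype` (hypothetical helper)."""
    stored = np.array([value], dtype=dtype)
    # view() reinterprets the same 8 bytes as uint64; nothing is converted.
    return format(int(stored.view(np.uint64)[0]), "016x")

int_bits = bit_pattern(42, np.int64)
float_bits = bit_pattern(42, np.float64)
print(int_bits)    # 000000000000002a
print(float_bits)  # 4045000000000000
```

The integer stores 42 directly in its low bits, while the float stores a sign, an exponent, and a fraction, so the two patterns have nothing in common.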

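When you reassign an array's dtype attribute, the underlying data stays unchanged and you merely tell NumPy to interpret the same bit pattern in a new way. Since the new interpretation no longer matches how the data was originally defined, you end up with gibberish (meaningless numbers).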
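One way to check this explanation is to run the reinterpretation in reverse. The sketch below takes the integers from Output2 in the question and reads their bytes back as float64, assuming the question was run on a platform where int means a 64-bit integer. It uses ndarray.view rather than assigning to dtype, so the input array is left alone.

```python
import numpy as np

# The "ambiguous" integers copied from Output2 in the question.
output2 = np.array([4593764294844833304, 4617867121563982285,
                    4620278199966380988, 4613774491979221856], dtype=np.int64)

# Read the very same bytes as float64 again.
recovered = output2.view(np.float64)
print(recovered)  # close to the values printed as Output1
```

The recovered values agree with Output1 to the printed precision, which suggests the large integers are simply the raw float64 bit patterns read as int64.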

On the other hand, converting the array with .astype() actually converts the data in memory:

>>> import numpy as np
>>> arr = np.random.rand(3)
>>> arr.dtype
dtype('float64')
>>> arr
array([ 0.7258989 ,  0.56473195,  0.20885672])
>>> arr.astype(np.int64)    # returns a new, converted array; arr itself is unchanged
array([0, 0, 0])
>>> arr.data
<memory at 0x7f10d7061288>
>>> arr.dtype = np.int64    # reinterprets the same bytes in place
>>> arr.data
<memory at 0x7f10d7061348>
>>> arr
array([4604713535589390862, 4603261872765946451, 4596692876638008676])

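As you can see, using astype meaningfully converts the original values of the array, in this case truncating them to their integer part, and returns a new array with the corresponding values and dtype.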
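The difference between converting and reinterpreting can be sketched with fixed values instead of random ones. The array contents here are chosen only for illustration. np.shares_memory reports whether two arrays overlap in memory.

```python
import numpy as np

arr = np.array([0.7, 5.5, -2.9])

# astype builds a NEW array and converts each value (truncation toward zero).
converted = arr.astype(np.int64)
print(converted.tolist())                # [0, 5, -2]
print(np.shares_memory(arr, converted))  # False: separate buffer

# view keeps the SAME buffer and only changes how its bytes are read.
reinterpreted = arr.view(np.int64)
print(np.shares_memory(arr, reinterpreted))  # True: same bytes
print(arr.tolist())                      # original values are untouched
```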

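Note that assigning a new dtype does not check whether the values still make sense, so you can do very weird things to your array. In the example above, 64-bit floats were reinterpreted as 64-bit integers. But you can also change the bit size: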

>>> arr = np.random.rand(3)
>>> arr.shape
(3,)
>>> arr.dtype
dtype('float64')
>>> arr.dtype = np.float32
>>> arr.shape
(6,)
>>> arr
array([  4.00690371e+35,   1.87285304e+00,   8.62005305e+13,
         1.33751166e+00,   7.17894062e+30,   1.81315207e+00], dtype=float32)

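By telling NumPy that each element takes up half the space it originally did, you make NumPy deduce that your array has twice as many elements! This is clearly not something you should ever do.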
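The element-count arithmetic can be checked directly: the buffer size in bytes stays fixed, so halving the item size doubles the length. This sketch uses ndarray.view so the original array keeps its dtype. It also shows that NumPy does refuse a reinterpretation when the byte count does not divide evenly, which is a check on sizes only, not on meaning.

```python
import numpy as np

arr = np.arange(3, dtype=np.float64)
half = arr.view(np.float32)   # same 24 bytes, read as 4-byte items

print(arr.shape, arr.itemsize, arr.nbytes)     # (3,) 8 24
print(half.shape, half.itemsize, half.nbytes)  # (6,) 4 24

# Three bytes cannot be split into whole 2-byte items, so this fails.
try:
    np.zeros(3, dtype=np.uint8).view(np.uint16)
    size_error = False
except ValueError:
    size_error = True
print(size_error)  # True
```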

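Another example: consider the 8-bit unsigned integer 255 = 2**8-1, which corresponds to 11111111 in binary. Now try to reinterpret two of these numbers as a single 16-bit unsigned integer: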

>>> arr = np.array([255,255],dtype=np.uint8)
>>> arr.dtype = np.uint16
>>> arr
array([65535], dtype=uint16)

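As you can see, the result is the single number 65535. If that doesn't ring a bell, it is exactly 2**16-1, whose binary pattern consists of 16 ones. The two all-ones patterns were reinterpreted as a single 16-bit number, and the result changed accordingly. The reason you usually see much weirder numbers is that reinterpreting floats as integers, or vice versa, mangles the data far more severely because of the way floating-point numbers are represented in memory.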
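The bit patterns in this example can be printed explicitly. The sketch below is only an illustration of the arithmetic. It uses np.binary_repr for the 8-bit value and ndarray.view for the reinterpretation.

```python
import numpy as np

print(np.binary_repr(255, width=8))   # 11111111

pair = np.array([255, 255], dtype=np.uint8)
merged = int(pair.view(np.uint16)[0])  # the same two bytes read as one uint16

print(merged)                  # 65535
print(merged == 2**16 - 1)     # True
print(format(merged, "016b"))  # 1111111111111111
```

Because both bytes are all ones, the result does not depend on byte order. With two different bytes, the merged value would depend on the endianness of the machine.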

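As noted earlier, you can perform this reinterpretation of the data directly by constructing a new array with a modified dtype. This is probably more useful than having to reassign the dtype of a given array, but then again, changing the dtype is only useful in very rare, very specific use cases.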
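The "new array with a modified dtype" mentioned here presumably refers to ndarray.view, which is an assumption on my part since the text does not name the method. As far as I know, the NumPy documentation discourages assigning to the dtype attribute and points to view and astype instead. A minimal sketch of the view-based approach:

```python
import numpy as np

arr = np.array([255, 255], dtype=np.uint8)
as16 = arr.view(np.uint16)   # new array object over the same bytes

print(arr.dtype, arr.shape)    # uint8 (2,)  (original is left as it was)
print(as16.dtype, as16.shape)  # uint16 (1,)

# Both arrays share one buffer, so writing through the view changes arr too.
as16[0] = 0
print(arr.tolist())            # [0, 0]
```

Unlike assigning to arr.dtype, this leaves the original array's dtype and shape intact, which makes the reinterpretation easier to spot when reading the code.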

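I'm almost surprised that direct assignment is allowed. It feels dangerous. Assigning shape directly is allowed too, but I rarely use it.

@hpaulj The same goes for strides; I'd rather not assign to these attributes.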
