Understanding dask array chunks
I am trying to understand how array chunking works by reading the documentation. Below is the output of a Python session in which I try to reproduce the example.
In [1]: import numpy as np
In [2]: npa = np.array([
...: [1, 2, 3, 4, 5, 6],
...: [7, 8, 9, 0, 1, 2],
...: [3, 4, 5, 6, 7, 8],
...: [9, 0, 1, 2, 3, 4],
...: [5, 6, 7, 8, 9, 0],
...: [1, 2, 3, 4, 5, 6]
...: ])
In [3]: import dask.array as da
In [4]: a = da.from_array(npa, chunks=3)
In [5]: a
Out[5]: dask.array<array, shape=(6, 6), dtype=int64, chunksize=(3, 3), chunktype=numpy.ndarray>
Given the shape, I can only read out two blocks. Reading a third block raises an IndexError:
In [7]: a.blocks[2]
...
IndexError: Index is not smaller than dimension 2 >= 2
I expected four 3x3 blocks, not two 3x6 blocks.
What am I not understanding about how array chunking works in dask?
Your chunks/blocks are two-dimensional, so you can index them in two dimensions.
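A minimal sketch of what that looks like, reusing the `npa` array from the question. The `.blocks` accessor takes one index per dimension, so a single integer such as `a.blocks[0]` selects a whole row of blocks rather than one 3x3 block. The name `bottom_right` is only illustrative.

```python
import numpy as np
import dask.array as da

npa = np.array([
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 0, 1, 2],
    [3, 4, 5, 6, 7, 8],
    [9, 0, 1, 2, 3, 4],
    [5, 6, 7, 8, 9, 0],
    [1, 2, 3, 4, 5, 6],
])
a = da.from_array(npa, chunks=3)

# Two blocks along each axis, so four 3x3 blocks in total.
print(a.numblocks)   # (2, 2)
print(a.chunks)      # ((3, 3), (3, 3))

# One index selects along the first block axis only: a full block row.
print(a.blocks[0].shape)      # (3, 6)

# Two indices select a single 3x3 block.
print(a.blocks[1, 1].shape)   # (3, 3)

# Materialize the bottom-right block as a NumPy array.
bottom_right = a.blocks[1, 1].compute()
print(bottom_right)
```

Since there are only two blocks along the first axis, `a.blocks[2]` is out of range, which matches the IndexError in the question.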
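Agreed, this does look odd. Maybe something changed? I would suggest filing an issue.
In the example outlined above, dask builds two (3, 6) arrays, which is odd given the docs.
Thanks @joshreback.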
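One way to check what a (3, 6) result from `a.blocks[0]` actually contains is to inspect its chunk structure. The sketch below uses a stand-in 6x6 array built with `np.arange` (an assumption for illustration, not the data from the question), and the names `row` and `block_shapes` are hypothetical.

```python
import numpy as np
import dask.array as da

# Stand-in 6x6 array, chunked the same way as in the question.
a = da.from_array(np.arange(36).reshape(6, 6), chunks=3)

# A single index into .blocks returns a row of blocks, which is itself
# a dask array still split into two 3x3 chunks along the second axis.
row = a.blocks[0]
print(row.shape)    # (3, 6)
print(row.chunks)   # ((3,), (3, 3))

# Visit every block by iterating over all two-dimensional block indices.
block_shapes = {
    idx: a.blocks[idx].shape for idx in np.ndindex(*a.numblocks)
}
print(block_shapes)
```

Under these assumptions, the array holds four 3x3 chunks; the two (3, 6) arrays appear only because the block view was indexed along one axis.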