Python 2.7: Using h5py to add data to an existing h5py file along a new axis


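I have some sample code that generates 3D NumPy arrays, and I then save this data to an HDF5 file using h5py. How can I then "append" a second dataset along a fourth dimension? Or, how can I write another 3D dataset along the fourth dimension (or a new axis) of an existing .h5 file? I have read the documentation I could find, and none of the examples seem to address this. My code looks like this: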

import h5py
import numpy as np

dataset1 = np.random.rand(240, 240, 250)
dataset2 = np.random.rand(240, 240, 250)

with h5py.File('data.h5', 'w') as hf:
    dset = hf.create_dataset('dataset_1', data=dataset1)

I experimented a bit:

In [504]: import h5py
In [505]: f=h5py.File('data.h5','w')
In [506]: data=np.ones((3,5))

Make a plain dataset:

In [509]: dset=f.create_dataset('dset', data=data)
In [510]: dset.shape
Out[510]: (3, 5)
In [511]: dset.maxshape
Out[511]: (3, 5)

The help for resize:

In [512]: dset.resize?
Signature: dset.resize(size, axis=None)
Docstring:
Resize the dataset, or the specified axis.

The dataset must be stored in chunked format; it can be resized up to
the "maximum shape" (keyword maxshape) specified at creation time.
The rank of the dataset cannot be changed.

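Because I did not specify maxshape, it looks like I cannot resize or add to this dataset. Trying to give it a maxshape of a different rank fails: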
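The difference between a fixed and a resizable dataset can be checked directly. The following is a minimal sketch, not part of the original session: it assumes an in-memory file (driver='core' with backing_store=False) so nothing is written to disk, and the names 'fixed' and 'growable' are purely illustrative.

```python
import h5py
import numpy as np

data = np.ones((3, 5))

# Assumption: an in-memory file is used only to avoid touching the disk.
with h5py.File('scratch.h5', 'w', driver='core', backing_store=False) as f:
    fixed = f.create_dataset('fixed', data=data)
    growable = f.create_dataset('growable', data=data, maxshape=(None, 5))

    # Without maxshape the dataset is stored contiguously: chunks is None
    # and maxshape equals the current shape, so it cannot be resized.
    fixed_info = (fixed.chunks, fixed.maxshape)

    # With maxshape given, h5py turns on chunked storage automatically.
    growable_is_chunked = growable.chunks is not None
    growable_maxshape = growable.maxshape

    # Grow the first axis from 3 to 6 rows and fill the new rows.
    growable.resize((6, 5))
    growable[3:] = 2 * data
    result = growable[:]
```

Inspecting chunks and maxshape on a dataset in an existing file is a quick way to tell whether it can be extended in place.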

In [513]: dset1=f.create_dataset('dset1', data=data, maxshape=(2,10,10))
...
ValueError: "maxshape" must have same rank as dataset shape

So I cannot define a 3D "space" and put a 2D array in it, at least not this way.

But I can add a dimension (rank) to data:
In [514]: dset1=f.create_dataset('dset1', data=data[None,...], maxshape=(2,10,10))
In [515]: dset1
Out[515]: <HDF5 dataset "dset1": shape (1, 3, 5), type "<f8">

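So you can put your two arrays in one h5 dataset, provided you specify a large enough maxshape, e.g. (2, 240, 240, 250) or (240, 240, 500) or (240, 240, 250, 2), etc. For unlimited resizing along the first axis, use maxshape=(None, 240, 240, 250).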
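Applied to the arrays from the question, the approach above might look like the sketch below. It is an illustration under stated assumptions rather than the answer's own code: the arrays are shrunk to (4, 4, 5) to keep it quick, the file is in-memory (driver='core', backing_store=False), and the dataset name 'stacked' is hypothetical.

```python
import h5py
import numpy as np

# Small stand-ins for the (240, 240, 250) arrays in the question.
dataset1 = np.random.rand(4, 4, 5)
dataset2 = np.random.rand(4, 4, 5)

# Assumption: in-memory file so the sketch leaves nothing on disk.
with h5py.File('stack_demo.h5', 'w', driver='core', backing_store=False) as hf:
    # Store the first array with an extra leading axis of length 1 and
    # leave that axis unlimited (None) so more arrays can be added later.
    dset = hf.create_dataset(
        'stacked',
        data=dataset1[None, ...],
        maxshape=(None,) + dataset1.shape,
    )

    # Later: grow the leading axis by one and write the second array
    # into the newly created slot.
    n = dset.shape[0]
    dset.resize(n + 1, axis=0)
    dset[n] = dataset2

    final_shape = dset.shape
    stored = dset[:]
```

For a file that already exists on disk, opening it with mode 'a' or 'r+' instead of 'w' keeps its contents, but the resize only works if the dataset was created with a suitable maxshape in the first place. To grow along the last axis instead, the same idea applies with data=dataset1[..., None] and maxshape=dataset1.shape + (None,).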

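It looks like the main constraint is that you cannot add a dimension after the dataset has been created.
Another approach is to combine the arrays before storing them, e.g.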
dataset12 = np.stack((dataset1, dataset2), axis=0)
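
As a rough sketch of this alternative, the stacked array can be written as a single 4D dataset. The small array shapes, the in-memory file, and the dataset name 'dataset_12' are assumptions made for illustration.

```python
import h5py
import numpy as np

# Small stand-ins for the (240, 240, 250) arrays in the question.
dataset1 = np.random.rand(4, 4, 5)
dataset2 = np.random.rand(4, 4, 5)

# axis=0 puts the new axis first: shape (2, 4, 4, 5).
dataset12 = np.stack((dataset1, dataset2), axis=0)
# axis=-1 puts the new axis last: shape (4, 4, 5, 2).
dataset12_last = np.stack((dataset1, dataset2), axis=-1)

# Assumption: in-memory file so the sketch leaves nothing on disk.
with h5py.File('stacked_demo.h5', 'w', driver='core', backing_store=False) as hf:
    dset = hf.create_dataset('dataset_12', data=dataset12)
    stored_shape = dset.shape
```

This requires both arrays to be in memory at the same time, which the resizable-dataset approach avoids.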

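Once dset1 has been resized to shape (2, 3, 10), which its maxshape of (2, 10, 10) allows, the new space can be written to:

In [521]: dset1[1,:,:]=10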
In [523]: dset1[0,:,5:]=2

In [524]: dset1[:]
Out[524]: 
array([[[  1.,   1.,   1.,   1.,   1.,   2.,   2.,   2.,   2.,   2.],
        [  1.,   1.,   1.,   1.,   1.,   2.,   2.,   2.,   2.,   2.],
        [  1.,   1.,   1.,   1.,   1.,   2.,   2.,   2.,   2.,   2.]],

       [[ 10.,  10.,  10.,  10.,  10.,  10.,  10.,  10.,  10.,  10.],
        [ 10.,  10.,  10.,  10.,  10.,  10.,  10.,  10.,  10.,  10.],
        [ 10.,  10.,  10.,  10.,  10.,  10.,  10.,  10.,  10.,  10.]]])
