Python TensorFlow: How do I batch a dataset constructed from numpy arrays?


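I am trying to understand the behavior of `Dataset.batch`. Below is the code I use to try to set up an iterator over batched data from a `Dataset` built from `numpy` arrays: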

    ## experiment with a numpy dataset
    sample_size = 100000
    ncols = 15
    batch_size = 1000
    xarr = np.ones([sample_size, ncols]) * [i for i in range(ncols)]
    xarr = xarr + np.random.normal(scale = 0.5, size = xarr.shape)
    yarr = np.sum(xarr, axis = 1)
    self.x_placeholder = tf.placeholder(xarr.dtype, [None, ncols])
    self.y_placeholder = tf.placeholder(yarr.dtype, [None, 1])

    dataset = tf.data.Dataset.from_tensor_slices((self.x_placeholder, self.y_placeholder))
    dataset.batch(batch_size)
    self.iterator  = dataset.make_initializable_iterator()

    X, y  = self.iterator.get_next()

However, when I inspect the shapes of `X` and `y`, they are:

(Pdb) X.shape
TensorShape([Dimension(15)])
(Pdb) y.shape
TensorShape([Dimension(1)])

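This confuses me, because my batch size does not seem to be taken into account. It also causes problems downstream when building the model, since I expect `X` and `y` to have two dimensions, the first being the number of examples in the batch.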

Question: why is the iterator's output one-dimensional, and how should I batch correctly?

Here is what I have tried:

The `shape` of `X` and `y` is the same whether or not I apply the `batch` function to the dataset.


Changing the shape passed to the placeholders (for example, replacing `None` with `batch_size`) does not change the behavior either.


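Any suggestions or corrections are appreciated.

To make the batch size take effect, change

    dataset.batch(batch_size)

to

    dataset = dataset.batch(batch_size)

`Dataset.batch` does not modify the dataset in place. It returns a new, batched dataset, so the result has to be assigned back before the iterator is created. Without the assignment, the iterator is built from the original unbatched dataset, whose elements are single rows of shape `(15,)` and `(1,)`.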
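
For reference, here is a minimal standalone sketch of the question's pipeline with the assignment fixed. It is not the asker's original class: the `self.` attributes are replaced by plain variables, the `tf.compat.v1` aliases are used on the assumption that the code may run under TensorFlow 2.x as well as late 1.x releases, and the sample size is reduced to keep it quick. One further assumption is labeled in the code: `yarr` is built with `keepdims=True` so that its shape `(sample_size, 1)` matches the `[None, 1]` placeholder, since a one-dimensional `yarr` cannot be fed into that placeholder.

```python
import numpy as np
import tensorflow as tf

# Assumption: use the TF1-style graph API through tf.compat.v1 so the
# placeholder/iterator code from the question also runs under TF 2.x.
tf1 = tf.compat.v1
tf1.disable_eager_execution()

sample_size = 10000  # smaller than in the question, only to keep the demo fast
ncols = 15
batch_size = 1000

xarr = np.ones([sample_size, ncols]) * np.arange(ncols)
xarr = xarr + np.random.normal(scale=0.5, size=xarr.shape)
# Assumption: keepdims=True gives shape (sample_size, 1), matching the
# [None, 1] placeholder below.
yarr = np.sum(xarr, axis=1, keepdims=True)

x_placeholder = tf1.placeholder(xarr.dtype, [None, ncols])
y_placeholder = tf1.placeholder(yarr.dtype, [None, 1])

dataset = tf.data.Dataset.from_tensor_slices((x_placeholder, y_placeholder))
# batch() returns a new dataset, so the result must be assigned back.
dataset = dataset.batch(batch_size)

iterator = tf1.data.make_initializable_iterator(dataset)
X, y = iterator.get_next()

# Static shapes now have a leading batch dimension (unknown until run time).
static_x_shape = X.shape.as_list()
static_y_shape = y.shape.as_list()

with tf1.Session() as sess:
    sess.run(iterator.initializer,
             feed_dict={x_placeholder: xarr, y_placeholder: yarr})
    x_batch, y_batch = sess.run([X, y])

print(static_x_shape, static_y_shape)
print(x_batch.shape, y_batch.shape)
```

The static shape reports the batch dimension as unknown because the last batch may be smaller than `batch_size` when the sample count is not an exact multiple of it. The arrays actually returned by `sess.run` carry the concrete batch size.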