Python: TensorFlow Estimator graph size limit for high-dimensional input

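I think all of my training data is being stored in the graph, which hits the 2 GB limit. How can I use feed_dict with the Estimator API? FYI, I am using the TensorFlow Estimator API to train my model.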

Input function:

import tensorflow as tf

def input_fn(X_train, epochs, batch_size):
    '''X_train is a SciPy sparse matrix with a large input dimension (200000) and 600000 rows.'''
    # convert_sparse_matrix_to_sparse_tensor and shuffle_to_batch are not shown in the question.
    X_train_tf = tf.data.Dataset.from_tensor_slices(
        convert_sparse_matrix_to_sparse_tensor(X_train, tf.float32))
    # Shuffle with a buffer of shuffle_to_batch * batch_size elements, repeated for `epochs` epochs.
    X_train_tf = X_train_tf.apply(
        tf.data.experimental.shuffle_and_repeat(shuffle_to_batch * batch_size, epochs))
    X_train_tf = X_train_tf.batch(batch_size).prefetch(2)
    return X_train_tf

Error:

Traceback (most recent call last):
  File "/tmp/apprunner/.working/runtime/app/ae_python_tf.py", line 259, in <module>
    AE_Regressor.train(lambda: input_fn(X_train, epoch, batch_size), hooks=[time_hist, logging_hook])
  File "/tmp/apprunner/.working/runtime/env/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 354, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/tmp/apprunner/.working/runtime/env/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 1205, in _train_model
    return self._train_model_distributed(input_fn, hooks, saving_listeners)
  File "/tmp/apprunner/.working/runtime/env/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 1352, in _train_model_distributed
    saving_listeners)
  File "/tmp/apprunner/.working/runtime/env/lib/python3.5/site-packages/tensorflow/python/estimator/estimator.py", line 1468, in _train_with_estimator_spec
    log_step_count_steps=log_step_count_steps) as mon_sess:
  File "/tmp/apprunner/.working/runtime/env/lib/python3.5/site-packages/tensorflow/python/training/monitored_session.py", line 504, in MonitoredTrainingSession
    stop_grace_period_secs=stop_grace_period_secs)
  File "/tmp/apprunner/.working/runtime/env/lib/python3.5/site-packages/tensorflow/python/training/monitored_session.py", line 921, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/tmp/apprunner/.working/runtime/env/lib/python3.5/site-packages/tensorflow/python/training/monitored_session.py", line 631, in __init__
    h.begin()
  File "/tmp/apprunner/.working/runtime/env/lib/python3.5/site-packages/tensorflow/python/training/basic_session_run_hooks.py", line 543, in begin
    self._summary_writer = SummaryWriterCache.get(self._checkpoint_dir)
  File "/tmp/apprunner/.working/runtime/env/lib/python3.5/site-packages/tensorflow/python/summary/writer/writer_cache.py", line 63, in get
    logdir, graph=ops.get_default_graph())
  File "/tmp/apprunner/.working/runtime/env/lib/python3.5/site-packages/tensorflow/python/summary/writer/writer.py", line 367, in __init__
    super(FileWriter, self).__init__(event_writer, graph, graph_def)
  File "/tmp/apprunner/.working/runtime/env/lib/python3.5/site-packages/tensorflow/python/summary/writer/writer.py", line 83, in __init__
    self.add_graph(graph=graph, graph_def=graph_def)
  File "/tmp/apprunner/.working/runtime/env/lib/python3.5/site-packages/tensorflow/python/summary/writer/writer.py", line 193, in add_graph
    true_graph_def = graph.as_graph_def(add_shapes=True)
  File "/tmp/apprunner/.working/runtime/env/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3124, in as_graph_def
    result, _ = self._as_graph_def(from_version, add_shapes)
  File "/tmp/apprunner/.working/runtime/env/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3082, in _as_graph_def
    c_api.TF_GraphToGraphDef(self._c_graph, buf)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot serialize protocol buffer of type tensorflow.GraphDef as the serialized size (2838040852bytes) would be larger than the limit (2147483647 bytes)
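The two numbers in the last line of the traceback can be checked with plain arithmetic. The sketch below is only a rough estimate: it assumes the sparse matrix ends up in the graph as COO-style constants, with one int64 index per dimension and one float32 value per non-zero entry, and it ignores protobuf framing overhead. The function names are hypothetical and not part of TensorFlow.

```python
# Largest size a single protocol buffer message can have: 2**31 - 1 bytes.
# This is the "limit" reported in the error message.
PROTOBUF_LIMIT_BYTES = 2**31 - 1


def sparse_constant_bytes(nnz, ndims=2, index_bytes=8, value_bytes=4):
    """Rough payload of a sparse tensor embedded in the graph as constants.

    Assumption: each non-zero entry costs `ndims` int64 indices plus one
    float32 value.
    """
    return nnz * (ndims * index_bytes + value_bytes)


def max_nnz_under_limit(limit=PROTOBUF_LIMIT_BYTES):
    """Largest number of non-zero entries whose payload still fits the limit."""
    return limit // sparse_constant_bytes(1)


reported_size = 2838040852  # serialized GraphDef size from the error message
over_limit_by = reported_size - PROTOBUF_LIMIT_BYTES
nnz_budget = max_nnz_under_limit()
nnz_per_row_budget = nnz_budget // 600000  # 600000 rows, as in the question

print("exceeds limit:", reported_size > PROTOBUF_LIMIT_BYTES)
print("bytes over the limit:", over_limit_by)
print("non-zero entries that would fit:", nnz_budget)
print("average non-zeros per row that would fit:", nnz_per_row_budget)
```

Under these assumptions, a matrix with 600000 rows crosses the limit once it averages more than a couple of hundred non-zero entries per row, before counting anything else in the graph.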

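I'm usually against quoting documentation verbatim, but this is explained word for word in the documentation, and I can't find a better way to put it than they already do: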

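Note that [using Dataset.from_tensor_slices() on features and labels numpy arrays] will embed the features and labels arrays in your TensorFlow graph as tf.constant() operations. This works well for a small dataset, but wastes memory, because the contents of the array will be copied multiple times, and can run into the 2GB limit for the tf.GraphDef protocol buffer.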

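As an alternative, you can define the Dataset in terms of tf.placeholder() tensors, and feed the NumPy arrays when you initialize an Iterator over the dataset.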

(Code and text are both taken from the link above; an assert that was irrelevant to the question was removed from the code.)
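A minimal sketch of the placeholder pattern described in the quote, written against `tf.compat.v1` so that it also runs in graph mode under TensorFlow 2.x. It is not the documentation's own snippet: the array contents and sizes are made-up stand-ins, and dense arrays are used instead of the sparse matrix from the question to keep the example short.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x graph-mode API, also shipped with TF 2.x

# Stand-in data (hypothetical): 2000 rows x 500 float32 features, about 4 MB.
features = np.ones((2000, 500), dtype=np.float32)
labels = np.arange(2000, dtype=np.int64)

graph = tf.Graph()
with graph.as_default():
    # Placeholders describe dtype and shape only; no data enters the graph.
    features_ph = tf1.placeholder(features.dtype, features.shape)
    labels_ph = tf1.placeholder(labels.dtype, labels.shape)

    dataset = tf1.data.Dataset.from_tensor_slices((features_ph, labels_ph))
    dataset = dataset.batch(4)
    iterator = tf1.data.make_initializable_iterator(dataset)
    next_batch = iterator.get_next()

# Serialized graph size; the arrays are not part of it.
graph_bytes = graph.as_graph_def().ByteSize()

with tf1.Session(graph=graph) as sess:
    # The arrays are fed once, when the iterator is initialized.
    sess.run(iterator.initializer,
             feed_dict={features_ph: features, labels_ph: labels})
    batch_features, batch_labels = sess.run(next_batch)

print("graph bytes:", graph_bytes, "data bytes:", features.nbytes)
print("first batch labels:", batch_labels)
```

Because the data reaches the dataset through `feed_dict` rather than through `tf.constant()` operations, the serialized graph stays far smaller than the arrays themselves.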

Update: if you are trying to use this with the Estimator API, you are out of luck. On the same linked page, a few sections above the part quoted earlier:

Note: Currently, one-shot iterators are the only type that is easily usable with an Estimator.

As you pointed out in the comments, this is because the Estimator API hides the sess.run() calls, which is where you would need to pass the feed_dict for the iterator.

If you are using an Estimator, you can do this with a SessionRunHook.
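A minimal sketch of that idea, assuming the `tf.compat.v1` API. The names `IteratorInitializerHook` and `make_input_fn` are hypothetical, and the tiny dense arrays are stand-ins for real data. The hook runs the iterator's initializer with a `feed_dict` as soon as the session has been created. To keep the example independent of any particular model, it is driven here by `MonitoredSession`, the same session machinery that appears in the traceback above, rather than by an Estimator.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x graph-mode API, also shipped with TF 2.x


class IteratorInitializerHook(tf1.train.SessionRunHook):
    """Runs the dataset iterator's initializer once the session exists."""

    def __init__(self):
        super().__init__()
        self.initializer_fn = None  # filled in when input_fn builds the graph

    def after_create_session(self, session, coord):
        # Called by the monitored session right after session creation.
        self.initializer_fn(session)


def make_input_fn(features, labels, batch_size):
    """Returns an input_fn plus the hook that feeds the arrays to it."""
    hook = IteratorInitializerHook()

    def input_fn():
        features_ph = tf1.placeholder(features.dtype, features.shape)
        labels_ph = tf1.placeholder(labels.dtype, labels.shape)
        dataset = tf1.data.Dataset.from_tensor_slices((features_ph, labels_ph))
        dataset = dataset.batch(batch_size)
        iterator = tf1.data.make_initializable_iterator(dataset)
        # Defer the feed until a session is available.
        hook.initializer_fn = lambda sess: sess.run(
            iterator.initializer,
            feed_dict={features_ph: features, labels_ph: labels})
        return iterator.get_next()

    return input_fn, hook


# Stand-in data (hypothetical).
features = np.arange(12, dtype=np.float32).reshape(6, 2)
labels = np.arange(6, dtype=np.int64)
input_fn, init_hook = make_input_fn(features, labels, batch_size=4)

with tf.Graph().as_default():
    next_batch = input_fn()
    with tf1.train.MonitoredSession(hooks=[init_hook]) as sess:
        batch_features, batch_labels = sess.run(next_batch)

print(batch_labels)
```

With an Estimator, the same pair would be passed to `train()` as `input_fn=input_fn` and `hooks=[init_hook]`, so the Estimator calls `input_fn` inside its own graph and the hook supplies the data when the session starts.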

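Can someone tag the TensorFlow developers? I don't have enough reputation to add them. I think this is a good question for large-scale TensorFlow training.

Thanks, GPhilo. I didn't mention in my question that I have been using the TensorFlow Estimator API. I did want to do this, but the Estimator API has a train function that runs the training and does not take a feed_dict as input. The sess.run(...) call happens inside the Estimator's train function. Does anyone know how to use feed_dict with the TensorFlow Estimator API?

That is information you should add to the question ;) See my update.

They say "easily usable"; I wonder what "easily" means to them. I don't know whether there is a way to make it work, but my gut feeling is that there currently isn't.

I also tried one-shot iterators, but they also store the input when training with an Estimator.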