Huge RAM cost when running TensorFlow in C++

I am currently writing inference code on a GPU machine, using a trained TensorFlow graph through the C++ API.

c++ tensorflow

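Here is my setup:

Platform: CentOS 7
TensorFlow version: TensorFlow 1.5
CUDA version: CUDA 9.0
C++ version: C++11

There are a few issues I am trying to resolve.

1) First, I followed a tutorial to learn a basic graph-loading template in C++. The example in that tutorial is very simple, but when I run the program (on the GPU machine), it takes almost 0.9 GB of RAM.

2) My graph is much more complicated than the one in that tutorial. It has about 20 layers, and the number of nodes per layer ranges from 300 to 5000.

Here is my (pseudo)code snippet. For simplicity, I have kept only the code that (potentially) causes the memory problem: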

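#include <algorithm>
#include <string>
#include <utility>
#include <vector>

#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/public/session.h"

using TensorDict = std::vector<std::pair<std::string, tensorflow::Tensor>>;

// Inference helper, declared here because it is used before its definition.
tensorflow::Status inference(tensorflow::Tensor& input_tensors, tensorflow::Tensor& out);

tensorflow::Tensor input = getDataFromSomewhere(...);
// Number of rows along the first (batch) dimension, which is what Slice() cuts along.
int length = static_cast<int>(input.dim_size(0));
int g_batch_size = 50;

// 1) Create session... (`session` below is the tensorflow::Session* created here)
// 2) Load graph...

// 3) Inference
for (int x = 0; x < length; x += g_batch_size) {

    tensorflow::Tensor out;
    // Rows [x, min(x + g_batch_size, length)); the last batch may be shorter.
    auto cur_slice = input.Slice(x, std::min(x + g_batch_size, length));

    inference(cur_slice, out);

    // doSomethingWithOutput(out);
}

// 4) Close session and free session memory


// Inference helper function
tensorflow::Status inference(tensorflow::Tensor& input_tensors, tensorflow::Tensor& out) {

    // This line increases a lot more memory usage
    TensorDict feed_dict = {{"IteratorGetNext:0", input_tensors}};
    std::vector<tensorflow::Tensor> outputs;

    tensorflow::Status status = session->Run(feed_dict, {"final_dense:0"}, {}, &outputs);
    // Propagate a failed Run() instead of silently reporting success.
    if (!status.ok()) {
        return status;
    }

    // UpdateOutWithOutputs();

    return tensorflow::Status::OK();
}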

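After creating the session and loading the graph, the memory cost is about 1.2 GB.

Then, as I pointed out in the code, when the program reaches

session->Run(...)

the memory usage rises to more than 2 GB.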
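
To pin down where the increase happens, it may help to log the process's resident memory at each stage rather than watching an external monitor. The following is a minimal sketch that assumes a Linux host (as with CentOS 7) and reads the `VmRSS` line from `/proc/self/status`. The function names `ParseVmRssKb` and `CurrentVmRssKb` are hypothetical helpers introduced for illustration; they are not part of TensorFlow.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: parses the "VmRSS:" line of a /proc/<pid>/status-style
// stream and returns the resident set size in kB, or -1 if it is not found.
long ParseVmRssKb(std::istream& status) {
  std::string line;
  while (std::getline(status, line)) {
    if (line.compare(0, 6, "VmRSS:") == 0) {
      std::istringstream fields(line.substr(6));
      long kb = 0;
      if (fields >> kb) {
        return kb;
      }
      return -1;  // line present but value unreadable
    }
  }
  return -1;
}

// Hypothetical helper: resident set size of the current process (Linux only).
long CurrentVmRssKb() {
  std::ifstream status("/proc/self/status");
  return ParseVmRssKb(status);
}
```

Calling `CurrentVmRssKb()` at program start, after session creation, after graph loading, and after the first `session->Run(...)` would give one number per stage, which makes it easier to tell library-loading cost apart from per-run allocations.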

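I am not sure whether this is normal behavior for TensorFlow. I have checked related threads, but I cannot tell whether I am creating redundant ops in my code.

Any comments or suggestions would be much appreciated! Thanks in advance.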

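What I found is that the TensorFlow dynamic library takes up 200 MB of memory and the CUDA dynamic library takes up 500 MB. Loading these libraries alone therefore already consumes a large amount of memory (roughly 700 MB combined).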
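
One way to check figures like these on your own machine is to sum the resident size of the mappings that belong to a given shared library. The sketch below assumes Linux and parses `/proc/self/smaps`. `SumRssKbForPath` and `LoadedLibraryRssKb` are hypothetical helper names, and the substrings you pass (for example `"libtensorflow"` or `"libcu"`) are assumptions about how the libraries are named on your system. It only counts file-backed mappings whose path matches, so heap memory that a library allocates at load time is not included.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: sums the "Rss:" values (in kB) of every mapping in a
// /proc/<pid>/smaps-style stream whose backing file path contains `needle`.
long SumRssKbForPath(std::istream& smaps, const std::string& needle) {
  long total_kb = 0;
  bool in_matching_mapping = false;
  std::string line;
  while (std::getline(smaps, line)) {
    std::istringstream fields(line);
    std::string first;
    if (!(fields >> first)) continue;  // skip blank lines
    if (first.back() == ':') {
      // Attribute line such as "Rss:  128 kB" for the most recent header.
      long kb = 0;
      if (in_matching_mapping && first == "Rss:" && (fields >> kb)) {
        total_kb += kb;
      }
    } else {
      // Mapping header: "start-end perms offset dev inode [path]".
      std::string perms, offset, dev, inode, path;
      fields >> perms >> offset >> dev >> inode;
      std::getline(fields, path);  // remainder of the line; empty if anonymous
      in_matching_mapping = path.find(needle) != std::string::npos;
    }
  }
  return total_kb;
}

// Hypothetical helper: the same sum for the current process (Linux only).
long LoadedLibraryRssKb(const std::string& needle) {
  std::ifstream smaps("/proc/self/smaps");
  return SumRssKbForPath(smaps, needle);
}
```

Comparing `LoadedLibraryRssKb("libtensorflow")` before and after the session is created would show how much of the baseline comes from the library itself rather than from the graph.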