如何并行化TensorFlow中的子图执行?
我试图提高TensorFlow中的GPU利用率,但我发现子图的执行并没有并行化。 以下是工作示例(tensorflow版本r.012): 首先,我们创建两个网络:如何并行化TensorFlow中的子图执行?,tensorflow,Tensorflow,我试图提高TensorFlow中的GPU利用率,但我发现子图的执行并没有并行化。 以下是工作示例(tensorflow版本r.012): 首先,我们创建两个网络: #specify two networks with random inputs as data with tf.device('/gpu:0'): # first network with tf.variable_scope('net1'): tf_data1 = tf.random_normal(s
#specify two networks with random inputs as data
with tf.device('/gpu:0'):
# first network
with tf.variable_scope('net1'):
tf_data1 = tf.random_normal(shape=[batch_size, input_dim])
w1 = tf.get_variable('w1', shape=[input_dim, num_hidden], dtype=tf.float32)
b1 = tf.get_variable('b1', shape=[num_hidden], dtype=tf.float32)
l1 = tf.add(tf.matmul(tf_data1, w1), b1)
w2 = tf.get_variable('w2', shape=[num_hidden, output_dim], dtype=tf.float32)
result1 = tf.matmul(l1, w2)
# second network
with tf.variable_scope('net2'):
tf_data2 = tf.random_normal(shape=[batch_size, input_dim])
w1 = tf.get_variable('w1', shape=[input_dim, num_hidden], dtype=tf.float32)
b1 = tf.get_variable('b1', shape=[num_hidden], dtype=tf.float32)
l1 = tf.add(tf.matmul(tf_data1, w1), b1)
w2 = tf.get_variable('w2', shape=[num_hidden, output_dim], dtype=tf.float32)
result2 = tf.matmul(l1, w2)
这是我们感兴趣的:
#the result that we are interested
out = tf.add(result1, result2)
现在,我们初始化并运行会话:
sess.run(tf.global_variables_initializer()) #initialize variables
# run out operation with trace
run_metadata = tf.RunMetadata()
sess.run(out,
options=tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE),
run_metadata=run_metadata )
# write trace to file
trace = timeline.Timeline(step_stats=run_metadata.step_stats)
trace_file = open('trace.ctf.json', 'w')
trace_file.write(trace.generate_chrome_trace_format())
在图中,我们可以看到以下内容:
- 第一个Matmul用于net1,第二个Matmul用于net2李>
sess.run(tf.global_variables_initializer()) #initialize variables
# run out operation with trace
run_metadata = tf.RunMetadata()
sess.run(out,
options=tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE),
run_metadata=run_metadata )
# write trace to file
trace = timeline.Timeline(step_stats=run_metadata.step_stats)
trace_file = open('trace.ctf.json', 'w')
trace_file.write(trace.generate_chrome_trace_format())