TensorFlow: How to process a single training file in parallel


I have a file, train.csv, that contains the paths to images and their labels, i.e.:

img1.jpg 3
img2.jpg 1
...

After some reading, I came up with code that goes through each image, resizes it, and applies distortions:

def apply_distortions(resized_image):
    # do a bunch of tf.image distortion...
    return float_image

def processing(filename):
    file_contents = tf.read_file(filename)
    image = tf.image.decode_jpeg(file_contents, channels=3)
    resized_image = tf.image.resize_images(image, 299, 299)
    distorted_image = apply_distortions(resized_image)
    return distorted_image

def parse_csv(filename_queue):
    line_reader = tf.TextLineReader()
    key, line = line_reader.read(filename_queue)
    filename, label = tf.decode_csv(line,     # line_batch or line (depending if you want to batch)
                               record_defaults=[tf.constant([],dtype=tf.string),
                                                tf.constant([],dtype=tf.int32)],
                               field_delim=' ')
    processed_image = processing(filename)
    return processed_image, label

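The problem now is that I don't know how to run these operations in parallel across the file. The documentation suggests using either tf.train.batch_join, or tf.train.batch with num_threads=N.

I first tried tf.train.batch_join, following the example code, but that seems to be meant for processing multiple files in parallel. In my case I only have one file: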

filename_queue = tf.train.string_input_producer(["train.txt"], num_epochs=1, shuffle=True)    
example_list = [parse_csv(filename_queue) for _ in range(8)]
example_batch, label_batch = tf.train.batch_join(example_list, batch_size)

I also tried tf.train.batch([example, label], batch_size, num_threads=8), but it is not clear to me whether that is the right thing to do (although I can see more CPU cores being used).

Here is my code for executing the graph:

sess.run(tf.initialize_all_variables())
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess,coord)
try:
    while not coord.should_stop():
        X, Y = sess.run([example_batch, label_batch])
        # Now run a training step
except tf.errors.OutOfRangeError:
    print('Done training -- epoch limit reached')
finally:
    # When done, ask the threads to stop.
    coord.request_stop()
coord.join(threads)
sess.close()

What is the best way to process this file in parallel?

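Both methods appear to be viable. Using batch with num_threads=N creates N copies of your reader op connected to your queue so that they can run in parallel, whereas with batch_join you have to create the copies manually.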

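In your usage of batch_join you are creating multiple copies of TextLineReader which, as you noticed, only parallelize across files. To have multiple threads read a single file, you can instead create one TextLineReader and create multiple line_reader.read ops that use the same reader.

Here is an example using some text files that contain numbers.

Generate the numbers: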
num_files=10
num_entries_per_file=10
file_root="/temp/pipeline"
os.system('mkdir -p '+file_root)
for fi in range(num_files):
  fname = file_root+"/"+str(fi)
  dump_numbers_to_file(fname, fi*num_entries_per_file, (fi+1)*num_entries_per_file)
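
The helper dump_numbers_to_file is not defined in the answer. Below is a minimal sketch of what it might look like, assuming it writes the integers in the half-open range [start, end) one per line, which is consistent with the output shown further down. The generate_files wrapper and the use of a temporary directory are illustrative assumptions, not part of the original.

```python
import os
import tempfile


def dump_numbers_to_file(fname, start, end):
    """Write the integers start..end-1 to fname, one per line (assumed behavior)."""
    with open(fname, "w") as f:
        for n in range(start, end):
            f.write("%d\n" % n)


def generate_files(file_root, num_files=10, num_entries_per_file=10):
    """Create files named 0, 1, ... under file_root and return their paths."""
    os.makedirs(file_root, exist_ok=True)
    paths = []
    for fi in range(num_files):
        fname = os.path.join(file_root, str(fi))
        dump_numbers_to_file(fname,
                             fi * num_entries_per_file,
                             (fi + 1) * num_entries_per_file)
        paths.append(fname)
    return paths


if __name__ == "__main__":
    # A temporary directory stands in for /temp/pipeline here.
    root = tempfile.mkdtemp()
    print(generate_files(root, num_files=2))
```

With this layout, file 0 holds 0 through 9 and file 1 holds 10 through 19.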

Read these numbers in batches of size 2, with a parallelism of 2:
import tensorflow as tf  # TensorFlow 1.x queue-based input API

tf.reset_default_graph()
filename_queue = tf.train.string_input_producer(["/temp/pipeline/0",
                                                 "/temp/pipeline/1"],
                                                shuffle=False)
reader = tf.TextLineReader()
key, value = reader.read(filename_queue)
# Two parse ops that share the same reader; each gets its own enqueue thread.
numeric_val1, = tf.decode_csv(value, record_defaults=[[-1]])
numeric_val2, = tf.decode_csv(value, record_defaults=[[-1]])
numeric_batch = tf.train.batch_join([[numeric_val1], [numeric_val2]], 2)
# Create the session before starting the queue runners and pass it in
# explicitly; otherwise they look for a default session.
sess = tf.Session()
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

print('\n'.join([t.name for t in threads]))
for i in range(20):
  print(sess.run([numeric_batch]))

coord.request_stop()
coord.join(threads)

You might see something like this:
QueueRunner(input_producer:input_producer/input_producer_EnqueueMany)
QueueRunner(input_producer:input_producer/input_producer_Close_1)
QueueRunner(batch_join/fifo_queue:batch_join/fifo_queue_enqueue)
QueueRunner(batch_join/fifo_queue:batch_join/fifo_queue_enqueue_1)
QueueRunner(batch_join/fifo_queue:batch_join/fifo_queue_Close_1)
[array([0, 1], dtype=int32)]
[array([2, 3], dtype=int32)]
[array([4, 5], dtype=int32)]
[array([6, 7], dtype=int32)]
[array([8, 9], dtype=int32)]
[array([10, 11], dtype=int32)]
[array([12, 13], dtype=int32)]
[array([14, 15], dtype=int32)]
[array([16, 17], dtype=int32)]
[array([18, 19], dtype=int32)]

From the list of threads, you can see that two of them correspond to read operations (fifo_queue_enqueue and fifo_queue_enqueue_1), so you can do two reads in parallel.
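
To illustrate the idea of several threads pulling from one shared reader, here is a plain-Python analogy. It is not TensorFlow code and makes no claim about how TensorFlow implements readers internally; SharedLineReader, enqueue_worker, and read_in_parallel are hypothetical names made up for this sketch. A lock serializes each read, so each line is handed to exactly one thread, while the per-line work after the read can overlap.

```python
import queue
import threading


class SharedLineReader:
    """One reader shared by several threads; a lock serializes each read."""

    def __init__(self, lines):
        self._it = iter(lines)
        self._lock = threading.Lock()

    def read(self):
        # Only one thread advances the iterator at a time, so no line is
        # handed out twice. Returns None once the input is exhausted.
        with self._lock:
            return next(self._it, None)


def enqueue_worker(reader, out_queue):
    """Stand-in for one enqueue thread: read a line, parse it, enqueue it."""
    while True:
        line = reader.read()
        if line is None:
            break
        out_queue.put(int(line))  # "parsing" is just int() in this sketch


def read_in_parallel(lines, num_threads=2):
    """Drain `lines` with num_threads workers sharing a single reader."""
    reader = SharedLineReader(lines)
    out_queue = queue.Queue()
    workers = [threading.Thread(target=enqueue_worker, args=(reader, out_queue))
               for _ in range(num_threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # All producers have finished, so qsize() is stable here.
    return [out_queue.get() for _ in range(out_queue.qsize())]


if __name__ == "__main__":
    values = read_in_parallel([str(n) for n in range(20)], num_threads=2)
    # The order can vary between runs, but every line appears exactly once.
    print(sorted(values) == list(range(20)))
```

The order in which values arrive depends on thread scheduling, which is why the check sorts them first.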
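Do you have example code for multiple line_reader.read ops that use the same reader? Do I need to worry about those readers being synchronized, or about the same line possibly being read more than once? Thanks.

What are the implications of using one approach or the other? Creating multiple reader ops takes more code, so using num_threads is more concise. But it is not clear to me whether there is a tradeoff here.

You need to replicate the ops because that is how the QueueRunner interface works: the constructor takes a list of ops and runs each op in its own thread. batch uses QueueRunner and replicates the enqueue op for you, whereas with batch_join you have to replicate the ops manually, but you get more flexibility.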
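
The difference described in that last comment can be sketched in plain Python. This is only an analogy under stated assumptions, not TensorFlow's implementation: MiniQueueRunner, make_enqueue_op, batch_like, batch_join_like, run, and drain are all hypothetical names. The runner takes a list of callables and gives each its own thread; the batch-style helper replicates one callable for you, and the batch_join-style helper accepts a caller-built list, so each thread can do something different.

```python
import queue
import threading


class MiniQueueRunner:
    """Toy runner: holds a list of enqueue callables, one thread per callable."""

    def __init__(self, enqueue_ops):
        self.enqueue_ops = list(enqueue_ops)

    def create_threads(self):
        return [threading.Thread(target=op) for op in self.enqueue_ops]


def make_enqueue_op(source, out_queue, transform):
    """Build a callable that drains `source` into `out_queue` via `transform`."""
    def enqueue():
        while True:
            try:
                item = source.get_nowait()
            except queue.Empty:
                return
            out_queue.put(transform(item))
    return enqueue


def batch_like(source, out_queue, transform, num_threads):
    # Analogue of batch(..., num_threads=N): one op, replicated for the caller.
    op = make_enqueue_op(source, out_queue, transform)
    return MiniQueueRunner([op] * num_threads)


def batch_join_like(source, out_queue, transforms):
    # Analogue of batch_join: the caller supplies one pipeline per thread,
    # so each thread may apply a different transform.
    return MiniQueueRunner([make_enqueue_op(source, out_queue, t)
                            for t in transforms])


def run(runner):
    """Start every thread, wait for all of them, and return the thread count."""
    threads = runner.create_threads()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(threads)


def drain(q):
    """Return everything currently in the queue as a list."""
    items = []
    while not q.empty():
        items.append(q.get())
    return items


if __name__ == "__main__":
    src, out = queue.Queue(), queue.Queue()
    for n in range(10):
        src.put(n)
    run(batch_like(src, out, lambda x: x * 2, num_threads=3))
    print(sorted(drain(out)))
```

The batch-style call is shorter, while the batch_join-style call is the one to reach for when the per-thread pipelines need to differ.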