Process not ending in Python


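I have a script that reads a file that can be very large, and I want to use multiprocessing to process it.

It is a compression algorithm. I want the user to define a buffer size, then start 3 processes: one reads that many lines from the file and passes them to a processing process, which then passes the processed lines to a process that writes them to a new file. I want all of this to happen concurrently, with each process waiting for the next batch of lines.

I already have the script, but when I run it, it never finishes. I think something is wrong with the processes. I suspect it has to do with the islice in my read function, but I don't know how to write it better.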

import multiprocessing as mp
import time
from itertools import islice

def read(from_filename, buffer, process_queue):
  file = open(from_filename, 'r')
  slice = islice(file, buffer)
  while slice:
    to_process = []
    for line in slice:
      to_process.append(line)
    process_queue.put(to_process)
  process_queue.put('kill')

def write(to_filename, write_queue):
  to_file = open(to_filename, 'a+')
  while 1:
    to_write = write_queue.get()
    if to_write == 'kill':
      break
    to_file.write(to_write + '\n')

def compress(process_queue, write_queue):
  while 1:
    to_process = process_queue.get()
    if to_process == 'kill':
      write_queue.put('kill')
      break
    # process, put output in to_write
    write_queue.put(to_write)

def decompress(process_queue, write_queue):
  while 1:
    to_process = process_queue.get()
    if to_process == 'kill':
      write_queue.put('kill')
      break
    # process, put output in to_write
    write_queue.put(to_write)

def main():
  option = raw_input("C for Compress OR D for Decompress: ")
  from_file = raw_input("Enter input filename: ")
  buf = int(raw_input("Enter line buffer: "))
  to_file = raw_input("Enter output filename: ")
  start = time.time()
  write_queue = mp.Queue()
  process_queue = mp.Queue()
  reader = mp.Process(target=read, args=(from_file, buf, process_queue))
  writer = mp.Process(target=write, args=(to_file, write_queue))
  if option == 'c' or option == 'C':
    processor = mp.Process(target=compress, args=(process_queue, write_queue))
  elif option == 'd' or option == 'D':
    processor = mp.Process(target=decompress, args=(process_queue, write_queue))
  else:
    print "Invalid Options..."
  writer.start()
  processor.start()
  reader.start()
  reader.join()
  processor.join()
  writer.join()
  end = time.time()
  elapsed = (end - start)
  print "\n\nTotal Time Elapsed: " + str(elapsed) + " secs"

if __name__=='__main__':
  main()

This is my first attempt at multiprocessing.

When I run it, it never ends. I think one of the processes is getting stuck.

This code is wrong:

def read(from_filename, buffer, process_queue):
  file = open(from_filename, 'r')
  slice = islice(file, buffer)
  while slice:
    to_process = []
    for line in slice:
      to_process.append(line)
    process_queue.put(to_process)
  process_queue.put('kill')

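Since slice is an islice object, it is always truthy, so the condition while slice is always true; it is just like having a while True there. After the first pass the islice is exhausted, so every later pass puts an empty list on the queue and the 'kill' marker is never reached. You should re-create the slice object on every iteration:
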
def read(from_filename, buffer, process_queue):
  file = open(from_filename, 'r')

  while True:
    # build a fresh islice for every chunk of at most `buffer` lines
    slice = islice(file, buffer)
    to_process = []
    for line in slice:
      to_process.append(line)
    if not to_process:
      # input ended: stop without queueing an empty chunk
      break
    process_queue.put(to_process)
  process_queue.put('kill')
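
The truthiness point can be checked in isolation. The following is a minimal sketch, written for Python 3 (unlike the Python 2 code in this thread); read_chunks is a hypothetical helper name and the in-memory io.StringIO stands in for a real file. It shows that even an empty islice evaluates as true, and that building a new islice per chunk and stopping on an empty result ends the loop.

```python
import io
from itertools import islice

def read_chunks(lines, size):
    """Yield lists of at most `size` lines until the input is exhausted."""
    while True:
        # A new islice per chunk; the previous one is already used up.
        chunk = list(islice(lines, size))
        if not chunk:
            break
        yield chunk

# An islice defines neither __bool__ nor __len__, so it is always truthy,
# even when it wraps an empty iterator.
print(bool(islice(iter([]), 3)))  # True

source = io.StringIO("a\nb\nc\nd\ne\n")
for chunk in read_chunks(source, 2):
    print(chunk)
```

The last chunk is simply shorter than the buffer when the line count is not a multiple of it.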

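Alternatively, you can:

import itertools

def read_chunk(file, buffer):
    # izip stops after `buffer` lines or at end of file, so this returns []
    # once the input is exhausted (a readline() loop would instead return a
    # list of empty strings and never match the [] sentinel below)
    return [line for i, line in itertools.izip(xrange(buffer), file)]

def read(from_filename, buffer, process_queue):
  file = open(from_filename, 'r')

  # iter(callable, sentinel) keeps calling read_chunk until it returns []
  for to_process in iter(lambda: read_chunk(file, buffer), []):
    process_queue.put(to_process)
  process_queue.put('kill')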

Note that if you have to build the list anyway, there is not much point in using itertools.islice.
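
For reference, here is one way the whole three-stage pipeline might look once the reader is fixed. This is only a sketch under several assumptions, not the asker's actual algorithm: it targets Python 3, transform is a hypothetical stand-in for the real compression step, SENTINEL (None) replaces the 'kill' string, run_pipeline is an invented helper name, and the writer receives lists of lines rather than single strings.

```python
import multiprocessing as mp
from itertools import islice

SENTINEL = None  # assumed end-of-stream marker, replaces the 'kill' string

def read(from_filename, buffer, process_queue):
    """Queue lists of at most `buffer` lines, then the sentinel."""
    with open(from_filename, 'r') as source:
        while True:
            chunk = list(islice(source, buffer))
            if not chunk:
                break
            process_queue.put(chunk)
    process_queue.put(SENTINEL)

def transform(line):
    # Hypothetical placeholder for the real compression logic.
    return line.upper()

def process(process_queue, write_queue):
    """Transform each chunk and forward it; pass the sentinel downstream."""
    while True:
        chunk = process_queue.get()
        if chunk is SENTINEL:
            write_queue.put(SENTINEL)
            break
        write_queue.put([transform(line) for line in chunk])

def write(to_filename, write_queue):
    """Write chunks to the output file until the sentinel arrives."""
    with open(to_filename, 'w') as target:
        while True:
            chunk = write_queue.get()
            if chunk is SENTINEL:
                break
            target.writelines(chunk)

def run_pipeline(from_filename, to_filename, buffer):
    process_queue = mp.Queue()
    write_queue = mp.Queue()
    stages = [
        mp.Process(target=read, args=(from_filename, buffer, process_queue)),
        mp.Process(target=process, args=(process_queue, write_queue)),
        mp.Process(target=write, args=(to_filename, write_queue)),
    ]
    for stage in stages:
        stage.start()
    for stage in stages:
        stage.join()
```

A call such as run_pipeline('in.txt', 'out.txt', 1000) would go under an if __name__ == '__main__': guard, as in the original script; the file names here are placeholders. Each stage forwards the sentinel exactly once, so every process sees the end of the stream and can exit.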