进程未以python结尾
我有一个脚本来读取一个文件,这个文件可以是10s大小,我想使用多处理来处理它 这是一个压缩算法,我希望用户定义一个缓冲区,然后启动3个进程,一个从文件中读取缓冲区行数,将行传递给处理进程,然后将处理后的行传递给将行写入新文件的进程。我希望所有这些都同时发生,并且每个进程都等待下一批行 我已经有了脚本,但是当我运行它时,它还没有结束。我认为这些过程有问题。我认为这与我的read函数中的islice有关,但我不知道如何更好地编写它进程未以python结尾,python,multiprocessing,Python,Multiprocessing,我有一个脚本来读取一个文件,这个文件可以是10s大小,我想使用多处理来处理它 这是一个压缩算法,我希望用户定义一个缓冲区,然后启动3个进程,一个从文件中读取缓冲区行数,将行传递给处理进程,然后将处理后的行传递给将行写入新文件的进程。我希望所有这些都同时发生,并且每个进程都等待下一批行 我已经有了脚本,但是当我运行它时,它还没有结束。我认为这些过程有问题。我认为这与我的read函数中的islice有关,但我不知道如何更好地编写它 import multiprocessing as mp impor
import multiprocessing as mp
import time
from itertools import islice
def read(from_filename, buffer, process_queue):
file = open(from_filename, 'r')
slice = islice(file, buffer)
while slice:
to_process = []
for line in slice:
to_process.append(line)
process_queue.put(to_process)
process_queue.put('kill')
def write(to_filename, write_queue):
to_file = open(to_filename, 'a+')
while 1:
to_write = write_queue.get()
if to_write == 'kill':
break
to_file.write(to_write + '\n')
def compress(process_queue, write_queue):
while 1:
to_process = process_queue.get()
if to_process == 'kill':
write_queue.put('kill')
break
# process, put output in to_write
write_queue.put(to_write)
def decompress(process_queue, write_queue):
while 1:
to_process = process_queue.get()
if to_process == 'kill':
write_queue.put('kill')
break
# process, put output in to_write
write_queue.put(to_write)
def main():
option = raw_input("C for Compress OR D for Decompress: ")
from_file = raw_input("Enter input filename: ")
buf = int(raw_input("Enter line buffer: "))
to_file = raw_input("Enter output filename: ")
start = time.time()
write_queue = mp.Queue()
process_queue = mp.Queue()
reader = mp.Process(target=read, args=(from_file, buf, process_queue))
writer = mp.Process(target=write, args=(to_file, write_queue))
if option == 'c' or option == 'C':
processor = mp.Process(target=compress, args=(process_queue, write_queue))
elif option == 'd' or option == 'D':
processor = mp.Process(target=decompress, args=(process_queue, write_queue))
else:
print "Invalid Options..."
writer.start()
processor.start()
reader.start()
reader.join()
processor.join()
writer.join()
end = time.time()
elapsed = (end - start)
print "\n\nTotal Time Elapsed: " + str(elapsed) + " secs"
if __name__=='__main__':
main()
这是我第一次尝试多处理。
当我运行它时,它不会结束。我认为某个流程被卡住了。这段代码是错误的:
def read(from_filename, buffer, process_queue):
file = open(from_filename, 'r')
slice = islice(file, buffer)
while slice:
to_process = []
for line in slice:
to_process.append(line)
process_queue.put(to_process)
process_queue.put('kill')
由于slice
是一个islice
对象,因此当slice为真时,条件将始终为真,因此它就像在那里有一个while true
一样。每次都应该重新创建切片对象
def read(from_filename, buffer, process_queue):
file = open(from_filename, 'r')
while True:
slice = islice(file, buffer)
to_process = []
for line in slice:
to_process.append(line)
process_queue.put(to_process)
if not to_process:
# input ended
break
process_queue.put('kill')
或者,您可以:
def read_chunk(file, buffer):
return [file.readline() for _ in xrange(buffer)]
# or, "more" equivalent to using islice
#return [line for i,line in itertools.izip(xrange(buffer), file)]
def read(from_filename, buffer, process_queue):
file = open(from_filename, 'r')
for to_process in iter(lambda: read_chunk(file, buffer), []):
process_queue.put(to_process)
process_queue.put('kill')
请注意,如果必须构建列表,那么使用itertools.islice
是没有意义的