
Python: Popen in the main process vs. in a Process subclass


The following code (run in the main thread) works fine; I grep some files until I find the first 100 results (writing them to a file), then exit:

    from subprocess import Popen, PIPE  # imports implied by the snippet

    command = 'grep -F "%s" %s*.txt' % (search_string, DATA_PATH)

    p = Popen(['/bin/bash', '-c', command], stdout = PIPE)
    f = open(output_file, 'w+')
    num_lines = MAX_RESULTS
    while True:  
        line = p.stdout.readline()
        print num_lines
        if line != '':
            f.write(line)
        num_lines = num_lines - 1
        if num_lines == 0:
            break
        else:
            break
The same code, used inside a Process subclass, always prints this in the console:

    grep: writing output: Broken pipe

    import re
    from multiprocessing import Process
    from subprocess import Popen, PIPE

    class Search(Process):
        def __init__(self, search_id, search_string):
            self.search_id = search_id
            self.search_string = search_string  
            self.grepped = ''
            Process.__init__(self)

        def run(self):
            output_file = TMP_PATH + self.search_id

            # flag if no regex chars
            flag = '-F' if re.match(r"^[a-zA-Z0\ ]*$", self.search_string) else '-P'    

            command = 'grep %s "%s" %s*.txt' % (flag, self.search_string, DATA_PATH)

            p = Popen(['/bin/bash', '-c', command], stdout = PIPE)
            f = open(output_file, 'w+')
            num_lines = MAX_RESULTS
            while True:  
                line = p.stdout.readline()
                print num_lines
                if line != '':
                    f.write(line)
                num_lines = num_lines - 1
                if num_lines == 0:
                    break
                else:
                    break

Why? And how can I fix this?

I can reproduce the error message like this:

import multiprocessing as mp
import subprocess
import shlex

def worker():
    proc = subprocess.Popen(shlex.split('''
        /bin/bash -c "grep -P 'foo' /tmp/test.txt"
        '''), stdout = subprocess.PIPE)
    line = proc.stdout.readline()
    print(line)
    # proc.terminate()   # This fixes the problem

if __name__=='__main__':
    N = 6000
    with open('/tmp/test.txt', 'w') as f:
        f.write('bar foo\n'*N)   # <--- Increasing this number causes grep: writing output: Broken pipe
    p = mp.Process(target = worker)
    p.start()
    p.join()
(grep writes that message to stderr, once for each line it is still trying to process.)

The fix is to terminate the subprocess with `proc.terminate()` before `worker` ends.
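In terms of the reproduction script above, the fix could be sketched like this (the file path and pattern are just the example values; `universal_newlines` is used here so the pipe yields text rather than bytes):

```python
import subprocess

def worker(path, pattern):
    # Spawn grep exactly as in the snippet above.
    proc = subprocess.Popen(
        ['/bin/bash', '-c', 'grep -F %r %s' % (pattern, path)],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    first = proc.stdout.readline()
    # Terminate grep while its stdout pipe is still open, so grep
    # never sees EPIPE ("Broken pipe") when the worker exits.
    proc.terminate()
    proc.wait()  # reap the child to avoid a zombie
    return first
```

The key point is that `terminate()` runs before the worker process exits and its end of the pipe disappears out from under grep.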

Why use grep when Python itself has perfectly good solutions for this? Because I have to search through more than 1.5 GB of data, and grep's speed is something Python can't match.

Looks like the same question as here:

When I had trouble capturing a command's output, increasing the buffer size helped me:

    p = Popen(['/bin/bash', '-c', command], stdout=PIPE, bufsize=256*1024*1024)

If I read from grep in an infinite while loop, why does my process end before grep does?

Your while loop breaks after one iteration.
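Since both branches of that loop end in `break`, it exits on the first pass regardless of `num_lines`. A corrected version that actually reads up to `MAX_RESULTS` lines could look like this (a sketch, assuming Python 3; `grep_first_lines` is a hypothetical helper name, not from the original code):

```python
from subprocess import Popen, PIPE

def grep_first_lines(command, output_file, max_results):
    """Write at most max_results matching lines to output_file."""
    p = Popen(['/bin/bash', '-c', command], stdout=PIPE,
              universal_newlines=True)
    written = 0
    with open(output_file, 'w') as f:
        for line in p.stdout:
            f.write(line)
            written += 1
            if written >= max_results:
                break          # enough results: stop reading
    p.terminate()              # stop grep before the pipe closes
    p.wait()                   # reap the child process
    return written
```

Iterating over `p.stdout` also avoids the `if line != ''` check, since the loop ends naturally when grep's output is exhausted.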