Performance of Python subprocess.check_output vs subprocess.call

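I've been using `subprocess.check_output()` for a while to capture output from subprocesses, but I've run into performance problems under certain circumstances. I'm running this on a RHEL6 machine.

The calling Python environment is Linux-compiled and 64-bit. The subprocess I'm executing is a shell script which eventually launches a Windows python.exe process via Wine (why this foolishness is required is another story). As input to the shell script, I'm piping in a small bit of Python code that gets passed on to python.exe.

While the system is under moderate/heavy load (40 to 70% CPU utilization), I've noticed that using `subprocess.check_output(cmd, shell=True)` can result in a significant delay (up to 45 seconds) between the subprocess finishing execution and the check_output command returning. Looking at the output from `ps -efH` during this time shows the called subprocess as `sh`, until it finally returns with a normal zero exit status.

Conversely, using `subprocess.call(cmd, shell=True)` to run the same command under the same moderate/heavy load causes the subprocess to return immediately with no delay, with all output printed to STDOUT/STDERR (rather than returned from the function call).

Why is there such a significant delay only when `check_output()` is redirecting the STDOUT/STDERR output into its return value, and not when `call()` simply prints it back to the parent's STDOUT/STDERR?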

Let's look at the code. `.check_output` has the following wait:

    def _internal_poll(self, _deadstate=None, _waitpid=os.waitpid,
            _WNOHANG=os.WNOHANG, _os_error=os.error, _ECHILD=errno.ECHILD):
        """Check if child process has terminated.  Returns returncode
        attribute.

        This method is called by __del__, so it cannot reference anything
        outside of the local scope (nor can any methods it calls).

        """
        if self.returncode is None:
            try:
                pid, sts = _waitpid(self.pid, _WNOHANG)
                if pid == self.pid:
                    self._handle_exitstatus(sts)
            except _os_error as e:
                if _deadstate is not None:
                    self.returncode = _deadstate
                if e.errno == _ECHILD:
                    # This happens if SIGCLD is set to be ignored or
                    # waiting for child processes has otherwise been
                    # disabled for our process.  This child is dead, we
                    # can't get the status.
                    # http://bugs.python.org/issue15756
                    self.returncode = 0
        return self.returncode

`.call` waits using the following code:

    def wait(self):
        """Wait for child process to terminate.  Returns returncode
        attribute."""
        while self.returncode is None:
            try:
                pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
            except OSError as e:
                if e.errno != errno.ECHILD:
                    raise
                # This happens if SIGCLD is set to be ignored or waiting
                # for child processes has otherwise been disabled for our
                # process.  This child is dead, we can't get the status.
                pid = self.pid
                sts = 0
            # Check the pid and loop as waitpid has been known to return
            # 0 even without WNOHANG in odd situations.  issue14396.
            if pid == self.pid:
                self._handle_exitstatus(sts)
        return self.returncode

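Notice the bug associated with `_internal_poll`. It is viewable at http://bugs.python.org/issue15756, the issue referenced in the code comment above. It is pretty much the exact issue you're running into.

Edit: another potential difference between `.call` and `.check_output` is that `.check_output` actually cares about stdin and stdout and will try to perform IO against both pipes. If you are running a process that gets itself into a zombie state, it is possible that a read against a pipe in a defunct state is causing the hang you're experiencing.

In most cases zombie states get cleaned up pretty quickly, but they will not be if, for instance, they are interrupted while in a system call (such as read or write). Of course the read/write system call should itself be interrupted as soon as the IO can no longer be performed, but it is possible that you are hitting some sort of race condition where things are getting killed in the wrong order.

The only way I can think of to determine the cause in this case is to either add debugging code to the subprocess file or to invoke the Python debugger and initiate a backtrace when you run into the condition you're experiencing.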
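One way to get the backtrace suggested above without attaching a debugger is to have the parent script dump its own stack when it receives a signal. The following is a minimal sketch, assuming Python 3 on a POSIX system and code running in the main thread. `install_stack_dump` and `pretend_to_hang` are hypothetical names, and the demo triggers itself with a timer signal where a real session would send `kill -USR1 <pid>` from another terminal.

```python
import io
import signal
import time
import traceback


def install_stack_dump(signum=signal.SIGUSR1, stream=None):
    """Print the interrupted Python stack to `stream` when `signum` arrives."""
    def handler(received_signum, frame):
        # stream=None makes traceback write to sys.stderr.
        traceback.print_stack(frame, file=stream)
    signal.signal(signum, handler)


def pretend_to_hang(seconds):
    """Stand-in for the blocking check_output() call under investigation."""
    time.sleep(seconds)


# Self-contained demo: collect the dump in memory and let a timer deliver
# SIGALRM while the main thread is blocked inside pretend_to_hang().
dump = io.StringIO()
install_stack_dump(signal.SIGALRM, stream=dump)
signal.setitimer(signal.ITIMER_REAL, 0.2)
pretend_to_hang(0.5)
print(dump.getvalue())
```

If the dump shows the parent sitting in a read on the child's stdout, that would point at the pipe rather than at the wait on the child's exit status.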

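Reading the docs, both `subprocess.call` and `subprocess.check_output` are use cases of `subprocess.Popen`. One minor difference is that `check_output` will raise a Python error if the subprocess returns a non-zero exit status. The greater difference is emphasized in the bit about `check_output` (my emphasis):

"The full function signature is largely the same as that of the Popen constructor, except that *stdout is not permitted as it is used internally*. All other supplied arguments are passed directly through to the Popen constructor."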
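The two differences described so far can be checked directly. This is a small sketch, assuming Python 3 on a POSIX system; the child command is made up for illustration and simply prints one line and exits with status 3.

```python
import subprocess
import sys

# Illustrative child: prints one line, then exits with status 3.
CHILD = [sys.executable, "-c", "import sys; print('hello'); sys.exit(3)"]

# call() returns the exit status. The child's output would normally go to
# our own stdout; it is discarded here to keep the demo quiet.
status = subprocess.call(CHILD, stdout=subprocess.DEVNULL)

# check_output() captures stdout and raises on a non-zero exit status.
try:
    subprocess.check_output(CHILD)
    captured = None
except subprocess.CalledProcessError as exc:
    captured = (exc.returncode, exc.output)

# Passing stdout= is rejected because check_output sets it to PIPE itself.
try:
    subprocess.check_output(CHILD, stdout=subprocess.PIPE)
    rejected = False
except ValueError:
    rejected = True

print(status, captured, rejected)
```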

So how is `stdout` "used internally"? Let's compare `call` and `check_output`:

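call:

    def call(*popenargs, **kwargs):
        return Popen(*popenargs, **kwargs).wait()

check_output:

    def check_output(*popenargs, **kwargs):
        if 'stdout' in kwargs:
            raise ValueError('stdout argument not allowed, it will be overridden.')
        process = Popen(stdout=PIPE, *popenargs, **kwargs)
        output, unused_err = process.communicate()
        retcode = process.poll()
        if retcode:
            cmd = kwargs.get("args")
            if cmd is None:
                cmd = popenargs[0]
            raise CalledProcessError(retcode, cmd, output=output)
        return output

communicate:

Now we have to look at `Popen.communicate` as well. Doing this, we notice that for one pipe, `communicate` does several things that simply take more time than just returning `Popen().wait()`, as `call` does.

For one thing, `communicate` processes `stdout=PIPE` whether or not you set `shell=True`. Clearly, `call` does not. It just lets your shell spout whatever it likes, making it a security risk.

Secondly, in the case of `check_output(cmd, shell=True)` (just one pipe), whatever your subprocess sends to stdout is processed by a thread in the `_communicate` method. And `Popen` must join the thread (wait on it) before additionally waiting for the subprocess itself to terminate.

Plus, more trivially, it processes `stdout` as a `list` which must then be joined into a string.

In short, even with minimal arguments, `check_output` spends a lot more time in Python processes than `call` does.

Comments:

Well, not exactly... the bug comments indicate that the affected code would hang indefinitely, whereas my code eventually does return after a noticeable delay.

@Claris: a process is a zombie if it has exited but its status has not yet been read (by its parent). In this case `sh` is a zombie because the parent python process hangs on the `p.stdout.read()` call, which may happen if `sh` spawns its own child processes that inherit its stdout. For example, `call('(sleep 5; echo abc) &', shell=True)` should return immediately, but `check_output('(sleep 5; echo abc) &', shell=True)` should return only after 5 seconds.

@greenlaw: have you tried to look at the stack trace when the subprocess hangs, for debugging? Have you tried the same code on a newer Python version or with the `subprocess32` module, to see whether the abnormal delay disappears, i.e., whether there is a bug in the old version?

No, I haven't, because my script requires several packages that are only available for 2.7.x. I have tried to reproduce the problem without my full script, but so far have been unable to. If I can isolate and reproduce the problem without library dependencies, I'll try your suggestion.

`subprocess32` works on Python 2.7 (posix systems), I don't know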
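The zombie explanation given in the comments can be reproduced with a short experiment. This is only a sketch, assuming Python 3 and a POSIX `sh`; it uses a one-second sleep instead of five to keep the run short, and `timed` is a hypothetical helper. Whether the same mechanism is behind the delay in the Wine setup from the question is not confirmed by anything on this page.

```python
import subprocess
import time

# The shell exits at once, but the backgrounded subshell keeps the
# inherited stdout open for about one second.
CMD = "(sleep 1; echo abc) &"


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds)."""
    start = time.monotonic()
    result = func(*args, **kwargs)
    return result, time.monotonic() - start


# call() only waits for the shell itself, which exits immediately.
call_status, call_elapsed = timed(subprocess.call, CMD, shell=True)

# check_output() reads the pipe until EOF, and EOF only arrives once every
# process holding the write end (here the subshell) has exited.
output, check_elapsed = timed(subprocess.check_output, CMD, shell=True)

print("call:         status=%d after %.2fs" % (call_status, call_elapsed))
print("check_output: %r after %.2fs" % (output, check_elapsed))
```

While `check_output` is blocked in that read, the shell has already exited but has not been reaped, which is consistent with the `sh` entry lingering in the `ps` listing.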