Python: Why doesn't subprocess.Popen wait for the child process to terminate?

python mysql

I'm having a problem with Python's subprocess.Popen method.

Here is a test script that demonstrates the problem. It is being run on a Linux machine.

#!/usr/bin/env python
import subprocess
import time

def run(cmd):
  p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
  return p

### START MAIN
# copy some rows from a source table to a destination table
# note that the destination table is empty when this script is run
cmd = 'mysql -u ve --skip-column-names --batch --execute="insert into destination (select * from source limit 100000)" test'
run(cmd)

# check to see how many rows exist in the destination table
cmd = 'mysql -u ve --skip-column-names --batch --execute="select count(*) from destination" test'
process = run(cmd)
count = (int(process.communicate()[0][:-1]))

# if subprocess.Popen() waited for the child to terminate then count should be
# greater than 0
if count > 0:
  print("success: " + str(count))
else:
  print("failure: " + str(count))
  time.sleep(5)

  # find out how many rows exist in the destination table after sleeping
  process = run(cmd)
  count = (int(process.communicate()[0][:-1]))
  print("after sleeping the count is " + str(count))

The output of this script is usually:

success: 100000

but sometimes it is:

failure: 0
after sleeping the count is 100000

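Note that in the failure case, the select run immediately after the insert shows 0 rows, but after sleeping for 5 seconds a second select correctly shows a row count of 100000. My conclusion is that one of the following is true:

1. subprocess.Popen is not waiting for the child process to terminate. This seems to contradict the documentation.
2. The mysql insert is not atomic. My understanding of mysql is that the insert is atomic.
3. The select does not see the correct row count right away. According to a friend who knows mysql better than I do, this should not happen either.

What am I missing?

FYI, I know this is a hacky way to interact with mysql from Python, and MySQLdb would probably not have this problem, but I'm curious why this approach doesn't work.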

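subprocess.Popen runs the program when it is instantiated. It does not, however, wait for it; it starts it in the background, as if you had typed cmd & in a shell. So in the code above you have essentially defined a race condition: if the insert finishes in time, everything appears normal, but if it does not, you get the unexpected output. You are not waiting for the process started by your first run() to finish; you simply return its Popen instance and carry on.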
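The non-blocking behaviour described above can be observed directly. The following is only a minimal sketch: it assumes Python 3 and uses a child Python process that sleeps for one second as a hypothetical stand-in for the slow mysql insert.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for a slow command such as the mysql insert:
# a child Python process that sleeps for one second and then exits.
slow_cmd = [sys.executable, "-c", "import time; time.sleep(1)"]

start = time.time()
child = subprocess.Popen(slow_cmd)   # returns as soon as the child has been started
launch_time = time.time() - start

# poll() returns None while the child is still running.
still_running = child.poll() is None

returncode = child.wait()            # blocks until the child has exited
total_time = time.time() - start

print("Popen returned after %.2f s, child still running: %s" % (launch_time, still_running))
print("wait() returned %d after %.2f s" % (returncode, total_time))
```

Anything the parent does between the Popen call and the wait() call runs concurrently with the child, which is exactly the window in which the select in the question can observe an empty table.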

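I'm not sure how this behaviour contradicts the documentation, because there are some very clearly named methods on Popen that indicate it does not wait, such as: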

Popen.wait()
  Wait for child process to terminate. Set and return returncode attribute.

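I do agree, however, that the documentation for this module could be improved.

To wait for the program to finish, I'd recommend using subprocess's convenience function subprocess.call, or calling communicate on the Popen object when you need stdout. You are already doing the latter for your second call.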

import subprocess

### START MAIN
# copy some rows from a source table to a destination table
# note that the destination table is empty when this script is run
cmd = 'mysql -u ve --skip-column-names --batch --execute="insert into destination (select * from source limit 100000)" test'
# call() waits for the command to finish; shell=True is needed because cmd is a single string
subprocess.call(cmd, shell=True)

# check to see how many rows exist in the destination table
cmd = 'mysql -u ve --skip-column-names --batch --execute="select count(*) from destination" test'
process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
# communicate() reads all of stdout and waits for the child to exit
try:
  count = int(process.communicate()[0].strip())
except ValueError:
  count = 0
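On Python 3.5 and later, subprocess.run combines launching, waiting, and output capture in one call. The sketch below is not part of the original answer; it swaps the mysql command for a hypothetical child process that prints a number, so that it can be run without a database.

```python
import subprocess
import sys

# Hypothetical stand-in for the "select count(*)" command: a child process
# that prints a number followed by a newline, as mysql --batch would.
count_cmd = [sys.executable, "-c", "print(100000)"]

# run() waits for the child to exit before returning (Python 3.5+).
# check=True raises CalledProcessError if the exit status is non-zero.
result = subprocess.run(count_cmd, stdout=subprocess.PIPE, check=True)

# stdout is bytes here; strip() removes the trailing newline.
count = int(result.stdout.strip())
print("count:", count)
```

Using check=True means a failed command surfaces as an exception instead of being silently treated as a count of 0.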

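Additionally, in most cases you do not need to run the command in a shell. This is one of those cases, but you will have to rewrite your command as a sequence. Doing so also avoids traditional shell injection and means fewer worries about quoting, for example: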

prog = ["mysql", "-u", "ve", "--execute", 'insert into foo values ("snargle", 2)']
subprocess.call(prog)

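This will even work, and will not perform the redirection you might expect, because the < and /etc/passwd are passed to printf as literal arguments: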

prog = ["printf", "%s", "<", "/etc/passwd"]
subprocess.call(prog)

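Try it interactively. You avoid the possibility of shell injection, particularly if you are accepting user input. I suspect you are using the less-awesome string method of talking to subprocess because you ran into trouble getting the sequence form to work :^)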
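To make the injection point concrete, here is a small sketch assuming Python 3. The variable user_input and the child program that echoes its first argument are hypothetical, chosen only so the example runs anywhere Python does.

```python
import subprocess
import sys

# Hypothetical untrusted input containing shell metacharacters.
user_input = "x; echo INJECTED"

# Child program that simply prints its first argument.
show_arg = [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input]

# With a list and no shell, the whole string arrives as one literal argument,
# so the "; echo INJECTED" part is never interpreted as a second command.
output = subprocess.check_output(show_arg).decode().strip()
print(output)
```

Had the same string been interpolated into a command line and run with shell=True, the shell would have treated the semicolon as a command separator.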

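Dude, why do you think subprocess.Popen returns an object with a wait method, unless it is because the waiting is NOT implicit, intrinsic, immediate, and inevitable, as you appear to surmise...?! The most common reason to spawn a subprocess is NOT to wait for it to finish immediately, but to let it proceed at the same time as the parent process continues, for example on another core or, at worst, by time-slicing (that is the operating system's and the hardware's lookout). When the parent process needs to wait for the child to finish, it will obviously call wait on the object returned by the original subprocess.Popen call.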
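The pattern described here, start the children first and wait only when the results are needed, might look like the following sketch. It assumes Python 3, and the three sleeping children are a hypothetical workload.

```python
import subprocess
import sys
import time

# Hypothetical workload: three independent children that each sleep one second.
cmd = [sys.executable, "-c", "import time; time.sleep(1)"]

start = time.time()
children = [subprocess.Popen(cmd) for _ in range(3)]   # all three start right away

# The parent is free to do other work here while the children run.

returncodes = [child.wait() for child in children]     # now block until each one exits
elapsed = time.time() - start

print("return codes:", returncodes)
print("elapsed: %.2f s" % elapsed)
```

Because the children run concurrently, the elapsed time should be closer to one second than to three on a machine that is not heavily loaded.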

If you don't absolutely need subprocess and Popen, it is usually simpler to use os.system. For example, for quick scripts I often do something like this:

import os
run = os.system #convenience alias
result = run('mysql -u ve --execute="select * from wherever" test')

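Unlike Popen, os.system does wait for your process to return before moving on to the next stage of your script.

More information is available in the documentation for os.system.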
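Since os.system blocks and then hands back a status, it can help to see what that value contains. This is a sketch under stated assumptions: a Unix system (as in the question), and a hypothetical command that exits with status 3.

```python
import os
import sys

# Hypothetical command that exits with status 3. sys.executable is quoted
# in case the interpreter path contains spaces.
cmd = '"%s" -c "import sys; sys.exit(3)"' % sys.executable

status = os.system(cmd)   # blocks until the command has finished

# On Unix the value is an encoded wait status, not the plain exit code.
# WIFEXITED/WEXITSTATUS (Unix only) decode it.
exited_normally = os.WIFEXITED(status)
exit_code = os.WEXITSTATUS(status) if exited_normally else None
print("raw status:", status, "exit code:", exit_code)
```

On Windows os.system returns the exit code directly, so the decoding step does not apply there.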

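Thanks to all for the great answers. Looking at the subprocess documentation again, I see that I was thrown by the comment "Wait for command to complete", which appears in the convenience functions section rather than the Popen methods section. I've accepted Jed's answer because it best answers my original question, although I think I will use Paul's solution for my future scripting needs.

Keep in mind that os.system will return the return value of the process, usually 0 or 1, unless you do something else with it.

I am using subprocess.call and it also appears not to wait. The next statement tells the code to delete the file it has just run; it gets called before the file has run, and the program crashes.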