Python's subprocess.Popen skips input


I've found that subprocess.Popen() skips input bytes in a specific scenario. To demonstrate the problem, I wrote the following (absurd) program:

This program skips the specified number of input bytes, then shells out to wc to count the remaining bytes.
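A minimal Python 3 sketch of a program fitting this description follows. It is an assumption, not the original listing: the helper name count_after_skip and its cmd parameter are hypothetical, and the child's output is captured so the count can be returned rather than printed by wc itself.

```python
import subprocess
import sys


def count_after_skip(fin, skip, cmd=("wc", "-c")):
    """Skip `skip` bytes of `fin`, then let `cmd` count what is left.

    Hypothetical helper: `fin` must be a binary file object backed by a
    real file descriptor, because the child process inherits that descriptor.
    """
    # A buffered read may take far more than `skip` bytes from the descriptor.
    fin.read(skip)
    # The child reads from the descriptor, not from Python's in-process buffer.
    result = subprocess.run(list(cmd), stdin=fin, stdout=subprocess.PIPE, check=True)
    return int(result.stdout)
```

Calling count_after_skip(sys.stdin.buffer, int(sys.argv[1])) from a script would approximate the command-line usage in this question.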

Now, try the program using dd to generate the input:

# skipping 0, everything works fine:
$ dd if=/dev/zero bs=1 count=100 2>/dev/null | python wc.py 0
100

$ # but skipping more than 0 yields an unexpected result.
$ # this should return 99:
$ dd if=/dev/zero bs=1 count=100 2>/dev/null | python wc.py 1
0

$ # I noticed it skips up to the 4k boundary.
$ # this should return 8191:
$ dd if=/dev/zero bs=1 count=8192 2>/dev/null | python wc.py 1
4096

Can anyone explain this unexpected behavior? Is it a known issue? A bug that should be filed? Or am I doing it wrong?

FWIW, I eventually worked around the problem by using a pipe for stdin and then feeding it the data one chunk at a time:

p = Popen(cmd, stdin=PIPE)
chunk = fin.read(CHUNK_SIZE)
while chunk:
    p.stdin.write(chunk)
    chunk = fin.read(CHUNK_SIZE)
p.stdin.close()
p.wait()

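The .read() function on sys.stdin is buffered in Python. So when you read one byte, Python actually reads a whole buffer, expecting that you will do the same again soon. However, reading a full buffer (4096 bytes in your case) means the operating system considers that input already consumed, so it is never passed on to wc.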
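The effect can be reproduced without any child process. The sketch below is an illustration under stated assumptions: left_on_descriptor is a hypothetical helper, and a small in-process pipe stands in for the dd pipeline. It skips one byte either through a buffered file object or through os.read(), then counts what is still readable from the descriptor itself.

```python
import os


def left_on_descriptor(total, skip, buffered):
    """Return how many of `total` bytes remain readable from a pipe's
    descriptor after skipping `skip` bytes (hypothetical helper)."""
    r, w = os.pipe()
    os.write(w, b"\0" * total)  # keep `total` small so this write cannot block
    os.close(w)                 # lets the reads below reach end-of-file

    if buffered:
        fin = os.fdopen(r, "rb")  # default buffering, like sys.stdin.buffer
        fin.read(skip)            # pulls a whole buffer's worth off the descriptor
    else:
        fin = None
        os.read(r, skip)          # takes at most `skip` bytes off the descriptor

    left = 0
    while True:
        chunk = os.read(r, 65536)  # bypasses any Python-level buffer
        if not chunk:
            break
        left += len(chunk)

    if fin is not None:
        fin.close()  # also closes r
    else:
        os.close(r)
    return left


print(left_on_descriptor(100, 1, buffered=True))   # 0
print(left_on_descriptor(100, 1, buffered=False))  # 99
```

The buffered case mirrors the 100-byte dd run in the question: the one-byte read drains the pipe, so nothing is left for wc.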

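You can avoid this problem by using os.read() to skip the required number of input bytes. This calls the operating system directly and does not buffer data in the process: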

os.read(fin.fileno(), skip)

Nice, I wasn't aware of os.read(). Your answer made me realize that the same behavior can be achieved by using os.fdopen():

fin = os.fdopen(sys.stdin.fileno(), 'r', 0)
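On Python 3, a buffering argument of 0 is only accepted in binary mode, so the call above would need 'rb' rather than 'r'. A hedged sketch of the unbuffered approach, assuming Python 3; the helper name count_after_unbuffered_skip and its cmd parameter are hypothetical:

```python
import os
import subprocess


def count_after_unbuffered_skip(fd, skip, cmd=("wc", "-c")):
    """Skip `skip` bytes of descriptor `fd` without read-ahead, then let
    `cmd` read the rest of it as its stdin (hypothetical helper)."""
    # buffering=0 requires binary mode in Python 3 and yields a raw file object.
    with os.fdopen(fd, "rb", 0) as fin:
        remaining = skip
        while remaining > 0:
            chunk = fin.read(remaining)  # a raw read may return fewer bytes than asked
            if not chunk:                # input ended before `skip` bytes were seen
                break
            remaining -= len(chunk)
        # The descriptor's position now sits exactly after the skipped bytes.
        result = subprocess.run(list(cmd), stdin=fin, stdout=subprocess.PIPE, check=True)
    return int(result.stdout)
```

Because the raw file object never reads ahead, the child sees every byte after the skipped prefix. Note that the helper closes fd when it returns.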
