How do I run multiple commands on Linux with a concurrency limit and a wait limit?

linux bash concurrency parallel-processing

What I want to achieve is something like this:

#!/bin/sh
concurrency_limit 3

#takes 5 min
(/usr/bin/my-process-1 --args1 && /usr/bin/my-process-2 --args1) & 
#takes 10 min
(/usr/bin/my-process-1 --args2 && /usr/bin/my-process-2 --args2) &
#takes 15 min
(/usr/bin/my-process-1 --args3 && /usr/bin/my-process-2 --args3) &
#takes 5 min
(/usr/bin/my-process-1 --args4 && /usr/bin/my-process-2 --args4) &
#takes 10 min
(/usr/bin/my-process-1 --args5 && /usr/bin/my-process-2 --args5) &
#takes 20 min
(/usr/bin/my-process-1 --args6 && /usr/bin/my-process-2 --args6) &

wait max_limit 1200
echo all processes complete

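The expected overall maximum run time is 20 minutes (±1 minute), assuming I have 3 CPU cores available and do not want more than 3 processes running at the same time.

At the start of the script, the first 3 processes are launched.

After 5 minutes: process 1 finishes and process 4 starts.

At minute 10: processes 2 and 4 finish and process 5 starts.

At minute 15: process 3 finishes.

At minute 20: process 5 finishes. Process 6 is terminated without waiting any further.

I did a lot of research on Stack Overflow but could not find a similar use case.

Any help or comments would be appreciated.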

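You can do this with xargs. For example, the following runs the function "func" 6 times, with the arguments 3, 3, 4, 1, 4 and 15, using 3 parallel processes, and kills the whole thing after 10 seconds:

# export -f is a bash feature, so the child shells must be bash rather than sh
func() { echo "args:$1"; sleep "$1"; echo done; }
export -f func

# printf emits one argument per line; xargs -P 3 keeps at most 3 running
worktodo() { printf '%s\n' 3 3 4 1 4 15 | xargs -P 3 -I {} bash -c 'func "$@"' _ {}; }
export -f worktodo

timeout 10 bash -c worktodo || echo "timeout"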
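
The same idea can be applied to whole command lines like the ones in the question, rather than to a function argument. The following is only a sketch under assumptions: the helper name run_limited is made up for illustration, it relies on GNU xargs (-d, -P) and GNU coreutils timeout, and the sleep/echo lines stand in for the real commands.

```shell
#!/usr/bin/env bash
# run_limited MAX_JOBS LIMIT_SECONDS   (hypothetical helper name)
# Reads one shell command per line from stdin and runs at most MAX_JOBS
# of them at once; the whole batch is cut off after LIMIT_SECONDS.
# Assumes GNU xargs (-d, -P) and GNU coreutils timeout.
run_limited() {
  local max_jobs=$1 limit=$2
  # -d '\n' : treat each input line as one item, quotes and all
  # -n 1    : hand exactly one line to each "sh -c" invocation
  # timeout exits with status 124 if the limit was reached
  timeout "$limit" xargs -d '\n' -n 1 -P "$max_jobs" sh -c
}

# Example with short stand-in commands; real use would feed lines such as
#   /usr/bin/my-process-1 --args1 && /usr/bin/my-process-2 --args1
printf '%s\n' 'sleep 1 && echo job1' 'sleep 1 && echo job2' 'sleep 1 && echo job3' |
  run_limited 2 10 || echo "stopped early (status $?)"
```

For the question's case the call would presumably be run_limited 3 1200, with the six command lines supplied on stdin.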

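Here is a skeleton that uses SIGINT to communicate between the parent and child processes.

Set up a trap that keeps track of how many processes are busy, so that another one is started when one ends: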

trap '{ let Trapped++; }' INT  # start another child

Initialize it to the number you want to run in parallel:

Trapped=$ATONCE  # 3 in your case

Then loop, starting children as needed:

while true
do
  # Assuming there's more work to do. You need to decide when to terminate
  do_work &
  Trapped=$((Trapped - 1))  # a slot is now taken

  while [ "$Trapped" -le 0 ]
  do
      wait         # race condition, interruptible by SIGINT
      rc=$?        # ...
  done
done

Then, in do_work, you need something like this:

call-external-process with parms

# Deal with problems
[[ $? -ne 0 ]] && { .... }

# Now tell parent we're done
kill -INT $$

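That is the rough idea. What is missing is a way to know when there are no more processes to start, and it needs better error handling, but hopefully you get the picture. There will always be 3 processes running, with a new one starting whenever one ends, until there is nothing left to do.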
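
The two gaps named above (knowing when to stop, and the overall time limit) can be closed in several ways. The sketch below deliberately swaps the SIGINT trap for a simpler mechanism, polling bash's job table with jobs -r, so it is a different technique from the skeleton rather than a completion of it. The name run_pool, the 0.2 second polling interval and the 124 return code (borrowed from timeout) are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# run_pool MAX LIMIT CMD...   (hypothetical helper, not from the answer above)
# Starts each CMD string in the background, never more than MAX at a time.
# Once LIMIT seconds have passed it stops launching, sends SIGTERM to the
# jobs still running, and returns 124; otherwise it returns 0 when all
# jobs have finished. It counts every background job of the current shell,
# so it assumes none were running before the call.
run_pool() {
  local max=$1 limit=$2 cmd timed_out=0
  shift 2
  local deadline=$((SECONDS + limit))

  for cmd in "$@"; do
    # Wait for a free slot, giving up if the deadline passes first.
    while (( $(jobs -rp | wc -l) >= max )); do
      if (( SECONDS >= deadline )); then timed_out=1; break 2; fi
      sleep 0.2
    done
    bash -c "$cmd" &
  done

  # Everything is launched (or we gave up): wait for the stragglers.
  while (( !timed_out && $(jobs -rp | wc -l) > 0 )); do
    if (( SECONDS >= deadline )); then timed_out=1; else sleep 0.2; fi
  done

  if (( timed_out )); then
    # Only the direct children are signalled; a command that has spawned
    # its own children may leave them running.
    kill $(jobs -rp) 2>/dev/null || true
    wait
    return 124
  fi
  return 0
}
```

A call such as run_pool 3 1200 'cmd1 && cmd2' 'cmd3 && cmd4' would then correspond to the question's concurrency_limit 3 and 1200 second limit.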

Unless I am missing something, I think GNU Parallel would do this for you quite easily.

If you make a file called jobs containing:

./my-process-1 --args1 && ./my-process-2 --args1
./my-process-1 --args2 && ./my-process-2 --args2
./my-process-1 --args3 && ./my-process-2 --args3
./my-process-1 --args4 && ./my-process-2 --args4
./my-process-1 --args5 && ./my-process-2 --args5
./my-process-1 --args6 && ./my-process-2 --args6

Then you can see what GNU Parallel would do by using --dry-run, as follows:

parallel --dry-run -j 3 -k -a jobs

Output:

./my-process-1 --args1 && ./my-process-2 --args1
./my-process-1 --args2 && ./my-process-2 --args2
./my-process-1 --args3 && ./my-process-2 --args3
./my-process-1 --args4 && ./my-process-2 --args4
./my-process-1 --args5 && ./my-process-2 --args5
./my-process-1 --args6 && ./my-process-2 --args6

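If my-process-1 takes 3 seconds and my-process-2 takes 5 seconds, the whole run takes 16 seconds: the first 3 lines run in parallel and each takes 8 seconds, then the next 3 lines run in parallel and take another 8 seconds.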
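
The arithmetic can be checked with a small simulation. This is a sketch only: it assumes each job, in file order, starts on whichever of the 3 slots becomes free first, and the helper name makespan is made up for illustration.

```shell
#!/usr/bin/env bash
# makespan SLOTS DURATION...   (hypothetical helper for the arithmetic above)
# Simulates starting each job, in order, on whichever slot frees up first,
# and prints the time at which the last job finishes.
makespan() {
  local slots=$1
  shift
  local -a free_at=()   # free_at[i] = time at which slot i becomes free
  local i d best finish=0
  for (( i = 0; i < slots; i++ )); do free_at[i]=0; done

  for d in "$@"; do
    # Pick the slot that becomes free earliest.
    best=0
    for (( i = 1; i < slots; i++ )); do
      if (( free_at[i] < free_at[best] )); then best=$i; fi
    done
    # The job occupies that slot for d time units.
    free_at[best]=$(( free_at[best] + d ))
    if (( free_at[best] > finish )); then finish=${free_at[best]}; fi
  done
  echo "$finish"
}

makespan 3 8 8 8 8 8 8   # six 8-second lines on 3 slots: prints 16
```

Under the same assumption, the question's durations (makespan 3 5 10 15 5 10 20, in minutes) give 30, which is why a 20 minute limit cuts off the sixth job.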
