Running a for-loop bash script in parallel

bash for-loop parallel-processing

I'm trying to write a loop script in which, if possible, each iteration runs in parallel.

#!/bin/bash

for ip in $(cat ./IPs); do
ping -n -c 2 -W 1 ${ip} >> Results/${ip}.log
done

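Eventually, I'd like to put whatever I need inside the loop and have it run as multiple processes. I've tried other examples, but can't seem to get them to work as expected. I also have parallel installed, if that's an option.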

This seems like a good fit for parallel:

parallel 'ping -n -c 2 -W 1 "{}" >>"Results/{}.log"' <IPs

For a simple version of this -
while read ip
do  ping -n -c 2 -W 1 ${ip} >> Results/${ip}.log 2>&1 &
done < IPs
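
A script that contains only the loop above returns as soon as the last ping has been launched, while the pings themselves keep running. The sketch below wraps the same idea in a function and adds a final wait so the caller blocks until every log is complete. The name run_all_bg and its argument order are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer).
# Usage: run_all_bg LISTFILE OUTDIR CMD [ARGS...]
# Runs CMD ARGS <item> once per line of LISTFILE, all in the background,
# appending each item's output to OUTDIR/<item>.log.
run_all_bg() {
  local list=$1 outdir=$2 item
  shift 2
  mkdir -p "$outdir"
  while IFS= read -r item; do
    [ -n "$item" ] || continue                    # skip blank lines
    "$@" "$item" >>"$outdir/$item.log" 2>&1 &     # one background job per item
  done <"$list"
  wait                                            # block until every job has exited
}
```

For the question's case this would be called as: run_all_bg IPs Results ping -n -c 2 -W 1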

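I also switched to a while read to avoid the cat in your for, but that's mostly a stylistic preference.
For a more load-aware version, use wait.
I made a simple control file with just one letter per line -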
$: cat x
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z

Then I declared a couple of values - a max that I want it to fire off at once, and a counter
$: declare -i cnt=0 max=10

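Then I typed in a read loop to iterate over those values and run one set at a time. Until it has accumulated the stated max, it keeps adding processes in the background and counting them. Once it has enough, it waits for those to finish and resets the counter before going on with another set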
$: while read ctl             # these would be your IP's
> do if (( cnt++ < max ))     # this checks for max load
>    then echo starting $ctl  # report which we're doing
>         date                # throw a timestamp
>         sleep 10 &          # and fire the task in background
>    else echo letting that batch work... # when too many running
>         cnt=0               # reset the counter
>         wait                # and thumb-twiddle till they all finish
>         echo continuing     # log
>         date                # and timestamp
>    fi
> done < x                    # the whole loop reads from x until done
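
One thing to watch in a batch loop like the one above: on the iteration where the counter reaches max, the else branch has already read an entry from the control file but only waits, so that entry is never started. A minimal sketch of a batch runner that starts every entry and waits after each full batch might look like this. The function name run_in_batches and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer).
# Usage: run_in_batches MAX LISTFILE CMD [ARGS...]
# Starts CMD ARGS <item> in the background for every line of LISTFILE,
# pausing after each group of MAX jobs until the whole group has finished.
run_in_batches() {
  local max=$1 list=$2 cnt=0 item
  shift 2
  while IFS= read -r item; do
    "$@" "$item" &                 # always start the entry that was just read
    cnt=$((cnt + 1))
    if [ "$cnt" -ge "$max" ]; then
      wait                         # let the current batch finish
      cnt=0                        # then start counting a new batch
    fi
  done <"$list"
  wait                             # wait for the final, possibly partial, batch
}
```

This keeps the same spike-and-lull behavior as the interactive loop, since each batch only ends when its slowest job does.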

When it was done, the last few were still running, because I didn't bother to write all of this into an actual script with careful checking
$: ps
      PID    PPID    PGID     WINPID   TTY         UID    STIME COMMAND
    11012   10944   11012      11040  pty0     2136995 07:59:35 /usr/bin/bash
     6436   11012    6436       9188  pty0     2136995 08:13:56 /usr/bin/sleep
     5520   11012    5520      10064  pty0     2136995 08:13:56 /usr/bin/sleep
    12216   11012   12216      12064  pty0     2136995 08:13:57 /usr/bin/sleep
     8468   11012    8468      10100  pty0     2136995 08:13:57 /usr/bin/sleep
     9096   11012    9096      10356  pty0     2136995 08:14:03 /usr/bin/ps

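This does cause a bursty load that (for tasks that don't all finish at the same time) tapers off until the last one finishes, producing spikes and lulls. With a little more finesse, we could write a waitpid trap that fires off a new task each time one finishes, to keep the load steady, but that's an exercise for another day unless someone really wants to see it. (I did it in Perl before, and have always wanted to implement it in bash, just because...)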
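
As a rough illustration of the steady-load idea, the sketch below uses bash's wait -n (available from bash 4.3), which returns as soon as any one background job exits. This is a swapped-in technique, not the waitpid trap described above. The name run_throttled is hypothetical, and MAX is assumed to be at least 1.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer); needs bash 4.3+ for `wait -n`.
# Usage: run_throttled MAX LISTFILE CMD [ARGS...]
# Keeps up to MAX jobs running: when the limit is reached, it waits for any
# single job to exit before starting the next one.
run_throttled() {
  local max=$1 list=$2 running=0 item
  shift 2
  while IFS= read -r item; do
    if [ "$running" -ge "$max" ]; then
      wait -n                      # returns when any one background job exits
      running=$((running - 1))
    fi
    "$@" "$item" &
    running=$((running + 1))
  done <"$list"
  wait                             # drain whatever is still running
}
```

Compared with waiting for a whole batch, a slot is refilled as soon as it frees up, so one slow task does not hold back the rest.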
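Because it was asked for -
Obviously, as other posts have said, you can just use parallel... but as an exercise, here is one way to set up a number of process chains that read from a queue. I opted for a simple callback rather than dealing with a SIGCHLD trap, because there are a lot of little subprocesses flying around.
Enhancements welcome, if anyone cares.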
#! /bin/env bash

trap 'echo abort $0@$LINENO; die; exit 1' ERR       # make sure any error is fatal
declare -i primer=0          # a countdown of how many processes to pre-spawn
use="
  $0 <#procs> <cmdfile>

  Pass the number of desired processes to prespawn as the 1st argument.
  Pass the command file with the list of tasks you need done.

  Command file format:
   KEYSTRING:cmdlist

  where KEYSTRING will be used as a unique logfile name
  and   cmdlist   is the base command string to be run

"

die() {
   echo "$use" >&2
   return 1
}

case $# in
2) primer=$1
   case "$primer" in
   *[^0-9]*) echo "INVALID #procs '$primer'"
             die;;
   esac
   cmdfile=$2
   [[ -r "$cmdfile" ]] || die
   declare -i lines=$( grep -c . $cmdfile)
   if (( lines < primer ))
   then echo "Note - command lines in $cmdfile ($lines) fewer than requested process chains ($primer)"
        die
   fi ;;
*) die ;;
esac >&2

trap ': no-op to ignore' HUP  # ignore hangups (built-in nohup without explicit i/o redirection)

spawn() {
  IFS="$IFS:" read key cmd || return
  echo "$(date) executing '$cmd'; c.f. $key.log" | tee $key.log
  echo "# autogenerated by $0 $(date)
   { $cmd
     spawn
   } >> $key.log 2>&1 &
  " >| $key.sh
  . $key.sh
  rm -f $key.sh
  return 0
}

while (( primer-- ))  # until we've filled the requested quota
do spawn              # create a child process
done < $cmdfile

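Note that even the first one runs in the background - the backgrounder doesn't care. Job a will launch b before the spooling loop can, so the loop skips ahead to c.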
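
The skipping happens because every chain inherits the same open command file, so a line read by one process is no longer there for the others. The small demonstration below shows that effect in isolation. The function name demo_shared_offset is hypothetical, and the input is assumed to be a regular file with at least three lines.

```shell
#!/usr/bin/env bash
# Hypothetical demonstration (not from the original answer).
# Usage: demo_shared_offset FILE
# A child process reading from the inherited stdin advances the same file
# position the parent uses, so the parent never sees the child's line.
demo_shared_offset() {
  local file=$1 first second third
  {
    IFS= read -r first                                     # parent takes line 1
    ( IFS= read -r second; echo "child read: $second" )    # child takes line 2
    IFS= read -r third                                     # parent resumes at line 3
    echo "parent read: $first, $third"
  } <"$file"
}
```

In the script above, spawn plays the role of both readers: the priming loop and each finished job call it against the same command file.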
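Some logs -
a - original spawn; runs in the background and immediately launches b, then keeps logging
b - exits quickly and launches f, because c, d, and e have already been run
c - original spawn; finished before b, so it started d, which is why b started f
d - started by c; finishes and launches h, because g has already been run
e - original spawn; launches n, because everything before it has already been run
(skipping ahead a bit...)
n - started by e; takes long enough to finish that there are no more tasks left to start
It works. It isn't perfect, but it could be handy. :)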
Log excerpts from that run (e.log, then n.log):

Thu, Oct 25, 2018  2:33:58 PM executing 'date;sleep 5;date'; c.f. e.log
Thu, Oct 25, 2018  2:33:58 PM
Thu, Oct 25, 2018  2:34:04 PM
Thu, Oct 25, 2018  2:34:04 PM executing 'date;sleep 19;date'; c.f. n.log

Thu, Oct 25, 2018  2:34:04 PM executing 'date;sleep 19;date'; c.f. n.log
Thu, Oct 25, 2018  2:34:04 PM
Thu, Oct 25, 2018  2:34:23 PM

Comments:
If the other question is from a different site, it can't be closed as a duplicate.
Not directly related, but see the question about reading a file line by line, or the one about what "2>&1" means.