Forking / Multi-Threaded Processes | Bash


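I would like to make a section of my code more efficient. I'm thinking of having it fork off into multiple processes and run 50 or 100 of them at once, instead of just one.

For example (pseudo):

I would like this for loop to run multiple times. I know this can be done with forking. Would it look something like this?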

while(x <= 50)
parent(child pid)
{
   fork child()
}
child
{
   do 
   foo; foo2; foo3; 
   done
   return child_pid()
}
Let me try an example:

for x in 1 2 3 ; do { echo a $x ; sleep 1 ; echo b $x ; } &  done ; sleep 10

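Use jobs to see what is running.

I don't know of any explicit fork call in bash. What you probably want to do is append & to a command that you want to run in the background. You can also use & on functions that you define within a bash script: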

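do_something_with_line()
{
  local line=$1   # the line to process
  foo
  foo2
  foo3
}

# Read the file line by line and process each line in the background.
while IFS= read -r line
do
  do_something_with_line "$line" &
done < file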
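
A backgrounded function returns control immediately, so a script that needs the results usually has to block until the children have finished. Here is a minimal sketch of that pattern, assuming bash; process_line, the temporary directory, and the sample input are hypothetical names used only for illustration:

```shell
#!/usr/bin/env bash
# Sketch: run a function in the background once per item, then wait.
# process_line and the sample items are hypothetical stand-ins for
# foo/foo2/foo3 and the lines of the input file.

outdir=$(mktemp -d)

process_line() {
  local line=$1
  # Each background job writes to its own file so outputs do not interleave.
  echo "processed: $line" > "$outdir/$line.out"
}

for line in alpha beta gamma; do
  process_line "$line" &   # & runs the function in a background subshell
done

wait   # with no arguments, wait blocks until all background children exit

files=("$outdir"/*.out)
count=${#files[@]}
echo "finished $count jobs"
```

Without the wait, the script could reach its last line (or exit) while some of the jobs are still running.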
Edit: To limit the number of simultaneous background processes, you could try something like this:

while IFS= read -r line
do
  # Wait while 50 or more background jobs are still running.
  while [ "$(jobs -r | wc -l)" -ge 50 ]
  do
    sleep 5
  done
  do_something_with_line "$line" &
done < file
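
Polling with a fixed sleep can leave slots idle for a few seconds. As an alternative sketch (not the method used in this answer), bash 4.3 and later provide wait -n, which returns as soon as any one background job exits. The names max_jobs, work, and logfile are hypothetical:

```shell
#!/usr/bin/env bash
# Sketch: keep at most max_jobs background jobs running, using wait -n.
# Assumes bash >= 4.3; wait -n is not available in older bash or in sh.

max_jobs=3                      # hypothetical limit; the answer above uses 50
logfile=$(mktemp)

work() {
  sleep 0.2
  echo "done $1" >> "$logfile"
}

for item in 1 2 3 4 5 6 7 8; do
  # If all slots are taken, block until one job finishes.
  while [ "$(jobs -pr | wc -l)" -ge "$max_jobs" ]; do
    wait -n
  done
  work "$item" &
done

wait                            # wait for the remaining jobs
finished=$(grep -c '^done' "$logfile")
echo "finished: $finished"
```

A new job starts as soon as a slot frees up, rather than at the next polling interval.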
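In bash scripts (non-interactive), job control is disabled by default, so job-control commands such as fg and bg are not available.

Here is what works well for me: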

#!/bin/bash

set -m # Enable Job Control

for i in `seq 30`; do # start 30 jobs in parallel
  sleep 3 &
done

# Wait for all parallel jobs to finish
while [ 1 ]; do fg 2> /dev/null; [ $? == 1 ] && break; done

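The last line uses fg to bring a background job into the foreground. It does this in a loop until fg returns 1 ($? == 1), which it does when there are no longer any background jobs.

With GNU Parallel you can do: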

cat file | parallel 'foo {}; foo2 {}; foo3 {}'
This runs one job per CPU core. To run 50 at a time, do:

cat file | parallel -j 50 'foo {}; foo2 {}; foo3 {}'
Watch the GNU Parallel intro videos to learn more.
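
Where GNU Parallel is not installed, xargs with the -P option (supported by GNU and BSD xargs, though not required by POSIX) gives a rougher equivalent. This is a sketch of a different tool, not of parallel itself; the input file and the echo command are hypothetical placeholders for file and foo:

```shell
#!/usr/bin/env bash
# Sketch: run up to 4 commands at a time with xargs -P.
# The input file and the echo command are hypothetical placeholders.

input=$(mktemp)
printf '%s\n' one two three four five > "$input"

# -P 4   : run at most 4 processes at once
# -I {}  : run one command per input line, substituting the line for {}
# The line is passed to sh as $1 rather than pasted into the script text.
result=$(xargs -P 4 -I {} sh -c 'echo "handled $1"' _ {} < "$input" | sort)

echo "$result"
```

The output order of parallel jobs is not deterministic, which is why the sketch sorts the result before printing it.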


Based on what everyone shared, I was able to put this together:

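#!/usr/bin/env bash

VAR1="192.168.1.20 192.168.1.126 192.168.1.36"
WAITPIDS=""

# Start one ssh session per host in the background and remember each PID.
for a in $VAR1; do
  ssh -t -t "$a" -l Administrator "sudo softwareupdate -l" &
  WAITPIDS="$WAITPIDS $!"
done

# Block until every background ssh session has exited.
wait $WAITPIDS
echo "Script has finished"

exit 0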
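
wait with a specific PID also returns that process's exit status, which makes it possible to report which host failed. A sketch under stated assumptions: run_on_host is a hypothetical stand-in for the ssh command so that the example runs without network access, and the associative array requires bash 4 or later.

```shell
#!/usr/bin/env bash
# Sketch: start one background job per host, then collect each exit status.
# run_on_host is a hypothetical placeholder for the real ssh invocation.

hosts="192.168.1.20 192.168.1.126 192.168.1.36"

run_on_host() {
  # Pretend that the host ending in .126 fails.
  case $1 in
    *.126) return 3 ;;
    *)     return 0 ;;
  esac
}

declare -A pid_of          # host -> PID (associative array, bash >= 4)
for h in $hosts; do
  run_on_host "$h" &
  pid_of[$h]=$!
done

failed=0
for h in $hosts; do
  # wait PID returns the exit status of that particular process.
  if wait "${pid_of[$h]}"; then
    echo "$h: ok"
  else
    echo "$h: failed with status $?"
    failed=$((failed + 1))
  fi
done
echo "failed hosts: $failed"
```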

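This lists all the available updates on three Macs at once. Later on I used it to run a software update on all machines by reading the addresses from my ipaddress.txt with cat.

Here is my thread control function: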

#!/bin/bash
# This function only checks the jobs running in the background; it does nothing else.
# If the number of jobs is lower than MAX, it returns so that more jobs can be started;
# if the number of jobs is greater than or equal to MAX, it waits until one finishes.

# Usage:
#   thread_max 8
#   thread_max 0    # wait until all jobs have completed

# DEBUG is a logging helper that the original answer does not define.
# This fallback is an assumption: it prints to stderr if no DEBUG exists.
declare -F DEBUG >/dev/null || DEBUG() { echo "$@" >&2; }

thread_max() {
    local CHECK_INTERVAL="3s"
    local CUR_THREADS=
    local MAX=
    [[ $1 ]] && MAX=$1 || return 127

    # Reset the MAX value; 0 is easy to remember.
    [ "$MAX" -eq 0 ] && {
        MAX=1
        DEBUG "waiting for all tasks to finish"
    }

    while true; do
        CUR_THREADS=$(jobs -p | wc -w)

        # Workaround for a jobs quirk: if jobs is not executed explicitly,
        # CUR_THREADS sticks at 1 even when no jobs are running anymore.
        jobs &>/dev/null

        DEBUG "current thread amount: $CUR_THREADS"
        if [ "$CUR_THREADS" -ge "$MAX" ]; then
            sleep "$CHECK_INTERVAL"
        else
            return 0
        fi
    done
}
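
The usage comments show the calls but not where they go in a loop. The sketch below shows one way to place them: call the throttle before each background launch, then once more with 0 at the end. To stay short and self-contained it uses throttle, a condensed hypothetical variant of thread_max that counts only running jobs (jobs -pr) and polls more often; it is not the function above.

```shell
#!/usr/bin/env bash
# Sketch: calling a throttle function before each background launch.
# throttle is a condensed, hypothetical variant of thread_max.

throttle() {
  local max=$1
  [ "$max" -eq 0 ] && max=1          # 0 means: wait for everything
  while [ "$(jobs -pr | wc -l)" -ge "$max" ]; do
    sleep 0.1
  done
}

started=0
for n in 1 2 3 4 5 6; do
  throttle 2                          # allow at most 2 jobs at a time
  sleep 0.3 &
  started=$((started + 1))
done

throttle 0                            # block until all jobs have completed
remaining=$(jobs -pr | wc -l)
echo "started $started jobs, $remaining still running"
```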

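I don't like using wait because it blocks until the process exits, which is not ideal when there are multiple processes to wait on, since I can't get a status update until the current process is done. For this I prefer to use a combination of kill -0 and sleep.

Given an array of pids to wait on, I use the waitPids() function below to get continuous feedback on which pids are still pending.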

declare -a pids
waitPids() {
    while [ ${#pids[@]} -ne 0 ]; do
        echo "Waiting for pids: ${pids[@]}"
        local range=$(eval echo {0..$((${#pids[@]}-1))})
        local i
        for i in $range; do
            if ! kill -0 ${pids[$i]} 2> /dev/null; then
                echo "Done -- ${pids[$i]}"
                unset pids[$i]
            fi
        done
        pids=("${pids[@]}") # Expunge nulls created by unset.
        sleep 1
    done
    echo "Done!"
}
When I start a process in the background, I immediately add its pid to the pids array using the utility function below:

addPid() {
    local desc=$1
    local pid=$2
    echo "$desc -- $pid"
    pids=(${pids[@]} $pid)
}
Here is a sample that shows how to use it:

for i in {2..5}; do
    sleep $i &
    addPid "Sleep for $i" $!
done
waitPids
Here is what the feedback looks like:

Sleep for 2 -- 36271
Sleep for 3 -- 36272
Sleep for 4 -- 36273
Sleep for 5 -- 36274
Waiting for pids: 36271 36272 36273 36274
Waiting for pids: 36271 36272 36273 36274
Waiting for pids: 36271 36272 36273 36274
Done -- 36271
Waiting for pids: 36272 36273 36274
Done -- 36272
Waiting for pids: 36273 36274
Done -- 36273
Waiting for pids: 36274
Done -- 36274
Done!
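
The same polling loop can be written without eval by iterating over the array indices with "${!pids[@]}". The following is a sketch of that variant, assuming bash; the finished array is a hypothetical addition that records which pids were seen to exit:

```shell
#!/usr/bin/env bash
# Sketch: waitPids rewritten to iterate over array indices with "${!pids[@]}"
# instead of building a range with eval. Same polling idea: kill -0 + sleep.

declare -a pids
finished=()                       # hypothetical extra array, for inspection

waitPids() {
  local i
  while [ "${#pids[@]}" -ne 0 ]; do
    echo "Waiting for pids: ${pids[*]}"
    for i in "${!pids[@]}"; do    # expands to the indices that are still set
      if ! kill -0 "${pids[$i]}" 2> /dev/null; then
        echo "Done -- ${pids[$i]}"
        finished+=("${pids[$i]}")
        unset "pids[$i]"
      fi
    done
    sleep 1
  done
  echo "Done!"
}

for s in 1 2; do
  sleep "$s" &
  pids+=("$!")
done
waitPids
```

Because the index list is expanded before the loop body runs, unsetting elements inside the loop is safe, and the array does not need to be repacked afterwards.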

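haridsv's approach is great: it gives the flexibility to run a processor-slots setup in which a number of processes are kept running, with new jobs submitted as earlier jobs complete, keeping the overall load up. Here are my modifications to haridsv's code for an n-slot processor working through a grid of ngrid jobs (I use it for grids of simulation models), followed by test output for 8 jobs run 3 at a time, with running totals of jobs running, submitted, completed, and remaining.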

#!/bin/bash
########################################################################
# see haridsv on forking-multi-threaded-processes-bash
# loop over grid, submitting jobs in the background.
# As jobs complete new ones are set going to keep the number running
# up to n as much as possible, until it tapers off at the end.
#
# 8 jobs
ngrid=8
# 3 at a time
n=3
# running counts
running=0
completed=0
# previous values
prunning=0
pcompleted=0
#
########################################################################
# process monitoring functions
#
declare -a pids
#
function checkPids() {
echo  ${#pids[@]}
if [ ${#pids[@]} -ne 0 ]
then
    echo "Checking for pids: ${pids[@]}"
    local range=$(eval echo {0..$((${#pids[@]}-1))})
    local i
    for i in $range; do
        if ! kill -0 ${pids[$i]} 2> /dev/null; then
            echo "Done -- ${pids[$i]}"
            unset pids[$i]
            completed=$(expr $completed + 1)
        fi
    done
    pids=("${pids[@]}") # Expunge nulls created by unset.
    running=$((${#pids[@]}))
    echo "#PIDS :"$running
fi
}
#
function addPid() {
    desc=$1
    pid=$2
    echo " ${desc} - "$pid
    pids=(${pids[@]} $pid)
}
########################################################################
#
# Loop and report when job changes happen,
# keep going until all are completed.
#
idx=0
while [ $completed -lt ${ngrid} ]
do
#
    if [ $running -lt $n ] && [ $idx -lt ${ngrid} ]
    then
####################################################################
#
# submit a new process if less than n
# are running and we haven't finished...
#
# get desc for process
#
        name="job_"${idx}
# background execution
        sleep 3 &
        addPid $name $!
        idx=$(expr $idx + 1)
#
####################################################################
#
    fi
#
    checkPids
# if something changes...
    if [ ${running} -gt ${prunning} ] || \
       [ ${completed} -gt ${pcompleted} ]
    then
        remain=$(expr $ngrid - $completed)
        echo  " Running: "${running}" Submitted: "${idx}\
              " Completed: "$completed" Remaining: "$remain
    fi
# save counts to prev values
    prunning=${running}
    pcompleted=${completed}
#
    sleep 1
#
done
#
########################################################################
Test output:

 job_0 - 75257
1
Checking for pids: 75257
#PIDS :1
 Running: 1 Submitted: 1  Completed: 0 Remaining: 8
 job_1 - 75262
2
Checking for pids: 75257 75262
#PIDS :2
 Running: 2 Submitted: 2  Completed: 0 Remaining: 8
 job_2 - 75267
3
Checking for pids: 75257 75262 75267
#PIDS :3
 Running: 3 Submitted: 3  Completed: 0 Remaining: 8
3
Checking for pids: 75257 75262 75267
Done -- 75257
#PIDS :2
 Running: 2 Submitted: 3  Completed: 1 Remaining: 7
 job_3 - 75277
3
Checking for pids: 75262 75267 75277
Done -- 75262
#PIDS :2
 Running: 2 Submitted: 4  Completed: 2 Remaining: 6
 job_4 - 75283
3
Checking for pids: 75267 75277 75283
Done -- 75267
#PIDS :2
 Running: 2 Submitted: 5  Completed: 3 Remaining: 5
 job_5 - 75289
3
Checking for pids: 75277 75283 75289
#PIDS :3
 Running: 3 Submitted: 6  Completed: 3 Remaining: 5
3
Checking for pids: 75277 75283 75289
Done -- 75277
#PIDS :2
 Running: 2 Submitted: 6  Completed: 4 Remaining: 4
 job_6 - 75298
3
Checking for pids: 75283 75289 75298
Done -- 75283
#PIDS :2
 Running: 2 Submitted: 7  Completed: 5 Remaining: 3
 job_7 - 75304
3
Checking for pids: 75289 75298 75304
Done -- 75289
#PIDS :2
 Running: 2 Submitted: 8  Completed: 6 Remaining: 2
2
Checking for pids: 75298 75304
#PIDS :2
2
Checking for pids: 75298 75304
Done -- 75298
#PIDS :1
 Running: 1 Submitted: 8  Completed: 7 Remaining: 1
1
Checking for pids: 75304
Done -- 75304
#PIDS :0
 Running: 0 Submitted: 8  Completed: 8 Remaining: 0

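You misspelled the do_something... name ;-)

Got it. What if I want to make sure that only 50 instances are running at a time? And, when one of those processes finishes, make sure one more is spawned.

Ah, yes, I didn't see the last line of your answer. Thank you very much. I'm going to work with this.

I added that line after you asked, so it's fine that you couldn't read my mind ;-) (just as I didn't read yours before you asked :) ). By the way, man bash is an important source of information about job control. Once you start down this path, you will probably have a lot of questions ;-)

+1 mob. I modified it into a function that you can add to a file of commands after anything you put in the background. Then you can run some commands in the file sequentially and only some of them in the background: #!/bin/bash waitpid() { while [[ `jobs | wc -l` -ge $1 ]]; do sleep 1; done; }

In a bash script you can use wait, for example: sleep 3 & WAITPID=$!; wait $WAITPID, or concatenate the PIDs this way: WAITPIDS="$WAITPIDS "$!; ...; wait $WAITPIDS

How do I do 1000 things, 50 at a time? In a loop over, say, $(seq 1 1000).

I tried using /bin/sh on FreeBSD, but it got stuck in the while loop.

The seq command supports a step increment: $(seq 1 50 1000), but it is up to you to do 50 things in each loop @chovy

/bin/sh produces /test.sh: 10: [: 2: unexpected operator and gets stuck in an infinite loop. Using #!/bin/bash fixes this.

I'd like to add that parallel is already installed on most systems. My OS X 10.8.5 machine has it. Time to dust the cobwebs off my shell scripts and update my for loops to parallel...

This looks very messy when using search/replace.

Very small, insignificant improvement: using local range=$(eval echo {0..$((${#pids[@]}-1))}) is much slower than using the built-in for i in ${!pids[@]}. ${! is more commonly seen with associative arrays, but on basic arrays all the indices come out fine, at least as far back as Bash 4.1.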
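
The seq step suggestion from the comments can be sketched as fixed-size batches with a wait between them. The numbers are scaled down so the example finishes quickly; total, batch, and do_thing are hypothetical names, and the comments above ask about 1000 items, 50 at a time:

```shell
#!/usr/bin/env bash
# Sketch: process items in fixed-size batches, waiting between batches.
# total, batch, and do_thing are hypothetical placeholders.

total=20
batch=5
log=$(mktemp)

do_thing() {
  echo "item $1" >> "$log"
}

batches=0
for start in $(seq 1 "$batch" "$total"); do   # 1, 6, 11, 16
  end=$((start + batch - 1))
  [ "$end" -gt "$total" ] && end=$total
  for i in $(seq "$start" "$end"); do
    do_thing "$i" &
  done
  wait                                        # let the whole batch finish
  batches=$((batches + 1))
done

items=$(grep -c '^item' "$log")
echo "ran $items items in $batches batches"
```

Batching is simple, but each batch takes as long as its slowest job; the slot-based approaches in the answers above keep the load steadier.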