Linux: dependency checks when triggering scripts in the background in bash
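I have the following code block in a bash script. In this code I trigger scripts in parallel and store the process ID of each script.
The script names are listed in a file named session_details.txt.
The file contents are as follows: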
s01_test_abc
s02_run_cde
s02_skip_xyz
s03_failed_123
s03_success_999
s04_done_111
Bash script code:
#!/bin/bash

wf_name=$1

# Create logs directory if not exists for that workflow
mkdir -p logs/${wf_name}
# Create run_share directory if not exists for that workflow
mkdir -p run_share/${wf_name}

# Date on which the workflow is running
run_date=$(date '+%Y-%m-%d')
run_time=$(date '+%Y-%m-%d-%H-%M-%S')

# directory where current run status files are stored
share_dir=run_share/${wf_name}
# directory where current run logs are stored
logs_dir=logs/${wf_name}/${run_date}
# the dated logs directory must exist before anything is written into it
mkdir -p ${logs_dir}

number_of_jobs=0
PID_FILE=${logs_dir}/session_pid.txt
if [ -f ${PID_FILE} ]; then
    rm ${PID_FILE}
fi

## parallel call #####
while read session; do
    echo "processing started for ${session}"
    session_name=$(echo ${session%% })
    start_status_file=${share_dir}/${session_name}_start_${run_date}.txt
    end_status_file=${share_dir}/${session_name}_end_${run_date}.txt
    # both status files must exist (&&, not a single & which backgrounds the first test)
    if [ -f ${start_status_file} ] && [ -f ${end_status_file} ] ; then
        echo "session --> ${session_name} has completed run for ${run_date}"
    elif [ -f ${start_status_file} ] ; then
        echo "Session --> ${session_name} has failed run for ${run_date} so restarting the failed job"
        sh ${wf_name}/${session_name}.sh ${wf_name} ${session_name} ${share_dir} ${run_date} ${logs_dir} > ${logs_dir}/${session_name}.log 2>&1 &
    else
        echo "Session --> ${session_name} has not run for ${run_date} so triggering new run"
        sh ${wf_name}/${session_name}.sh ${wf_name} ${session_name} ${share_dir} ${run_date} ${logs_dir} > ${logs_dir}/${session_name}.log 2>&1 &
    fi
    #sh ${wf_name}/${session_name}.sh ${wf_name} > ${logs_dir}/$session_name.log 2>&1 &
    pid=$!
    echo "${pid}-${session_name}" >> ${PID_FILE}
    echo "----------------"
    if [ ${number_of_jobs} -eq 0 ] ; then
        pid_list=${pid}
    else
        pid_list="${pid_list},${pid}"
    fi
    echo "PID list - ${pid_list}"
    number_of_jobs=$((number_of_jobs + 1))
    echo "${number_of_jobs} job(s) submitted so far"
    echo "done"
done < ${wf_name}/session_details.txt

## wait for parallel jobs completion
PID_OUT_FILE=${logs_dir}/session_pid_out.txt
if [ -f ${PID_OUT_FILE} ]; then
    rm ${PID_OUT_FILE}
fi
while read p; do
    pid=$(echo ${p} | cut -d'-' -f1)
    session_id=$(echo ${p} | cut -d'-' -f2)
    wait ${pid}
    echo $?"-"${p} >> ${PID_OUT_FILE}
done < ${PID_FILE}

## verify the pids completion status
while read res; do
    pidres=$(echo ${res} | cut -d'-' -f1)
    sess_id=$(echo ${res} | cut -d'-' -f3)
    if [ ${pidres} -ne 0 ]; then
        status_msg="FAILED"
        echo "job failed for session -${sess_id} failed"
    else
        echo "job for ${sess_id} Completed"
        status_msg="SUCCESS"
    fi
done < ${PID_OUT_FILE}

## fail the script if any parallel run job failed.
while read res; do
    pidres=$(echo ${res} | cut -d'-' -f1)
    sess_id=$(echo ${res} | cut -d'-' -f3)
    if [ ${pidres} -ne 0 ]; then
        # exit codes must be in 0-255; use 1 rather than -1
        exit 1
    fi
done < ${PID_OUT_FILE}
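How can I achieve this?

You first need to figure out how to represent the dependencies you described. If all the commands are simple shell scripts that take no arguments, then a simple solution is to decide that each line of the input file represents a set of parallel commands, all of which must complete successfully before moving on to the next line. So instead of: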
s01_test_abc
s02_run_cde
s02_skip_xyz
s03_failed_123
s03_success_999
s04_done_111
You would write:
s01_test_abc
s02_run_cde s02_skip_xyz
s03_failed_123 s03_success_999
s04_done_111
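If the existing flat list is long, the grouped file could be generated rather than written by hand. The following is a minimal sketch, assuming that the text before the first underscore (s01, s02, ...) identifies the stage and that the list is already sorted by stage; neither assumption is stated in the question, and the function name group_by_stage is hypothetical.

```shell
#!/bin/bash
# Sketch: turn a flat list of script names (one per line, read from stdin)
# into one line per stage.
# Assumption: the part before the first "_" (s01, s02, ...) is the stage,
# and names belonging to the same stage are adjacent in the input.
group_by_stage() {
    awk -F_ '
        NF == 0 { next }                      # skip blank lines
        $1 != prev {                          # first name of a new stage
            if (started) printf "\n"
            started = 1
            prev = $1
            printf "%s", $0
            next
        }
        { printf " %s", $0 }                  # same stage: append to the line
        END { if (started) printf "\n" }
    '
}

# Example use (file names taken from the question and the answer):
#   group_by_stage < session_details.txt > deps.txt
```

Grouping on a naming convention is only a convenience; if the real dependencies do not follow the prefix, the grouped file has to be written by hand.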
With that, your basic logic looks like this:
#!/bin/bash

# read a line from the input file
while read -r set; do
    pids=()
    cmds=()

    # start the commands and store pids in the pids array
    for cmd in $set; do
        echo "starting $cmd"
        sh $cmd &
        pids+=($!)
        cmds+=($cmd)
    done

    echo "waiting for $set (${pids[@]})"
    failed=0

    # wait for each pid to finish, setting the `failed=1` if
    # anything fails. Iterate over the real array indices (0..n-1);
    # `seq` would yield 1..n and miss the first pid.
    for i in "${!pids[@]}"; do
        if ! wait ${pids[i]}; then
            echo "ERROR: command ${cmds[i]} (pid ${pids[i]}) failed" >&2
            failed=1
        fi
    done

    # if there were any failures, exit with an error.
    if [[ $failed = 1 ]]; then
        echo "ERROR: failure running $set" >&2
        exit 1
    fi
done < deps.txt
Given scripts that look like this:
#!/bin/bash
sleep $((RANDOM % 10))
exit $((RANDOM % 2))
running the script above produces output like this:
starting s01_test_abc
waiting for s01_test_abc (1085782)
ERROR: s01_test_abc (pid 1085782) failed
ERROR: failure running s01_test_abc
Or, if everything goes well:
starting s01_test_abc
waiting for s01_test_abc (1086268)
starting s02_run_cde
starting s02_skip_xyz
waiting for s02_run_cde s02_skip_xyz (1086269 1086270)
starting s03_failed_123
starting s03_success_999
waiting for s03_failed_123 s03_success_999 (1086271 1086272)
starting s04_done_111
waiting for s04_done_111 (1086273)
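But! You could also simply express your dependencies as a Makefile, because make is, after all, a tool that can run things in parallel and lets you declare the dependencies between the different steps: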
SCRIPTS = \
	s01_test_abc \
	s02_skip_xyz \
	s02_run_cde \
	s03_success_999 \
	s03_failed_123 \
	s04_done_111

FLAGS = $(SCRIPTS:=.done)

# This is a "pattern rule" that tells Make how to generate a file
# named <something>.done from an input file named <something>.
%.done: %
	@echo running $<
	@sh $< && touch $@ || { echo "$< failed!"; exit 1; }

all: $(FLAGS)

# here is where we express our dependencies
s02_skip_xyz.done s02_run_cde.done: s01_test_abc.done
s03_success_999.done s03_failed_123.done: s02_skip_xyz.done s02_run_cde.done
s04_done_111.done: s03_success_999.done s03_failed_123.done

clean:
	rm -f *.done
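To see make run independent steps in parallel, it is invoked with the -j option. The sketch below is a self-contained demonstration rather than part of the answer: the function name demo_make_parallel, the temporary directory, and the stub scripts step_a, step_b and step_c are all hypothetical, and the generated Makefile is a cut-down version of the same pattern-rule idea.

```shell
#!/bin/bash
# Sketch: build a tiny dependency chain in a temp directory and run it
# with `make -j 2`. All names here are made up for the demonstration.
demo_make_parallel() {
    command -v make >/dev/null 2>&1 || { echo "make is not installed" >&2; return 127; }
    local dir rc
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        # Stub scripts that simply succeed.
        for s in step_a step_b step_c; do
            printf 'exit 0\n' > "$s"
        done
        # printf emits the literal tab that make requires before a recipe line.
        {
            printf 'all: step_a.done step_b.done step_c.done\n'
            printf '%%.done: %%\n'
            printf '\t@sh $< && touch $@\n'
            # step_b and step_c may run in parallel, but only after step_a.
            printf 'step_b.done step_c.done: step_a.done\n'
        } > Makefile
        make -j 2 all || exit 1
        # List the flag files that were created, one per line.
        printf '%s\n' *.done
    )
    rc=$?
    rm -rf "$dir"
    return "$rc"
}
```

With the Makefile from the answer, the equivalent invocation would be something like make -j 4 in the directory that holds the scripts. Because finished steps leave a .done file behind, rerunning make after a failure only reruns the steps that have not completed; make clean resets that state.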
This gives you a more structured format for expressing your dependencies, and it saves you from writing a pile of code that effectively reproduces the same logic in a less robust way.
Thanks for the answer. All of the scripts take arguments as variables as well.
Then you want option (b), the Makefile. That is more flexible, and it makes it easy to call the scripts with arguments.
I have one doubt about the bash code in the answer above: on failure, you print the PID and the set from the line in the file. Is there a way, on failure, to print only the pid and cmd from the file we pass in?
I've updated the shell script so that it does what I think you're asking for (it prints the name of the failed command, not just the pid).
Yes, this is what I wanted, but now the code isn't catching any failures. It runs the whole script to completion instead of exiting on failure.
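One plausible cause of the symptom in the last comment is the index range of the wait loop. A loop of the form for i in $(seq ${#pids[*]}) yields 1..n, while bash arrays start at 0. With a single command per line, ${pids[1]} is empty, so the unquoted wait ${pids[1]} becomes a bare wait, which waits for all children and returns 0. The sketch below iterates over the array's real indices instead. The function name run_set is hypothetical, and it uses sh -c so that each entry may carry arguments, which is an assumption that differs from the sh $cmd call in the answer.

```shell
#!/bin/bash
# Sketch: run every argument as a background command, then wait for each
# pid individually. Returns 1 if any command failed, 0 otherwise.
# Assumption: each argument is a complete command line for `sh -c`.
run_set() {
    local pids=() cmds=() cmd i failed=0

    for cmd in "$@"; do
        sh -c "$cmd" &
        pids+=("$!")
        cmds+=("$cmd")
    done

    # "${!pids[@]}" expands to the actual indices 0..n-1, so every pid
    # is waited on and paired with the command that produced it.
    for i in "${!pids[@]}"; do
        if ! wait "${pids[i]}"; then
            echo "ERROR: command ${cmds[i]} (pid ${pids[i]}) failed" >&2
            failed=1
        fi
    done

    return "$failed"
}

# Example use inside the per-line loop:
#   run_set "sh s02_run_cde" "sh s02_skip_xyz" || exit 1
```

Quoting "${pids[i]}" also means an unexpectedly empty entry produces an error from wait instead of silently turning into a bare wait.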