How do I pass the SLURM job ID as an input argument to Python?


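I am new to using SLURM to train a batch of convolutional neural networks. To keep track of all the trained CNNs easily, I would like to pass the SLURM job ID as an input argument to Python. Passing other variables as arguments works fine. However, I cannot get access to the SLURM job ID in order to pass it on.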

I have already tried using ${SLURM_JOBID}, ${SLURM_JOB_ID}, %j and %J. I also tried writing these Slurm environment variables into a variable before passing it to Python.

Here is my latest code:

#!/bin/bash

# --- info to user
echo "script started ... "

# --- setup environment
module purge            # clean up
module load python/3.6
module load nvidia/10.0
module load cudnn/10.0-v7 

# --- display information
HOST=`hostname`
echo "This script runs the CNN. Slurm scheduled it on node $HOST"
echo "I am interested of all environment variables Slurm adds:"
env | grep -i slurm

# --- start running ... 
echo " --- run --- "

# --- define some variables
dc="dice"
sm="softmax"

# --- run a job using a slurm batch script
for layer in {3..15..2}
  do
    sbatch -N 1 -n 1 --mem=20G --mail-type=END --gres=gpu:V100:3 --wrap="singularity --noslurm tensorflow_19.03-py3.simg python run_CNN_dynlayer.py ${SLURM_JOBID} ${layer} ${dc}"
    sleep 1 # pause 1s to be kind to the scheduler...
    echo "jobid: "+${SLURM_JOBID}
    echo " --- next --- "
  done    

The command-line output looks like this:

femonk@rarp1 [CNN] ./run_CNN_test.slurm
script started ... 
This script runs the CNN. Slurm scheduled it on node rarp1
I am interested of all environment variables Slurm adds:
SLURM_ACCOUNT=AI
PYTHONPATH=/cluster/slurm/lib64/python3.6/site-packages:/cluster/slurm/lib64/python3.6/site-packages:/cluster/slurm/lib64/python3.6/site-packages:
 --- run --- 
Submitted batch job 3182711
jobid: 
 --- next --- 
femonk@rarp1 [CNN] 

Does anyone know what is wrong with my code?
Many thanks.

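The Slurm environment variables are only available to the processes of the job itself, not to the process that submits the job. The job ID is returned by the sbatch command, so if you want to have it in a variable, you need to assign it: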

  do
    SLURM_JOBID=$(sbatch --parsable -N 1 -n 1 --mem=20G --mail-type=END --gres=gpu:V100:3 --wrap="singularity --noslurm tensorflow_19.03-py3.simg python run_CNN_dynlayer.py ${SLURM_JOBID} ${layer} ${dc}")
    sleep 1 # pause 1s to be kind to the scheduler...
    echo "jobid: "+${SLURM_JOBID}
    echo " --- next --- "
  done   

Note that command substitution, $(), is used together with the --parsable option of sbatch.
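
As a small illustration of what --parsable changes, the sketch below extracts the job ID from both output styles using only shell parameter expansion. The sample strings are hard-coded stand-ins for what sbatch prints, so nothing here talks to a real scheduler, and the cluster name clusterA is made up. With --parsable, sbatch prints only the job ID, followed by a semicolon and the cluster name when one is reported.

```shell
#!/bin/bash
# Hypothetical sample outputs (assumptions for illustration; no Slurm needed).
default_output="Submitted batch job 3182711"   # plain sbatch
parsable_output="3182711"                      # sbatch --parsable
multi_cluster_output="3182711;clusterA"        # --parsable with a cluster name

# Keep only the text after the last space.
id_from_default="${default_output##* }"

# Drop everything from the first ';' onwards (no-op if there is no ';').
id_from_parsable="${parsable_output%%;*}"
id_from_multi="${multi_cluster_output%%;*}"

echo "$id_from_default $id_from_parsable $id_from_multi"
```

Stripping the part after the semicolon is only a precaution. On a single-cluster setup the value captured by $() may already be the bare job ID.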


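Also note that the "Submitted batch job 3182711" line will disappear from the current output, because that output is now used to populate the SLURM_JOBID variable.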
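
One detail that may be worth checking in the loop above: ${SLURM_JOBID} sits inside a double-quoted --wrap string, so the submitting shell expands it before sbatch ever sees it. If the aim is for the Python script to receive its own job's ID, one possible approach is to escape the dollar sign so that the variable is expanded later, inside the job, where Slurm sets SLURM_JOB_ID. The sketch below only simulates this: a child shell stands in for the job, and the value 12345 is made up.

```shell
#!/bin/bash
# The submitting shell normally has no SLURM_JOB_ID set.
unset SLURM_JOB_ID

# Unescaped: expanded right here, so the command string ends up as "echo id=".
unescaped="echo id=${SLURM_JOB_ID}"

# Escaped: the literal text ${SLURM_JOB_ID} survives inside the command string.
escaped="echo id=\${SLURM_JOB_ID}"

# Stand-in for the job: a child shell in which the variable IS set
# (12345 is a hypothetical job ID).
in_job_unescaped=$(SLURM_JOB_ID=12345 sh -c "$unescaped")
in_job_escaped=$(SLURM_JOB_ID=12345 sh -c "$escaped")

echo "unescaped: $in_job_unescaped"
echo "escaped:   $in_job_escaped"
```

Applied to the submission line, this would mean writing \${SLURM_JOB_ID} inside the --wrap string. Whether the variable then reaches the Python process unchanged may also depend on how the container runtime handles the environment.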

That's it, thank you @damienfrancois! Note that the "+" in echo "jobid: "+${SLURM_JOBID} does not seem to be necessary.