Problem with the Python environment and Slurm (sbatch)


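I ran into a problem when trying to run a batch job on a SLURM HPC cluster.

The .py file runs fine from my user directory on the HPC. However, when I submit it with sbatch to run on a GPU node, it throws:

slurmstepd: error: execve(): /var/spool/slurm/d/job401100/slurm_script: Not a directory

My Python versions: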

$ python -V
Python 2.7.5
$ python3 -V
Python 3.6.8

test.py runs under python3.

Sample of the Python code in test.py:

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import MultinomialNB

true = pd.read_csv("testfile.csv")
print('Just Testing. End for now.')

gpu.job:

#!/bin/bash/

#SBATCH --job-name=testgpu       # Job name
#SBATCH --output=job.%j.out      # Name of output file (%j expands to jobId)
#SBATCH --cpus-per-task=4        # Schedule four cores
#SBATCH --gres=gpu               # Schedule a GPU
#SBATCH --time=71:59:59          # Run time (hh:mm:ss) - just under 72 hours max
#SBATCH --partition=red          # Run on either the Red or Brown queue
#SBATCH --mail-type=END          # Send an email when the job finishes
#SBATCH --export=ALL             # All of the users environment will be loaded from callers environment


python3 test.py

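I understand that when Slurm executes the script, it does so in its own Slurm directory and environment, and that it knows nothing about my environment unless I tell it explicitly. However, including "--export=ALL" in the job file did not help.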
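One way to check what the job really sees is to have it print its own interpreter and environment. The following is only a minimal sketch: the helper name describe_environment is hypothetical, SLURM_JOB_ID and SLURM_SUBMIT_DIR are variables Slurm sets for batch jobs, and CUDA_VISIBLE_DEVICES is assumed to be set by this cluster's GPU configuration, which may not hold everywhere.

```python
import os
import sys

# Variables worth comparing between a login shell and a Slurm job.
# CUDA_VISIBLE_DEVICES is an assumption about how this cluster exposes GPUs.
DEFAULT_VARIABLES = (
    "PATH",
    "SLURM_JOB_ID",
    "SLURM_SUBMIT_DIR",
    "CUDA_VISIBLE_DEVICES",
)


def describe_environment(variables=DEFAULT_VARIABLES):
    """Collect interpreter and environment details visible to this process."""
    report = {
        "python_version": sys.version.split()[0],
        "executable": sys.executable,
        "cwd": os.getcwd(),
    }
    for name in variables:
        # Unset variables are reported explicitly instead of raising KeyError.
        report[name] = os.environ.get(name, "<unset>")
    return report


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running the same file once from the login shell and once from the job script, then comparing the two outputs, should show whether the environment differs at all. If they match, the environment is probably not the cause of the failure.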

How can I fix this?

Is "#!/bin/bash/" a typo?

Thanks for the hint. Yes, that was one problem. After changing it, I ran into a different situation. If I put "#!/usr/bin/python3" at the top of gpu.job (the path I got after checking "which python"), the output file complains about "invalid syntax" on line 13 (python3 test.py). In the other case, if I put "import sys print(sys.version) print(sys.path)" in the job file instead of "python3 test.py", it executes fine:

3.6.8 (default, Aug 7 2019, 17:28:10) [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]
['/var/spool/slurm/d/job40233', '/usr/lib64/python36.zip', '/usr/lib64/python3.6', '/usr/lib64/python3.6/lib-dynload', '/home/ieja/.local/lib/python3.6/site-packages', '/usr/local/lib64/python3.6/site-packages', '/usr/lib/python3.6/site-packages']
Hello, world ---> done~
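Both symptoms in this thread appear to come from the first line of gpu.job, not from the Python environment. The sketch below is an illustration under stated assumptions, not Slurm's own logic: the helper names read_shebang, diagnose_interpreter and is_valid_python are hypothetical. It shows that a trailing slash after an interpreter path makes the operating system report "Not a directory", and that the line "python3 test.py" is not valid Python, which is what happens when the job file itself is handed to python3 through its shebang.

```python
import os
import sys


def read_shebang(script_path):
    """Return the interpreter path named on the first line, or None."""
    with open(script_path, "r", encoding="utf-8") as handle:
        first_line = handle.readline().strip()
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    return parts[0] if parts else None


def diagnose_interpreter(interpreter):
    """Classify an interpreter path the way a failed exec would report it."""
    try:
        os.stat(interpreter)
    except NotADirectoryError:
        # "/bin/bash/" asks for a directory, but /bin/bash is a regular file.
        return "not a directory"
    except FileNotFoundError:
        return "missing"
    if os.path.isfile(interpreter) and os.access(interpreter, os.X_OK):
        return "ok"
    return "not executable"


def is_valid_python(source):
    """Report whether the text parses as Python source."""
    try:
        compile(source, "<job file>", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    # sys.executable stands in for /bin/bash so the demo runs anywhere.
    print(diagnose_interpreter(sys.executable))
    print(diagnose_interpreter(sys.executable + "/"))
    print(is_valid_python("python3 test.py"))
```

With that in mind, the likely fix is to keep gpu.job a shell script: start it with "#!/bin/bash" (no trailing slash) and leave "python3 test.py" as the command it runs. The #SBATCH lines are comments to both bash and Python, which is why a job file containing only Python statements ran cleanly under the python3 shebang.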