Python CI/CD tests involving pyspark - JAVA_HOME is not set
I'm working on a project that uses pyspark and I'd like to set up automated testing. Here is what my .gitlab-ci.yml file looks like:
image: "myimage:latest"

stages:
  - Tests

pytest:
  stage: Tests
  script:
    - pytest tests/.
I build the Docker image myimage with a Dockerfile, as follows (see):
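The Dockerfile itself is only linked in the original question, not reproduced. A minimal sketch of what such an image might look like (the base image, file layout, and requirements.txt are assumptions, not the questioner's actual file):

```dockerfile
# hypothetical reconstruction of the image build
FROM python:3.7

WORKDIR /app

# install the Python dependencies (pyspark, pytest, ...)
COPY requirements.txt .
RUN pip install -r requirements.txt

# copy the project so the CI job can run `pytest tests/.`
COPY . .
```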
However, when I run this, the GitLab CI job fails with the following error:
/usr/local/lib/python3.7/site-packages/pyspark/java_gateway.py:95: in launch_gateway
raise Exception("Java gateway process exited before sending the driver its port number")
E Exception: Java gateway process exited before sending the driver its port number
------------------------------- Captured stderr --------------------------------
JAVA_HOME is not set
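The error comes from pyspark failing to start a Java process. A small sketch of a preflight check (the function name is mine, not part of pyspark) that roughly mirrors the lookup the gateway performs, so CI fails fast with a clearer message:

```python
import os
import shutil

def java_available():
    """Rough mirror of pyspark's java lookup: prefer JAVA_HOME/bin/java
    when JAVA_HOME is set, otherwise fall back to `java` on the PATH."""
    java_home = os.environ.get("JAVA_HOME")
    if java_home:
        # with JAVA_HOME set, the launcher path is built from it
        return os.path.isfile(os.path.join(java_home, "bin", "java"))
    return shutil.which("java") is not None

# fail fast in CI with a clearer message than the Py4J gateway error
if not java_available():
    print("No Java runtime found: set JAVA_HOME or install a JDK")
```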
I know pyspark requires Java 8 or higher to be installed on the machine. I have that set up locally, but what about during CI? How can I install Java so this works?
I tried adding
RUN sudo add-apt-repository ppa:webupd8team/java
RUN sudo apt-get update
RUN apt-get install oracle-java8-installer
to the Dockerfile that builds the image, but got the error:
/bin/sh: 1: sudo: not found
How can I modify the Dockerfile so that the tests run with pyspark?

In your .bash_profile, write:

export JAVA_HOME=(your JDK's home directory, e.g. /Library/Java/JavaVirtualMachines/[yourjdk]/Contents/Home)

The solution that worked for me: add
RUN apt-get update
RUN apt-get install default-jdk -y
before
RUN pip install -r requirements.txt
Then everything worked as expected, with no further modification needed.
EDIT
To make this work, I had to update the base image to python:3.7-stretch and remove sudo from the commands.
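Putting the accepted answer and the edit together, the final Dockerfile presumably looks something like this sketch (only the base image, the JDK install step, and the sudo removal are stated above; the rest of the layout is assumed):

```dockerfile
# base image updated to python:3.7-stretch, per the edit above
FROM python:3.7-stretch

WORKDIR /app

# install a JDK before the Python dependencies so pyspark can find Java;
# no sudo needed — Docker build steps already run as root
RUN apt-get update && \
    apt-get install -y default-jdk

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .
```

On Debian stretch, default-jdk pulls in OpenJDK 8, which satisfies pyspark's Java 8+ requirement; installing it puts java on the PATH, which is enough for the gateway to start even without JAVA_HOME being set explicitly.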