Apache Spark: worker errors when running Spark on Kubernetes
I am running Spark 2.4.1 on Kubernetes in client mode. I am trying to submit a job from a pod that contains Spark; the job should launch 2 executor pods. The command looks like this:
bin/spark-shell \
--master k8s://https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT \
--deploy-mode client \
--conf spark.executor.instances=2 \
--conf spark.kubernetes.container.image=$SPARK_IMAGE \
--conf spark.kubernetes.driver.pod.name=$HOSTNAME \
--conf spark.kubernetes.executor.podNamePrefix=spark-exec \
--conf spark.ui.port=4040
The executor pods are created, but they keep failing with the following error:
Caused by: java.io.IOException: Failed to connect to spark-57b8f99554-7nd45:4444
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:245)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:198)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: spark-57b8f99554-7nd45
at java.net.InetAddress.getAllByName0(InetAddress.java:1281)
at java.net.InetAddress.getAllByName(InetAddress.java:1193)
at java.net.InetAddress.getAllByName(InetAddress.java:1127)
at java.net.InetAddress.getByName(InetAddress.java:1077)
at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:146)
at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:143)
The worker pods cannot seem to reach the driver (the pod spark-57b8f99554-7nd45). This looks DNS-related, but I don't know how to fix it.
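A quick check from inside one of the executor pods is consistent with that (a sketch: <executor-pod> is whatever name kubectl get pods shows, and this assumes nslookup is present in the image). A bare pod name is not resolvable in cluster DNS, so this lookup fails, while a Service name would resolve:

kubectl exec -it <executor-pod> -n <namespace> -- nslookup spark-57b8f99554-7nd45

Any ideas?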
To run Spark in client mode from a Kubernetes pod, you need to follow these steps. First, create a headless service (clusterIP: "None") so the executors can resolve and reach the driver pod:
apiVersion: v1
kind: Service
metadata:
  name: yoursparkapp
spec:
  clusterIP: "None"
  selector:
    spark-app-selector: yoursparkapp
  ports:
    - name: driver-rpc-port
      protocol: TCP
      port: 7078
      targetPort: 7078
    - name: blockmanager
      protocol: TCP
      port: 7079
      targetPort: 7079
Pay attention to the selector spark-app-selector: yoursparkapp, because it must match the label on the pod from which spark-submit is run.

Install this service in the cluster with the following command: kubectl create -f yoursparkappservice.yml -n your_namespace

Then start the pod that will run spark-submit, with that label set:
kubectl run \
-n your_namespace -i --tty yoursparkapp \
--restart=Never \
--overrides='{ "apiVersion": "v1", "metadata": { "labels": { "spark-app-selector": "yoursparkapp" } } }' \
--image=your_container:latest -- /bin/bash
Note the label "spark-app-selector": "yoursparkapp" in the overrides: it is what ties this pod to the service created in the first step.

Inside that pod we can execute spark-submit:
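An optional sanity check before submitting (a sketch, using the names from the steps above): the pod should carry the label, and the headless service should list the pod as an endpoint.

kubectl get pod yoursparkapp -n your_namespace --show-labels
kubectl get endpoints yoursparkapp -n your_namespace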
spark-submit --master k8s://https://kubernetes_url:443 \
--deploy-mode client \
--name yoursparkapp \
--conf spark.kubernetes.container.image=your_container:latest \
--conf spark.kubernetes.pyspark.pythonVersion=3 \
--conf spark.kubernetes.namespace=your_namespace \
--conf spark.kubernetes.container.image.pullPolicy=Always \
--conf spark.driver.memory=2g \
--conf spark.executor.memory=2g \
--conf spark.submit.deployMode=client \
--conf spark.executor.cores=3 \
--conf spark.driver.cores=3 \
--conf spark.driver.host=yoursparkapp \
--conf spark.driver.port=7078 \
--conf spark.kubernetes.driver.pod.name=yoursparkapp \
/path/to/your/remote_spark_app.py
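The application path above is a placeholder. A minimal smoke-test app could stand in for remote_spark_app.py (hypothetical; any PySpark job that actually schedules tasks on the executors will do):

from pyspark.sql import SparkSession

# Master and deploy mode are supplied by the spark-submit flags above.
spark = SparkSession.builder.appName("yoursparkapp").getOrCreate()

# A trivial distributed job: if this prints 4950, the executors were able
# to connect back to the driver through the headless service.
print(spark.sparkContext.parallelize(range(100)).sum())

spark.stop()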
Another approach: to submit tasks from a Kubernetes pod, you should define the following parameters in spark-submit:
--conf spark.driver.bindAddress=0.0.0.0
--conf spark.driver.host=$MY_NODE_NAME
--conf spark.driver.port=3xxx1
--conf spark.driver.blockManager.port=3xxx2
You can define the ports and the node name in the Service and Deployment configuration, along the following lines (I am leaving out the common parts):
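The configuration fragment itself did not survive in this copy. A minimal sketch of what the Deployment side could look like, assuming the driver pod publishes the two ports via hostPort and reads the node name through the Downward API (MY_NODE_NAME matches the variable used above; 3xxx1/3xxx2 are the author's placeholders and must be replaced with real port numbers):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: yoursparkapp
spec:
  replicas: 1
  selector:
    matchLabels:
      spark-app-selector: yoursparkapp
  template:
    metadata:
      labels:
        spark-app-selector: yoursparkapp
    spec:
      containers:
        - name: yoursparkapp
          image: your_container:latest
          env:
            # Downward API: the node's name, consumed as spark.driver.host
            - name: MY_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          ports:
            # spark.driver.port (replace the placeholder with a real port)
            - containerPort: 3xxx1
              hostPort: 3xxx1
            # spark.driver.blockManager.port
            - containerPort: 3xxx2
              hostPort: 3xxx2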
Are you sure the first command is correct? According to the docs, shouldn't it be deploy-mode cluster?

He is trying to run a shell, so it is fine: "Cluster deploy mode is not applicable to Spark shells." @EgorStambakio Also, client mode on Kubernetes is possible since Spark 2.4.0.