How can I check the cause of an error in Spark on Kubernetes?


I ran the command below to submit a Spark job on Kubernetes:

./bin/spark-submit \
        --master k8s://https://192.168.0.91:6443 \
        --deploy-mode cluster \
        --name spark-steve-test \
        --class org.apache.spark.examples.Spark \
        --conf spark.executor.instances=2 \
        --conf spark.kubernetes.namespace=spark \
        --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
        --conf spark.kubernetes.container.image=sclee01/spark:v2.3.0 \
        local:///opt/spark/examples/jars/spark-examples_2.12-3.0.1.jar
However, I got the messages below, and for some reason the pod does not seem to have been created successfully:

20/10/22 12:00:36 INFO LoggingPodStatusWatcherImpl: Application status for spark-6a79e5b39a84403bb83dbf69ca20a02c (phase: Pending)
20/10/22 12:00:37 INFO LoggingPodStatusWatcherImpl: State changed, new state: 
         pod name: spark-steve-test-01-734603754e4038ed-driver
         namespace: spark
         labels: spark-app-selector -> spark-6a79e5b39a84403bb83dbf69ca20a02c, spark-role -> driver
         pod uid: 5afec1f7-b1cd-4bac-a6c2-be239e0efc30
         creation time: 2020-10-22T03:00:34Z
         service account name: spark
         volumes: spark-local-dir-1, spark-conf-volume, spark-token-bdwh9
         node name: bistelresearchdev-sm
         start time: 2020-10-22T03:00:34Z
         phase: Running
         container status: 
                 container name: spark-kubernetes-driver
                 container image: sclee01/spark:v2.3.0
                 container state: running
                 container started at: 2020-10-22T03:00:37Z
20/10/22 12:00:37 INFO LoggingPodStatusWatcherImpl: Application status for spark-6a79e5b39a84403bb83dbf69ca20a02c (phase: Running)
20/10/22 12:00:38 INFO LoggingPodStatusWatcherImpl: Application status for spark-6a79e5b39a84403bb83dbf69ca20a02c (phase: Running)
20/10/22 12:00:39 INFO LoggingPodStatusWatcherImpl: Application status for spark-6a79e5b39a84403bb83dbf69ca20a02c (phase: Running)
20/10/22 12:00:40 INFO LoggingPodStatusWatcherImpl: Application status for spark-6a79e5b39a84403bb83dbf69ca20a02c (phase: Running)
20/10/22 12:00:41 INFO LoggingPodStatusWatcherImpl: State changed, new state: 
         pod name: spark-steve-test-01-734603754e4038ed-driver
         namespace: spark
         labels: spark-app-selector -> spark-6a79e5b39a84403bb83dbf69ca20a02c, spark-role -> driver
         pod uid: 5afec1f7-b1cd-4bac-a6c2-be239e0efc30
         creation time: 2020-10-22T03:00:34Z
         service account name: spark
         volumes: spark-local-dir-1, spark-conf-volume, spark-token-bdwh9
         node name: bistelresearchdev-sm
         start time: 2020-10-22T03:00:34Z
         phase: Failed
         container status: 
                 container name: spark-kubernetes-driver
                 container image: sclee01/spark:v2.3.0
                 container state: terminated
                 container started at: 2020-10-22T03:00:37Z
                 container finished at: 2020-10-22T03:00:40Z
                 exit code: 1
                 termination reason: Error
20/10/22 12:00:41 INFO LoggingPodStatusWatcherImpl: Application status for spark-6a79e5b39a84403bb83dbf69ca20a02c (phase: Failed)
20/10/22 12:00:41 INFO LoggingPodStatusWatcherImpl: Container final statuses:


         container name: spark-kubernetes-driver
         container image: sclee01/spark:v2.3.0
         container state: terminated
         container started at: 2020-10-22T03:00:37Z
         container finished at: 2020-10-22T03:00:40Z
         exit code: 1
         termination reason: Error
20/10/22 12:00:41 INFO LoggingPodStatusWatcherImpl: Application spark-steve-test-01 with submission ID spark:spark-steve-test-01-734603754e4038ed-driver finished
20/10/22 12:00:41 INFO ShutdownHookManager: Shutdown hook called
20/10/22 12:00:41 INFO ShutdownHookManager: Deleting directory /tmp/spark-2e4e5f9a-c54d-4790-b4cb-f9b6cd1e2105
The only thing I can see is "Error", with no detailed reason. I ran the command below, but it didn't give me any further information:

bistel@BISTelResearchDev-NN:~/user/sclee/project/spark/spark-3.0.1-bin-hadoop2.7$  kubectl logs -p spark-steve-test-01-734603754e4038ed-driver
Error from server (NotFound): pods "spark-steve-test-01-734603754e4038ed-driver" not found
bistel@BISTelResearchDev-NN:~/user/sclee/project/spark/spark-3.0.1-bin-hadoop2.7$ 
Any help would be appreciated.

Thanks.

For more information, use kubectl describe pod. It will print a detailed description of the selected resource, including related resources such as events and controllers.
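For example, using the namespace and driver pod name from the log output in the question (a sketch; the pod must still exist for these commands to return anything):

```shell
# The driver pod lives in the "spark" namespace, so pass -n spark;
# without it, kubectl looks in "default" and reports NotFound.
kubectl describe pod spark-steve-test-01-734603754e4038ed-driver -n spark

# The Events section at the bottom of the describe output usually explains
# scheduling or image-pull problems. For an exit code 1 inside the container,
# the driver's own log is more useful:
kubectl logs spark-steve-test-01-734603754e4038ed-driver -n spark
```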

You can also use kubectl get event | grep pod/ to show only the events for the selected pod.