How do I start a jar application from the Flink Docker image inside Kubernetes?


I am trying to use my felipeogutierrez/explore-flink:1.11.1-scala_2.12 image in the Kubernetes cluster configuration that the Flink documentation describes. I compile my project with Maven and extend the default Flink image flink:1.11.1-scala_2.12 with my Dockerfile.
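A sketch of that Dockerfile, reconstructed from the COPY command I discuss in the comments below (the builder stage is assumed to be the same as in the corrected version at the end of this post):

# stage 1: build the job jar with Maven
FROM maven:3.6-jdk-8-slim AS builder
COPY ./java/explore-flink /opt/explore-flink
WORKDIR /opt/explore-flink
RUN mvn clean install

# stage 2: extend the official Flink image and copy the jar into the user lib directory
FROM flink:1.11.1-scala_2.12
COPY --from=builder --chown=flink:flink /opt/explore-flink/target/explore-flink.jar /opt/flink/usrlib/explore-flink.jar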

Then the tutorial describes how to create the common cluster components:

kubectl create -f k8s/flink-configuration-configmap.yaml
kubectl create -f k8s/jobmanager-service.yaml
kubectl proxy
kubectl create -f k8s/jobmanager-rest-service.yaml
kubectl get svc flink-jobmanager-rest
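For reference, a minimal sketch of what the jobmanager-service.yaml contains in this kind of setup (assumed to follow the standard Flink standalone Kubernetes example; the port numbers match the flink-conf.yaml values that appear in the logs below):

apiVersion: v1
kind: Service
metadata:
  name: flink-jobmanager
spec:
  type: ClusterIP
  ports:
    - name: rpc
      port: 6123
    - name: blob-server
      port: 6124
    - name: webui
      port: 8081
  selector:
    app: flink
    component: jobmanager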
Then I create the jobmanager-job.yaml; the file itself and the kubectl create command are listed further down in this post.

I get a CrashLoopBackOff status on the flink-jobmanager pod, and the log shows a FlinkException (thrown from flink-dist_2.12-1.11.1.jar:1.11.1) saying that the class org.sense.flink.examples.stream.tpch.TPCHQuery03 cannot be found. However, I would expect Kubernetes to also look at the /opt/flink/usrlib/explore-flink.jar file: I copy that jar into the image in my Dockerfile, but it does not seem to work. What am I missing? My jobmanager-job.yaml file is shown at the end of this post.
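One sanity check I can run locally (assuming Docker is installed on my machine) is to list the user lib directory of the image, to see whether the jar really ended up inside it:

docker run --rm --entrypoint ls felipeogutierrez/explore-flink:1.11.1-scala_2.12 -la /opt/flink/usrlib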

And here is my complete log file:

$ kubectl logs flink-jobmanager-qfkjl
Starting Job Manager
sed: couldn't open temporary file /opt/flink/conf/sedSg30ro: Read-only file system
sed: couldn't open temporary file /opt/flink/conf/sed1YrBco: Read-only file system
/docker-entrypoint.sh: 72: /docker-entrypoint.sh: cannot create /opt/flink/conf/flink-conf.yaml: Permission denied
/docker-entrypoint.sh: 91: /docker-entrypoint.sh: cannot create /opt/flink/conf/flink-conf.yaml.tmp: Read-only file system
Starting standalonejob as a console application on host flink-jobmanager-qfkjl.
2020-09-21 08:08:29,528 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - --------------------------------------------------------------------------------
2020-09-21 08:08:29,531 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -  Preconfiguration: 
2020-09-21 08:08:29,532 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - 


JM_RESOURCE_PARAMS extraction logs:
jvm_params: -Xmx1073741824 -Xms1073741824 -XX:MaxMetaspaceSize=268435456
logs: INFO  [] - Loading configuration property: jobmanager.rpc.address, flink-jobmanager
INFO  [] - Loading configuration property: taskmanager.numberOfTaskSlots, 4
INFO  [] - Loading configuration property: blob.server.port, 6124
INFO  [] - Loading configuration property: jobmanager.rpc.port, 6123
INFO  [] - Loading configuration property: taskmanager.rpc.port, 6122
INFO  [] - Loading configuration property: queryable-state.proxy.ports, 6125
INFO  [] - Loading configuration property: jobmanager.memory.process.size, 1600m
INFO  [] - Loading configuration property: taskmanager.memory.process.size, 1728m
INFO  [] - Loading configuration property: parallelism.default, 2
INFO  [] - The derived from fraction jvm overhead memory (160.000mb (167772162 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead
INFO  [] - Final Master Memory configuration:
INFO  [] -   Total Process Memory: 1.563gb (1677721600 bytes)
INFO  [] -     Total Flink Memory: 1.125gb (1207959552 bytes)
INFO  [] -       JVM Heap:         1024.000mb (1073741824 bytes)
INFO  [] -       Off-heap:         128.000mb (134217728 bytes)
INFO  [] -     JVM Metaspace:      256.000mb (268435456 bytes)
INFO  [] -     JVM Overhead:       192.000mb (201326592 bytes)

2020-09-21 08:08:29,533 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - --------------------------------------------------------------------------------
2020-09-21 08:08:29,533 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -  Starting StandaloneApplicationClusterEntryPoint (Version: 1.11.1, Scala: 2.12, Rev:7eb514a, Date:2020-07-15T07:02:09+02:00)
2020-09-21 08:08:29,533 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -  OS current user: flink
2020-09-21 08:08:29,533 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -  Current Hadoop/Kerberos user: <no hadoop dependency found>
2020-09-21 08:08:29,534 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -  JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.265-b01
2020-09-21 08:08:29,534 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -  Maximum heap size: 989 MiBytes
2020-09-21 08:08:29,534 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -  JAVA_HOME: /usr/local/openjdk-8
2020-09-21 08:08:29,534 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -  No Hadoop Dependency available
2020-09-21 08:08:29,534 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -  JVM Options:
2020-09-21 08:08:29,534 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -     -Xmx1073741824
2020-09-21 08:08:29,534 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -     -Xms1073741824
2020-09-21 08:08:29,535 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -     -XX:MaxMetaspaceSize=268435456
2020-09-21 08:08:29,535 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -     -Dlog.file=/opt/flink/log/flink--standalonejob-0-flink-jobmanager-qfkjl.log
2020-09-21 08:08:29,535 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -     -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties
2020-09-21 08:08:29,535 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -     -Dlog4j.configurationFile=file:/opt/flink/conf/log4j-console.properties
2020-09-21 08:08:29,535 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -     -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml
2020-09-21 08:08:29,535 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -  Program Arguments:
2020-09-21 08:08:29,536 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -     --configDir
2020-09-21 08:08:29,536 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -     /opt/flink/conf
2020-09-21 08:08:29,536 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -     --job-classname
2020-09-21 08:08:29,536 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -     org.sense.flink.examples.stream.tpch.TPCHQuery03
2020-09-21 08:08:29,537 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] -  Classpath: /opt/flink/lib/flink-csv-1.11.1.jar:/opt/flink/lib/flink-json-1.11.1.jar:/opt/flink/lib/flink-shaded-zookeeper-3.4.14.jar:/opt/flink/lib/flink-table-blink_2.12-1.11.1.jar:/opt/flink/lib/flink-table_2.12-1.11.1.jar:/opt/flink/lib/log4j-1.2-api-2.12.1.jar:/opt/flink/lib/log4j-api-2.12.1.jar:/opt/flink/lib/log4j-core-2.12.1.jar:/opt/flink/lib/log4j-slf4j-impl-2.12.1.jar:/opt/flink/lib/flink-dist_2.12-1.11.1.jar:::
2020-09-21 08:08:29,538 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - --------------------------------------------------------------------------------
2020-09-21 08:08:29,540 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Registered UNIX signal handlers for [TERM, HUP, INT]
2020-09-21 08:08:29,577 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Could not create application program.
org.apache.flink.util.FlinkException: Could not find the provided job class (org.sense.flink.examples.stream.tpch.TPCHQuery03) in the user lib directory (/opt/flink/usrlib).
    at org.apache.flink.client.deployment.application.ClassPathPackagedProgramRetriever.getJobClassNameOrScanClassPath(ClassPathPackagedProgramRetriever.java:140) ~[flink-dist_2.12-1.11.1.jar:1.11.1]
    at org.apache.flink.client.deployment.application.ClassPathPackagedProgramRetriever.getPackagedProgram(ClassPathPackagedProgramRetriever.java:123) ~[flink-dist_2.12-1.11.1.jar:1.11.1]
    at org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.getPackagedProgram(StandaloneApplicationClusterEntryPoint.java:110) ~[flink-dist_2.12-1.11.1.jar:1.11.1]
    at org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.main(StandaloneApplicationClusterEntryPoint.java:78) [flink-dist_2.12-1.11.1.jar:1.11.1]

My configuration had two problems. First, the Dockerfile was not copying explore-flink.jar to the right location. Second, I did not need to mount the job-artifacts-volume volume in the Kubernetes file jobmanager-job.yaml at all. The corrected Dockerfile and the corrected jobmanager-job.yaml are both listed at the end of this post.


Comments:

"Have you tried not running the pod as the flink user, or adding --chown=flink:flink to your COPY command?"

"I tried COPY --from=builder --chown=flink:flink /opt/explore-flink/target/explore-flink.jar /opt/flink/usrlib/explore-flink.jar, both with and without the flink user, and I get the same error =("

"Oh, you are mounting a folder from the host into /opt/flink/usrlib/, which is probably why the jar you put into that folder cannot be found. Try /opt/flink/lib or /opt/flink/plugins in your Dockerfile instead, as described in the documentation. By the way, the last line of jobmanager-job.yaml still has the generic /host/path/to/job/artifacts; either point it at the real artifacts folder or remove the volume and volumeMount."

"Yes, that was it. It works like a charm. Thanks =)"
The original jobmanager-job.yaml, which I create with the command below; note the job-artifacts-volume mounted at /opt/flink/usrlib, which the comments above identify as the problem:

kubectl create -f k8s/jobmanager-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: flink-jobmanager
spec:
  template:
    metadata:
      labels:
        app: flink
        component: jobmanager
    spec:
      restartPolicy: OnFailure
      containers:
        - name: jobmanager
          image: felipeogutierrez/explore-flink:1.11.1-scala_2.12
          imagePullPolicy: Always
          env:
          args: ["standalone-job", "--job-classname", "org.sense.flink.examples.stream.tpch.TPCHQuery03"]
          ports:
            - containerPort: 6123
              name: rpc
            - containerPort: 6124
              name: blob-server
            - containerPort: 8081
              name: webui
          livenessProbe:
            tcpSocket:
              port: 6123
            initialDelaySeconds: 30
            periodSeconds: 60
          volumeMounts:
            - name: flink-config-volume
              mountPath: /opt/flink/conf
            - name: job-artifacts-volume
              mountPath: /opt/flink/usrlib
          securityContext:
            runAsUser: 9999  # refers to user _flink_ from official flink image, change if necessary
      volumes:
        - name: flink-config-volume
          configMap:
            name: flink-config
            items:
              - key: flink-conf.yaml
                path: flink-conf.yaml
              - key: log4j-console.properties
                path: log4j-console.properties
        - name: job-artifacts-volume
          hostPath:
            path: /host/path/to/job/artifacts

The corrected Dockerfile, which copies the job jar into /opt/flink/lib:

FROM maven:3.6-jdk-8-slim AS builder
# get explore-flink job and compile it
COPY ./java/explore-flink /opt/explore-flink
WORKDIR /opt/explore-flink
RUN mvn clean install

FROM flink:1.11.1-scala_2.12
WORKDIR /opt/flink/lib
COPY --from=builder --chown=flink:flink /opt/explore-flink/target/explore-flink.jar /opt/flink/lib/explore-flink.jar
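
A quick local check before redeploying (assuming the image is rebuilt and pushed under the same tag) is to list /opt/flink/lib inside the image; everything in that directory ends up on the JobManager classpath, which is why the job class can now be resolved:

docker build -t felipeogutierrez/explore-flink:1.11.1-scala_2.12 .
docker run --rm --entrypoint ls felipeogutierrez/explore-flink:1.11.1-scala_2.12 -la /opt/flink/lib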

And the corrected jobmanager-job.yaml, without the job-artifacts-volume and its volumeMount:

apiVersion: batch/v1
kind: Job
metadata:
  name: flink-jobmanager
spec:
  template:
    metadata:
      labels:
        app: flink
        component: jobmanager
    spec:
      restartPolicy: OnFailure
      containers:
        - name: jobmanager
          image: felipeogutierrez/explore-flink:1.11.1-scala_2.12
          imagePullPolicy: Always
          env:
          #command: ["ls"]
          args: ["standalone-job", "--job-classname", "org.sense.flink.App", "-app", "36"] #, <optional arguments>, <job arguments>] # optional arguments: ["--job-id", "<job id>", "--fromSavepoint", "/path/to/savepoint", "--allowNonRestoredState"]
          #args: ["standalone-job", "--job-classname", "org.sense.flink.examples.stream.tpch.TPCHQuery03"] #, <optional arguments>, <job arguments>] # optional arguments: ["--job-id", "<job id>", "--fromSavepoint", "/path/to/savepoint", "--allowNonRestoredState"]
          ports:
            - containerPort: 6123
              name: rpc
            - containerPort: 6124
              name: blob-server
            - containerPort: 8081
              name: webui
          livenessProbe:
            tcpSocket:
              port: 6123
            initialDelaySeconds: 30
            periodSeconds: 60
          volumeMounts:
            - name: flink-config-volume
              mountPath: /opt/flink/conf
          securityContext:
            runAsUser: 9999  # refers to user _flink_ from official flink image, change if necessary
      volumes:
        - name: flink-config-volume
          configMap:
            name: flink-config
            items:
              - key: flink-conf.yaml
                path: flink-conf.yaml
              - key: log4j-console.properties
                path: log4j-console.properties
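
With both changes in place, the job can be redeployed (assuming the same manifest paths as above):

kubectl delete job flink-jobmanager
kubectl create -f k8s/jobmanager-job.yaml
kubectl logs -f job/flink-jobmanager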