Apache ZooKeeper: Unable to start a Flink HA cluster with ZooKeeper


I am trying to set up a Flink HA cluster (ZooKeeper mode), but the TaskManager cannot find the JobManager.

Here is the architecture:

- Machine 1 : Job Manager + Zookeeper
- Machine 2 : Task Manager
masters:

Machine1
slaves:

Machine2
flink-conf.yaml:

#jobmanager.rpc.address: localhost
jobmanager.rpc.port: 6123
blob.server.port: 50100-50200
taskmanager.data.port: 6121
high-availability: zookeeper
high-availability.zookeeper.quorum: Machine1:2181
high-availability.zookeeper.path.root: /flink-1.5.1
high-availability.cluster-id: /default_b
high-availability.storageDir: file:///shareflink/recovery
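
As far as I understand, with `high-availability: zookeeper` the TaskManager looks up the leading JobManager in ZooKeeper instead of reading `jobmanager.rpc.address`, so the commented-out line above should not be the problem by itself. The sketch below only shows which of these settings are active in a config like the one above. The helper names `parse_flink_conf` and `summarize_ha` are made up for illustration and are not part of Flink.

```python
def parse_flink_conf(text):
    """Parse the flat 'key: value' lines of a flink-conf.yaml into a dict."""
    conf = {}
    for raw in text.splitlines():
        # Drop comments; this also skips commented-out keys entirely.
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        conf[key.strip()] = value.strip()
    return conf


def summarize_ha(conf):
    """Collect the settings relevant to how the JobManager is discovered."""
    return {
        "ha_enabled": conf.get("high-availability") == "zookeeper",
        "quorum": conf.get("high-availability.zookeeper.quorum"),
        "static_jobmanager": conf.get("jobmanager.rpc.address"),
    }


# Excerpt of the configuration shown above.
SAMPLE = """\
#jobmanager.rpc.address: localhost
jobmanager.rpc.port: 6123
high-availability: zookeeper
high-availability.zookeeper.quorum: Machine1:2181
"""

summary = summarize_ha(parse_flink_conf(SAMPLE))
print(summary)
```

Here `static_jobmanager` comes back as `None` because the key is commented out, while the ZooKeeper quorum points at Machine1.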
Here is the TaskManager log; it tries to connect to localhost instead of Machine1:

2018-08-17 10:46:44,875 INFO  org.apache.flink.runtime.util.LeaderRetrievalUtils            - Trying to select the network interface and address to use by connecting to the leading JobManager.
2018-08-17 10:46:44,876 INFO  org.apache.flink.runtime.util.LeaderRetrievalUtils            - TaskManager will try to connect for 10000 milliseconds before falling back to heuristics
2018-08-17 10:46:44,966 INFO  org.apache.flink.runtime.net.ConnectionUtils                  - Retrieved new target address /127.0.0.1:37133.
2018-08-17 10:46:45,324 INFO  org.apache.flink.runtime.net.ConnectionUtils                  - Trying to connect to address /127.0.0.1:37133
2018-08-17 10:46:45,325 INFO  org.apache.flink.runtime.net.ConnectionUtils                  - Failed to connect from address 'Machine2/IP-Machine2': Connection refused
2018-08-17 10:46:45,325 INFO  org.apache.flink.runtime.net.ConnectionUtils                  - Failed to connect from address '/127.0.0.1': Connection refused
2018-08-17 10:46:45,325 INFO  org.apache.flink.runtime.net.ConnectionUtils                  - Failed to connect from address '/IP_Machine2': Connection refused
2018-08-17 10:46:45,325 INFO  org.apache.flink.runtime.net.ConnectionUtils                  - Failed to connect from address '/127.0.0.1': Connection refused
2018-08-17 10:46:45,326 INFO  org.apache.flink.runtime.net.ConnectionUtils                  - Failed to connect from address '/IP_Machine2': Connection refused
2018-08-17 10:46:45,326 INFO  org.apache.flink.runtime.net.ConnectionUtils                  - Failed to connect from address '/127.0.0.1': Connection refused
2018-08-17 10:46:45,726 INFO  org.apache.flink.runtime.net.ConnectionUtils                  - Trying to connect to address /127.0.0.1:37133
2018-08-17 10:46:45,727 INFO  org.apache.flink.runtime.net.ConnectionUtils                  - Failed to connect from address 'Machine2/IP-Machine2

2018-08-17 10:47:22,022 WARN  akka.remote.ReliableDeliverySupervisor                        - Association with remote system [akka.tcp://flink@127.0.0.1:36515] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@127.0.0.1:36515]] Caused by: [Connection refused: /127.0.0.1:36515]

2018-08-17 10:47:22,022 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@127.0.0.1:36515/user/resourcemanager, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@127.0.0.1:36515/user/resourcemanager..
2018-08-17 10:47:32,037 WARN  akka.remote.transport.netty.NettyTransport                    - Remote connection to [null] failed with java.net.ConnectException: Connection refused: /127.0.0.1:36515
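
To make the pattern in the log easier to see, here is a small sketch that pulls the target addresses out of log lines like the ones above and keeps only the loopback ones. The regular expression and the function name `loopback_targets` are my own assumptions based on the line formats shown here, not on any Flink tooling, and the sample lines are abbreviated from the log above.

```python
import ipaddress
import re

# Matches HOST:PORT in "akka.tcp://flink@HOST:PORT" and in "address /HOST:PORT".
ADDRESS_RE = re.compile(r"(?:flink@|address /)(\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def loopback_targets(log_text):
    """Return the sorted (ip, port) targets in the log that are loopback addresses."""
    found = set()
    for ip, port in ADDRESS_RE.findall(log_text):
        if ipaddress.ip_address(ip).is_loopback:
            found.add((ip, int(port)))
    return sorted(found)


# Abbreviated lines from the TaskManager log above.
SAMPLE_LOG = """\
ConnectionUtils - Retrieved new target address /127.0.0.1:37133.
ConnectionUtils - Trying to connect to address /127.0.0.1:37133
TaskExecutor - Could not resolve ResourceManager address akka.tcp://flink@127.0.0.1:36515/user/resourcemanager
"""

targets = loopback_targets(SAMPLE_LOG)
print(targets)
```

Every target the TaskManager retrieved is on 127.0.0.1, which suggests the address it obtained for the JobManager is itself a loopback address rather than Machine1's address.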
P.S.: /etc/hosts contains entries for localhost, Machine1, and Machine2.
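
One possible cause, which is only an assumption and not something confirmed here, is that on Machine1 the hosts file maps the name Machine1 to a loopback address such as 127.0.0.1 or 127.0.1.1, so the JobManager ends up advertising a loopback address. The sketch below checks a hosts file for that situation. The function name `loopback_names` is made up, and the sample file with its 10.0.0.x addresses is a hypothetical placeholder, not the real file from these machines.

```python
import ipaddress


def loopback_names(hosts_text):
    """Return the hostnames that a hosts file maps to a loopback address."""
    names = []
    for raw in hosts_text.splitlines():
        # Strip comments, then split the line into address and names.
        fields = raw.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first field is not an IP address
        if addr.is_loopback:
            names.extend(fields[1:])
    return names


# Hypothetical hosts file; the 10.0.0.x addresses are placeholders.
SAMPLE_HOSTS = """\
127.0.0.1   localhost
127.0.1.1   Machine1
10.0.0.11   Machine1
10.0.0.12   Machine2
"""

suspects = [n for n in loopback_names(SAMPLE_HOSTS) if n != "localhost"]
print(suspects)
```

To check the real file, the call would be `loopback_names(open("/etc/hosts").read())` on Machine1; any name other than localhost in the result is worth a closer look.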

Can you tell me how the TaskManager can connect to the JobManager?


This is what we have for the TaskManager. It works as a cluster without HA.

root@flink-taskmanager-deployment-nonprod-597f858cb-4nmbr:/opt/flink# cat conf/masters 
flink-jobmanager-nonprod.rpds.svc.cluster.local:8081

root@flink-taskmanager-deployment-nonprod-597f858cb-4nmbr:/opt/flink# cat conf/slaves 
localhost

root@flink-taskmanager-deployment-nonprod-597f858cb-4nmbr:/opt/flink# cat conf/flink-conf.yaml 

jobmanager.rpc.address: flink-jobmanager-nonprod.rpds.svc.cluster.local
...
...

If jobmanager.rpc.address is filled in, it will not be an HA cluster, right?