Kubernetes - checking my Kafka and Zookeeper functionality and connections
As part of my system, I need to run Kafka and Zookeeper clusters on Kubernetes. I deploy them with StatefulSets and use a headless service so that the Kafka brokers can talk to the Zookeeper servers.
The clusters seem to be running (according to kubectl get pods).
My problem is that I don't really understand how to check whether they can communicate properly.
What I've already tried -
I tried -
kafkacat -L -b kafka-statefulset-0.kafka headless.default.svc.cluster.local:9093
and got -
% ERROR: Failed to acquire metadata: Local: Broker transport failure
I tried -
kafkacat -b 172.17.0.10:9093 -t second_topic -P
and got -
% ERROR: Local: Host resolution failure: kafka-statefulset-0.kafka-headless.default.svc.cluster.local:9093/0: Failed to resolve 'kafka-statefulset-0.kafka-headless.default.svc.cluster.local:9093': Name or service not known
% ERROR: Local: Host resolution failure: kafka-statefulset-1.kafka-headless.default.svc.cluster.local:9093/1: Failed to resolve 'kafka-statefulset-1.kafka-headless.default.svc.cluster.local:9093': Name or service not known
% ERROR: Local: Host resolution failure: kafka-statefulset-3.kafka-headless.default.svc.cluster.local:9093/3: Failed to resolve 'kafka-statefulset-3.kafka-headless.default.svc.cluster.local:9093': Name or service not known
% ERROR: Local: Host resolution failure: kafka-statefulset-4.kafka-headless.default.svc.cluster.local:9093/4: Failed to resolve 'kafka-statefulset-4.kafka-headless.default.svc.cluster.local:9093': Name or service not known
% ERROR: Local: Host resolution failure: kafka-statefulset-2.kafka-headless.default.svc.cluster.local:9093/2: Failed to resolve 'kafka-statefulset-2.kafka-headless.default.svc.cluster.local:9093': Name or service not known
But when I ran -
kafkacat -L -b 172.17.0.10:9093
I got -
Metadata for all topics (from broker -1: 172.17.0.10:9093/bootstrap):
5 brokers:
broker 2 at kafka-statefulset-2.kafka-headless.default.svc.cluster.local:9093
broker 4 at kafka-statefulset-4.kafka-headless.default.svc.cluster.local:9093
broker 1 at kafka-statefulset-1.kafka-headless.default.svc.cluster.local:9093
broker 3 at kafka-statefulset-3.kafka-headless.default.svc.cluster.local:9093
broker 0 at kafka-statefulset-0.kafka-headless.default.svc.cluster.local:9093
4 topics:
topic "second_topic" with 1 partitions:
partition 0, leader 4, replicas: 4, isrs: 4
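As far as I understand so far, I have not configured the services correctly for connections from outside the cluster, but I can connect to them from inside the cluster.
Even though I can communicate with them from inside the cluster, I keep getting errors I don't understand.
For example (installing kafkacat in a pod's container and trying to talk to the other Kafka brokers) -
I get the right metadata, but with errors at the end -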
Metadata for all topics (from broker -1: kafka-statefulset-1.kafka-headless.default.svc.cluster.local:9093/bootstrap):
5 brokers:
broker 2 at kafka-statefulset-2.kafka-headless.default.svc.cluster.local:9093
broker 4 at kafka-statefulset-4.kafka-headless.default.svc.cluster.local:9093
broker 1 at kafka-statefulset-1.kafka-headless.default.svc.cluster.local:9093
broker 3 at kafka-statefulset-3.kafka-headless.default.svc.cluster.local:9093
broker 0 at kafka-statefulset-0.kafka-headless.default.svc.cluster.local:9093
5 topics:
topic "TOPIC" with 1 partitions:
partition 0, leader 2, replicas: 2, isrs: 2
topic "second_topic" with 1 partitions:
partition 0, leader 4, replicas: 4, isrs: 4
topic "first_topic" with 1 partitions:
partition 0, leader 2, replicas: 2, isrs: 2
topic "nir_topic" with 1 partitions:
partition 0, leader 0, replicas: 0, isrs: 0
topic "first" with 1 partitions:
partition 0, leader 3, replicas: 3, isrs: 3
%3|1581685918.022|FAIL|rdkafka#producer-0| kafka-statefulset-0.kafka-headless.default.svc.cluster.local:9093/0: Failed to connect to broker at kafka-statefulset-0.kafka-headless.default.svc.cluster.local:: Interrupted system call
%3|1581685918.022|ERROR|rdkafka#producer-0| kafka-statefulset-0.kafka-headless.default.svc.cluster.local:9093/0: Failed to connect to broker at kafka-statefulset-0.kafka-headless.default.svc.cluster.local:: Interrupted system call
%3|1581685918.022|FAIL|rdkafka#producer-0| kafka-statefulset-2.kafka-headless.default.svc.cluster.local:9093/2: Failed to connect to broker at kafka-statefulset-2.kafka-headless.default.svc.cluster.local:: Interrupted system call
%3|1581685918.022|ERROR|rdkafka#producer-0| kafka-statefulset-2.kafka-headless.default.svc.cluster.local:9093/2: Failed to connect to broker at kafka-statefulset-2.kafka-headless.default.svc.cluster.local:: Interrupted system call
%3|1581685918.023|FAIL|rdkafka#producer-0| kafka-statefulset-4.kafka-headless.default.svc.cluster.local:9093/4: Failed to connect to broker at kafka-statefulset-4.kafka-headless.default.svc.cluster.local:: Interrupted system call
%3|1581685918.023|ERROR|rdkafka#producer-0| kafka-statefulset-4.kafka-headless.default.svc.cluster.local:9093/4: Failed to connect to broker at kafka-statefulset-4.kafka-headless.default.svc.cluster.local:: Interrupted system call
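What could be the problem?
How can I check the communication between them, and that both Kafka and Zookeeper are functioning correctly?
Some more information about the system (I can copy all the YAML configuration if needed, but it is quite long) -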
NAME                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
kafka-headless       ClusterIP   None            <none>        9093/TCP            5d10h
kubernetes           ClusterIP   10.96.0.1       <none>        443/TCP             5d11h
zookeeper-cs         ClusterIP   10.106.99.170   <none>        2181/TCP            5d11h
zookeeper-headless   ClusterIP   None            <none>        2888/TCP,3888/TCP   5d11h
Zookeeper -
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zookeeper-statefulset
spec:
  selector:
    matchLabels:
      app: zookeeper-app
  serviceName: zookeeper-headless
  replicas: 5
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: OrderedReady
  template:
    metadata:
      labels:
        app: zookeeper-app
    spec:
      containers:
      - name: kubernetes-zookeeper
        imagePullPolicy: Always
        image: "k8s.gcr.io/kubernetes-zookeeper:1.0-3.4.10"
        resources:
          requests:
            memory: "1Gi"
            cpu: "0.5"
        ports:
        # Expose by zookeeper-cs to the client
        - containerPort: 2181
          name: client
        # Expose by zookeeper-headless to the other replicas of the set
        - containerPort: 2888
          name: server
        - containerPort: 3888
          name: leader-election
        command:
        - sh
        - -c
        - "start-zookeeper \
          --servers=3 \
          --data_dir=/var/lib/zookeeper/data \
          --data_log_dir=/var/lib/zookeeper/data/log \
          --conf_dir=/opt/zookeeper/conf \
          --client_port=2181 \
          --election_port=3888 \
          --server_port=2888 \
          --tick_time=2000 \
          --init_limit=10 \
          --sync_limit=5 \
          --heap=512M \
          --max_client_cnxns=60 \
          --snap_retain_count=3 \
          --purge_interval=12 \
          --max_session_timeout=40000 \
          --min_session_timeout=4000 \
          --log_level=INFO"
        readinessProbe:
          exec:
            command: ["sh", "-c", "zookeeper-ready 2181"]
          initialDelaySeconds: 60
          timeoutSeconds: 10
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 60
          timeoutSeconds: 10
        volumeMounts:
        - name: zookeeper-volume
          mountPath: /var/lib/zookeeper
      securityContext:
        fsGroup: 1000
  volumeClaimTemplates:
  - metadata:
      name: zookeeper-volume
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
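kafkacat -L -b kafka-statefulset-0.kafka headless.default.svc.cluster.local:9093
That doesn't work. There is a space in the argument after the first -b flag: "kafka headless" should be "kafka-headless".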
kafkacat -b 172.17.0.10:9093 -t second_topic -P
Without correct advertised.listeners that return resolvable broker addresses, this won't work.
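One way to see this from the client side is to take the broker list that kafkacat -L prints and check whether each advertised host resolves on the machine running the client. The following is only a sketch: the function names advertised_brokers and unresolvable are made up for illustration, and the parser assumes the "broker N at host:port" line format shown in the question's output.

```python
import re
import socket

# Matches lines such as "broker 2 at some-host.example:9093" from `kafkacat -L`.
BROKER_LINE = re.compile(r"broker\s+(\d+)\s+at\s+(\S+):(\d+)")


def advertised_brokers(metadata_text):
    """Return {broker_id: (host, port)} parsed from `kafkacat -L` output."""
    brokers = {}
    for match in BROKER_LINE.finditer(metadata_text):
        broker_id, host, port = match.groups()
        brokers[int(broker_id)] = (host, int(port))
    return brokers


def unresolvable(brokers, resolve=socket.getaddrinfo):
    """Return the sorted ids of brokers whose advertised host does not resolve here.

    `resolve` is injectable so the check can be exercised without real DNS.
    """
    failed = []
    for broker_id, (host, port) in sorted(brokers.items()):
        try:
            resolve(host, port)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(broker_id)
    return failed
```

If unresolvable(...) returns every broker id when run outside the cluster and an empty list when run inside a pod, the brokers are advertising cluster-internal names that the external client cannot look up.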
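kafkacat -L -b 172.17.0.10:9093
This is a step in the right direction, but you're using the Docker network IP rather than the cluster IP or NodePort of any service, so it still isn't portable to machines outside the cluster.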
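installing Kafkacat in the pod's container and trying to talk with the other Kafka brokers
That's a good test to make sure that replication will at least work, but it doesn't solve the external client issue.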
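I get the right metadata, but with errors at the end
That could be a sign that the health check on your brokers or Zookeeper is failing and the pods are restarting. Notice that the errors cycle through brokers 0, 2, 4.
I have also built a headless service to talk to each Kafka broker through port 9093
Okay, great, but you now need to set advertised.listeners on each broker so that it accepts connections on 9093.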
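As a rough illustration of what a per-broker value could look like, the sketch below derives one advertised.listeners entry per StatefulSet ordinal from the stable pod DNS name. The StatefulSet name, service name, and port are taken from the question's output; the PLAINTEXT listener name, the default cluster domain cluster.local, and the helper names pod_fqdn and advertised_listeners are assumptions, and how the value gets injected depends on the Kafka image in use.

```python
def pod_fqdn(statefulset, ordinal, service, namespace="default",
             cluster_domain="cluster.local"):
    """Stable DNS name of one StatefulSet pod behind a headless service."""
    return f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"


def advertised_listeners(statefulset, service, replicas, port=9093,
                         listener="PLAINTEXT", namespace="default"):
    """Return {ordinal: advertised.listeners value} for every broker pod.

    The listener name is an assumption; use whatever name the broker's
    `listeners` setting actually defines.
    """
    return {
        ordinal: f"{listener}://{pod_fqdn(statefulset, ordinal, service, namespace)}:{port}"
        for ordinal in range(replicas)
    }


if __name__ == "__main__":
    for ordinal, value in advertised_listeners(
            "kafka-statefulset", "kafka-headless", replicas=5).items():
        print(f"broker {ordinal}: advertised.listeners={value}")
```

These names only resolve inside the cluster, so they address broker-to-broker and in-cluster clients; external clients still need a separately advertised, externally reachable address.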
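All in all, I think simply using an existing Kafka Helm chart that allows external connectivity would be your best option.
As you mentioned in the comments: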
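I ran - / # nslookup headless.default.svc.cluster.local
and got - Server: 10.96.0.10 Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local nslookup: can't resolve 'headless.default.svc.cluster.local'
The problem is related to DNS in your environment, which cannot resolve that name.
The DNS component should provide the Pods' DNS names.
You should receive something like:
/ # nslookup my-kafka-headless
Server: 10.122.0.10
Address 1: 10.122.0.10 kube-dns.kube-system.svc.cluster.local
Name: my-kafka-headless
Address 1: 10.56.0.5 my-kafka-0.my-kafka-headless.default.svc.cluster.local
This is a prerequisite if you want to refer to the Pods backing a StatefulSet by DNS name (in your example, headless.default.svc.cluster.local).
Verify that your services have .spec.clusterIP: None set and that the kube-dns-XXXX pod in the kube-system namespace is healthy. Information about troubleshooting DNS issues is available in the Kubernetes documentation.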
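The clusterIP check can be scripted against the JSON that kubectl get svc -o json prints. This is a minimal sketch; the function name headless_services is made up for illustration, and it assumes the usual List document with an items array in which a headless service carries the literal string "None" in spec.clusterIP.

```python
import json


def headless_services(service_list_json):
    """Return the sorted names of services whose spec.clusterIP is "None".

    Expects the text produced by `kubectl get svc -o json`.
    """
    document = json.loads(service_list_json)
    return sorted(
        item["metadata"]["name"]
        for item in document.get("items", [])
        if item.get("spec", {}).get("clusterIP") == "None"
    )
```

For the services listed in the question, the result should contain kafka-headless and zookeeper-headless but not zookeeper-cs. If a service that is meant to be headless is missing from the result, per-pod DNS records will not be created for it.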
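As @cricket_007 suggested, you can use Helm to deploy Kafka, for example with a Helm chart that also includes how-to instructions.
I had a similar problem and fixed it by adding the following entries to the hosts file of the client OS. (On a Mac it is /private/etc/hosts, which requires root access to edit; afterwards you should run sudo dscacheutil -flushcache.)
127.0.0.1 my-cluster-kafka-0.my-cluster-kafka-brokers.kafka.svc
localhost my-cluster-kafka-0.my-cluster-kafka-brokers.kafka.svc
Kafkacat already tells you what the address to resolve looks like:
kafka-statefulset-0.kafka-headless.default.svc.cluster.local
You can also map it to any IP that serves the domain (if traffic for the domain is routed there).
How did you set this up? Did you use an existing Helm chart or do everything manually? Are you running kafkacat inside the cluster? Can you share your manifests? Also, what is the output of kubectl run -i --tty --image busybox:1.28 dns-test --restart=Never --rm?
If I run this command I just get a command prompt (it looks like bash):
@nirkov Sorry, my bad, it cut off the last command. After you run this command and get the / # prompt, type nslookup headless.default.svc.cluster.local. What is the output? By manifests I mean the YAMLs, your configuration.
I ran - / # nslookup headless.default.svc.cluster.local and got - Server: 10.96.0.10 Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local nslookup: can't resolve 'headless.default.svc.cluster.local'
Thanks a lot for your answer. I read about advertised.listeners to check your suggestion, but I don't know how to override it (I can't find the source of this Docker image). Would you ...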