Kubernetes kube-scheduler liveness probe failed: Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
So I have this not-entirely-healthy cluster working in our data center. This is probably the tenth time I have rebuilt it following these instructions: I can apply some pods to this cluster and it appears to work, but eventually it starts slowing down and crashing, as shown below. Here is the scheduler manifest:
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    image: k8s.gcr.io/kube-scheduler:v1.14.2
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10251
        scheme: HTTP
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: kube-scheduler
    resources:
      requests:
        cpu: 100m
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler.conf
      name: kubeconfig
      readOnly: true
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler.conf
      type: FileOrCreate
    name: kubeconfig
status: {}
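For reference, the livenessProbe above makes the kubelet issue an HTTP GET to 127.0.0.1:10251/healthz from the host network namespace, and any 2xx/3xx response counts as healthy. Here is a minimal sketch of the same check you can run by hand on the node; the host, port, path, and timeout come from the manifest, and everything else is illustrative:

```python
import http.client


def probe(host="127.0.0.1", port=10251, path="/healthz", timeout=15):
    """Mimic the kubelet's httpGet liveness probe: a 2xx/3xx status counts
    as healthy; a refused connection or timeout counts as a failure."""
    try:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        conn.request("GET", path)
        status = conn.getresponse().status
        conn.close()
        return 200 <= status < 400
    except OSError:
        # This branch is what shows up in the events as
        # "dial tcp 127.0.0.1:10251: connect: connection refused"
        return False


if __name__ == "__main__":
    print("healthy" if probe() else "probe failed")
```

Running this directly on the master separates "the scheduler isn't listening" from "the kubelet can't reach it", which is useful when host-level networking or firewall rules are in play.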
$ kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
coredns-fb8b8dccf-42psn 1/1 Running 9 88m
coredns-fb8b8dccf-x9mlt 1/1 Running 11 88m
docker-registry-dqvzb 1/1 Running 1 2d6h
kube-apiserver-kube-apiserver-1 1/1 Running 44 2d8h
kube-apiserver-kube-apiserver-2 1/1 Running 34 2d7h
kube-controller-manager-kube-apiserver-1 1/1 Running 198 2d2h
kube-controller-manager-kube-apiserver-2 0/1 CrashLoopBackOff 170 2d7h
kube-flannel-ds-amd64-4mbfk 1/1 Running 1 2d7h
kube-flannel-ds-amd64-55hc7 1/1 Running 1 2d8h
kube-flannel-ds-amd64-fvwmf 1/1 Running 1 2d7h
kube-flannel-ds-amd64-ht5wm 1/1 Running 3 2d7h
kube-flannel-ds-amd64-rjt9l 1/1 Running 4 2d8h
kube-flannel-ds-amd64-wpmkj 1/1 Running 1 2d7h
kube-proxy-2n64d 1/1 Running 3 2d7h
kube-proxy-2pq2g 1/1 Running 1 2d7h
kube-proxy-5fbms 1/1 Running 2 2d8h
kube-proxy-g8gmn 1/1 Running 1 2d7h
kube-proxy-wrdrj 1/1 Running 1 2d8h
kube-proxy-wz6gv 1/1 Running 1 2d7h
kube-scheduler-kube-apiserver-1 0/1 CrashLoopBackOff 198 2d2h
kube-scheduler-kube-apiserver-2 1/1 Running 5 18m
nginx-ingress-controller-dz8fm 1/1 Running 3 2d4h
nginx-ingress-controller-sdsgg 1/1 Running 3 2d4h
nginx-ingress-controller-sfrgb 1/1 Running 1 2d4h
$ kubectl -n kube-system describe pod kube-scheduler-kube-apiserver-1
Containers:
kube-scheduler:
Container ID: docker://c04f3c9061cafef8749b2018cd66e6865d102f67c4d13bdd250d0b4656d5f220
Image: k8s.gcr.io/kube-scheduler:v1.14.2
Image ID: docker-pullable://k8s.gcr.io/kube-scheduler@sha256:052e0322b8a2b22819ab0385089f202555c4099493d1bd33205a34753494d2c2
Port: <none>
Host Port: <none>
Command:
kube-scheduler
--bind-address=127.0.0.1
--kubeconfig=/etc/kubernetes/scheduler.conf
--authentication-kubeconfig=/etc/kubernetes/scheduler.conf
--authorization-kubeconfig=/etc/kubernetes/scheduler.conf
--leader-elect=true
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 28 May 2019 23:16:50 -0400
Finished: Tue, 28 May 2019 23:19:56 -0400
Ready: False
Restart Count: 195
Requests:
cpu: 100m
Liveness: http-get http://127.0.0.1:10251/healthz delay=15s timeout=15s period=10s #success=1 #failure=8
Environment: <none>
Mounts:
/etc/kubernetes/scheduler.conf from kubeconfig (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kubeconfig:
Type: HostPath (bare host directory volume)
Path: /etc/kubernetes/scheduler.conf
HostPathType: FileOrCreate
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: :NoExecute
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Created 4h56m (x104 over 37h) kubelet, kube-apiserver-1 Created container kube-scheduler
Normal Started 4h56m (x104 over 37h) kubelet, kube-apiserver-1 Started container kube-scheduler
Warning Unhealthy 137m (x71 over 34h) kubelet, kube-apiserver-1 Liveness probe failed: Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
Normal Pulled 132m (x129 over 37h) kubelet, kube-apiserver-1 Container image "k8s.gcr.io/kube-scheduler:v1.14.2" already present on machine
Warning BackOff 128m (x1129 over 34h) kubelet, kube-apiserver-1 Back-off restarting failed container
Normal SandboxChanged 80m kubelet, kube-apiserver-1 Pod sandbox changed, it will be killed and re-created.
Warning Failed 76m kubelet, kube-apiserver-1 Error: context deadline exceeded
Normal Pulled 36m (x7 over 78m) kubelet, kube-apiserver-1 Container image "k8s.gcr.io/kube-scheduler:v1.14.2" already present on machine
Normal Started 36m (x6 over 74m) kubelet, kube-apiserver-1 Started container kube-scheduler
Normal Created 32m (x7 over 74m) kubelet, kube-apiserver-1 Created container kube-scheduler
Warning Unhealthy 20m (x9 over 40m) kubelet, kube-apiserver-1 Liveness probe failed: Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
Warning BackOff 2m56s (x85 over 69m) kubelet, kube-apiserver-1 Back-off restarting failed container
I feel like I am overlooking a simple option or configuration, but I can't find it, and after several days of working on this problem and reading documentation I am at a loss.
The load balancer is a TCP load balancer and appears to be working as expected, since I can query the cluster from my desktop.
Any advice or troubleshooting tips are welcome at this point.
Answer:

Thank you. The problem with our configuration was that a well-meaning technician decided to remove a rule from the firewall on the Kubernetes master, which ended up blocking the master from looping back to the ports it needed to probe. This caused all sorts of strange and misdiagnosed issues and certainly sent us in the wrong direction. After we allowed all ports on the server, Kubernetes resumed its normal behavior.

Comments:

What is the output of kubectl logs kube-scheduler-kube-apiserver-1 -n kube-system?

Since I can't post long output here, I created this gist:

Why are there multiple controller-manager and scheduler pods? Does kube-scheduler-kube-apiserver-2 have the same problem, or is it only apiserver-1?

I have an API cluster with only two members at the moment, but I will add a third member once this issue is resolved. We want to run Kubernetes in HA mode. It doesn't really matter, though, because this problem happens even when I have only one master. See:
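For this class of failure, one quick on-node check is to perform the same GET the kubelet does, because curl's error text distinguishes a missing listener ("connection refused") from a firewall DROP (a timeout). A sketch, assuming curl is available on the master; 10251 is the scheduler's healthz port from the manifest, and 10252 is the controller-manager's equivalent on this Kubernetes version:

```shell
# Run on the control-plane node.
check_healthz() {
  # Prints the healthz body on success, or curl's error message
  # ("Connection refused" vs. a timeout) on failure.
  curl -sS --max-time 5 "http://127.0.0.1:${1}/healthz" 2>&1 || true
}
check_healthz 10251
check_healthz 10252
```

If these succeed locally but the kubelet's probes still fail, the problem is likely in host-level packet filtering rather than in the scheduler itself.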