Kubernetes pod readiness probe fails with connection refused, but the pod serves requests fine
I'm having a hard time understanding why a pod's readiness probe is failing:
Warning Unhealthy 21m (x2 over 21m) kubelet, REDACTED Readiness probe failed: Get http://192.168.209.74:8081/actuator/health: dial tcp 192.168.209.74:8081: connect: connection refused
If I exec into this pod (or, in fact, into any other pod of the application), I can run a curl against that URL without any problem:
kubectl exec -it REDACTED-l2z5w /bin/bash
$ curl -v http://192.168.209.74:8081/actuator/health
* Expire in 0 ms for 6 (transfer 0x5611b949ff50)
* Trying 192.168.209.74...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x5611b949ff50)
* Connected to 192.168.209.74 (192.168.209.74) port 8081 (#0)
> GET /actuator/health HTTP/1.1
> Host: 192.168.209.74:8081
> User-Agent: curl/7.64.0
> Accept: */*
>
< HTTP/1.1 200
< Set-Cookie: CM_SESSIONID=E62390F0FF8C26D51C767835988AC690; Path=/; HttpOnly
< X-Content-Type-Options: nosniff
< X-XSS-Protection: 1; mode=block
< Cache-Control: no-cache, no-store, max-age=0, must-revalidate
< Pragma: no-cache
< Expires: 0
< X-Frame-Options: DENY
< Content-Type: application/vnd.spring-boot.actuator.v3+json
< Transfer-Encoding: chunked
< Date: Tue, 02 Jun 2020 15:07:21 GMT
<
* Connection #0 to host 192.168.209.74 left intact
{"status":"UP",...REDACTED..}
The Helm chart has the following probe configuration:
readinessProbe:
failureThreshold: 10
httpGet:
path: /actuator/health
port: 8081
scheme: HTTP
initialDelaySeconds: 20
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 3
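For reference, with these settings the kubelet sends the first probe at initialDelaySeconds and repeats it every periodSeconds, and a pod that is already Ready is only flipped to NotReady after failureThreshold consecutive failures. A quick sketch of the earliest point at which that flip could happen, using the chart values above:

```shell
# Chart values from the readinessProbe above; a Ready pod is marked
# NotReady only after failureThreshold consecutive probe failures.
initial_delay=20; period=5; failure_threshold=10
# first probe at t=20s; the tenth consecutive failure lands at 20 + 9*5
unready_at=$((initial_delay + (failure_threshold - 1) * period))
echo "earliest NotReady if every probe fails: ${unready_at}s"
```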
I can't completely rule out that HTTP proxy settings are the culprit, but the k8s docs say that HTTP_PROXY has been ignored for probe checks since v1.13, so that shouldn't be happening here.
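One way to rule the proxy in or out (a debugging sketch, assuming shell access to the node hosting the pod): the kubelet probes from the node, not from inside the pod, so run the same request from the node with any proxy explicitly bypassed and the probe's 3-second timeout applied:

```shell
# Run this from the node hosting the pod. --noproxy '*' bypasses any HTTP
# proxy, --max-time 3 mirrors timeoutSeconds: 3; the pod IP is taken from
# the events above. The trailing || echo keeps the failure visible without
# letting the non-zero exit status abort a calling script.
curl -v --noproxy '*' --max-time 3 http://192.168.209.74:8081/actuator/health \
  || echo "request failed -- this is what the kubelet sees"
```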
The OpenShift cluster runs Kubernetes 1.11; my local cluster runs 1.16.

Describing a resource always shows the last events recorded for it. The problem here is that the last recorded events date from when the readinessProbe checks were still failing.
I tested this in my lab with the following pod manifest:
apiVersion: v1
kind: Pod
metadata:
  name: readiness-exec
spec:
  containers:
  - name: readiness
    image: k8s.gcr.io/busybox
    args:
    - /bin/sh
    - -c
    - sleep 30; touch /tmp/healthy; sleep 600
    readinessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
As you can see, a file /tmp/healthy is created in the pod after 30 seconds, while the readinessProbe checks for the file's existence after 5 seconds and repeats the check every 5 seconds thereafter.
Describing this pod gives me:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m56s default-scheduler Successfully assigned default/readiness-exec to yaki-118-2
Normal Pulling 7m55s kubelet, yaki-118-2 Pulling image "k8s.gcr.io/busybox"
Normal Pulled 7m55s kubelet, yaki-118-2 Successfully pulled image "k8s.gcr.io/busybox"
Normal Created 7m55s kubelet, yaki-118-2 Created container readiness
Normal Started 7m55s kubelet, yaki-118-2 Started container readiness
Warning Unhealthy 7m25s (x6 over 7m50s) kubelet, yaki-118-2 Readiness probe failed: cat: can't open '/tmp/healthy': No such file or directory
The readinessProbe failed to find the file six times, which is exactly right: it is configured to check every 5 seconds, and the file is only created after 30 seconds.
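The probe schedule can be sketched out directly from the manifest values (a sketch; the probe at exactly t = 30s races with the touch, and in the events above it lost that race, which is where the sixth failure comes from):

```shell
# Probe times for the test pod: the first exec check runs at
# t = initialDelaySeconds, then every periodSeconds. `touch /tmp/healthy`
# only runs after `sleep 30`, so every probe before t = 30s must fail.
initial_delay=5; period=5; file_ready=30
failures=0
for n in 0 1 2 3 4 5 6; do
  t=$((initial_delay + n * period))
  if [ "$t" -lt "$file_ready" ]; then
    echo "t=${t}s: cat /tmp/healthy fails (file not created yet)"
    failures=$((failures + 1))
  fi
done
echo "guaranteed failures: $failures"
```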
What you consider a problem is actually expected behavior. Your events tell you that the readinessProbe last failed a check 21 minutes ago, which means your pod has been healthy for 21 minutes.

Can you hit that URL from the node? Does the kubelet log on the node say anything more? The kubelet uses the readiness probe to know when a container is ready to start accepting traffic, and the kubelet runs on the node, so check whether the URL is reachable from the node. There could be a problem with the network plugin.

Thanks, I just came to the same conclusion. I thought my pod was unhealthy and that this was keeping the service from routing requests, but it turned out my Traefik ingress controller was misconfigured and producing the 404.