Kubernetes: CrashLoopBackOff error for 1 of 4 pods due to "Error syncing pod"


I am getting a CrashLoopBackOff error for 1 of my 4 pods. Please guide me on how to resolve this issue.

$ kubectl get pods -n cog-prod01 -o wide

slotmachine-1688723297-5vlht          1/1       Running            0          21h       100.96.6.15     ip-172-21-61-42.compute.internal
slotmachine-1688723297-6plr9          1/1       Running            0          16h       100.96.13.16    ip-172-21-54-247.compute.internal
slotmachine-1688723297-k995t          1/1       Running            0          16h       100.96.11.186   ip-172-21-60-180.compute.internal
slotmachine-1688723297-sk8bn          0/1       CrashLoopBackOff   8          19m       100.96.2.72     ip-172-21-56-148.compute.internal
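A quick way to isolate the failing pod from output like the above is to filter on the STATUS column. This is a small sketch; it assumes the default column order of `kubectl get pods -o wide`, where STATUS is the third column and RESTARTS the fourth:

```shell
# List pods whose STATUS is not "Running", with their restart count.
# Intended pipeline (namespace taken from the command above):
#   kubectl get pods -n cog-prod01 -o wide --no-headers | awk '$3 != "Running" { print $1, $3, "restarts=" $4 }'
# The filter itself, reading the listing on stdin:
awk '$3 != "Running" { print $1, $3, "restarts=" $4 }'
```

For the listing above this prints only `slotmachine-1688723297-sk8bn CrashLoopBackOff restarts=8`, which is the pod to investigate.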
Kubelet logs on the node:

admin@ip-172-21-56-148:~$ journalctl -u kubelet -f

    Jan 07 02:44:36 ip-172-21-56-148 kubelet[1568]: W0107 02:44:36.351880    1568 helpers.go:793] eviction manager: no observation found for eviction signal allocatableNodeFs.available
    Jan 07 02:44:46 ip-172-21-56-148 kubelet[1568]: W0107 02:44:46.372270    1568 helpers.go:793] eviction manager: no observation found for eviction signal allocatableNodeFs.available
    Jan 07 02:44:46 ip-172-21-56-148 kubelet[1568]: I0107 02:44:46.443776    1568 kuberuntime_manager.go:463] Container {Name:slotmachine Image:gt/slotmachine:develop.6590.b3a.2866 Command:[] Args:[] WorkingDir: Ports:[{Name:slotmachine HostPort:0 ContainerPort:9192 Protocol:TCP HostIP:}] EnvFrom:[{Prefix: ConfigMapRef:&ConfigMapEnvSource{LocalObjectReference:LocalObjectReference{Name:global,},Optional:nil,} SecretRef:nil}] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:200 scale:-3} d:{Dec:<nil>} s:200m Format:DecimalSI} memory:{i:{value:5 scale:9} d:{Dec:<nil>} s:5G Format:DecimalSI}]} VolumeMounts:[{Name:slotmachine-logs ReadOnly:false MountPath:/var/log/slotmachine SubPath:} {Name:default-token-9bxjf ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath:}] LivenessProbe:nil ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
    Jan 07 02:44:46 ip-172-21-56-148 kubelet[1568]: I0107 02:44:46.443851    1568 kuberuntime_manager.go:747] checking backoff for container "slotmachine" in pod "slotmachine-1688723297-sk8bn_cog-prod01(2bc8665e-30f5-11ea-a92d-024aeca0bafc)"
    Jan 07 02:44:46 ip-172-21-56-148 kubelet[1568]: I0107 02:44:46.592800    1568 kubelet.go:1917] SyncLoop (PLEG): "slotmachine-1688723297-sk8bn_cog-prod01(2bc8665e-30f5-11ea-a92d-024aeca0bafc)", event: &pleg.PodLifecycleEvent{ID:"2bc8665e-30f5-11ea-a92d-024aeca0bafc", Type:"ContainerStarted", Data:"5b2868d22c3e5453e57a58cba78cea4979a7da9a0864be2f29049d47d19fa41b"}
    Jan 07 02:44:56 ip-172-21-56-148 kubelet[1568]: W0107 02:44:56.409374    1568 helpers.go:793] eviction manager: no observation found for eviction signal allocatableNodeFs.available
    Jan 07 02:45:00 ip-172-21-56-148 kubelet[1568]: I0107 02:45:00.669027    1568 kubelet.go:1917] SyncLoop (PLEG): "slotmachine-1688723297-sk8bn_cog-prod01(2bc8665e-30f5-11ea-a92d-024aeca0bafc)", event: &pleg.PodLifecycleEvent{ID:"2bc8665e-30f5-11ea-a92d-024aeca0bafc", Type:"ContainerDied", Data:"5b2868d22c3e5453e57a58cba78cea4979a7da9a0864be2f29049d47d19fa41b"}
    Jan 07 02:45:00 ip-172-21-56-148 kubelet[1568]: I0107 02:45:00.971547    1568 kuberuntime_manager.go:463] Container {Name:slotmachine Image:gt/slotmachine:develop.6590.b3aa.2866 Command:[] Args:[] WorkingDir: Ports:[{Name:slotmachine HostPort:0 ContainerPort:9192 Protocol:TCP HostIP:}] EnvFrom:[{Prefix: ConfigMapRef:&ConfigMapEnvSource{LocalObjectReference:LocalObjectReference{Name:global,},Optional:nil,} SecretRef:nil}] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:200 scale:-3} d:{Dec:<nil>} s:200m Format:DecimalSI} memory:{i:{value:5 scale:9} d:{Dec:<nil>} s:5G Format:DecimalSI}]} VolumeMounts:[{Name:slotmachine-logs ReadOnly:false MountPath:/var/log/slotmachine SubPath:} {Name:default-token-9bxjf ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath:}] LivenessProbe:nil ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
    Jan 07 02:45:00 ip-172-21-56-148 kubelet[1568]: I0107 02:45:00.971640    1568 kuberuntime_manager.go:747] checking backoff for container "slotmachine" in pod "slotmachine-1688723297-sk8bn_cog-prod01(2bc8665e-30f5-11ea-a92d-024aeca0bafc)"
    Jan 07 02:45:00 ip-172-21-56-148 kubelet[1568]: I0107 02:45:00.971770    1568 kuberuntime_manager.go:757] Back-off 5m0s restarting failed container=slotmachine pod=slotmachine-1688723297-sk8bn_cog-prod01(2bc8665e-30f5-11ea-a92d-024aeca0bafc)
    Jan 07 02:45:00 ip-172-21-56-148 kubelet[1568]: E0107 02:45:00.971805    1568 pod_workers.go:182] Error syncing pod 2bc8665e-30f5-11ea-a92d-024aeca0bafc ("slotmachine-1688723297-sk8bn_cog-prod01(2bc8665e-30f5-11ea-a92d-024aeca0bafc)"), skipping: failed to "StartContainer" for "slotmachine" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=slotmachine pod=slotmachine-1688723297-sk8bn_cog-prod01(2bc8665e-30f5-11ea-a92d-024aeca0bafc)"
    Jan 07 02:45:06 ip-172-21-56-148 kubelet[1568]: W0107 02:45:06.447068    1568 helpers.go:793] eviction manager: no observation found for eviction signal allocatableNodeFs.available
    Jan 07 02:45:12 ip-172-21-56-148 kubelet[1568]: I0107 02:45:12.149685    1568 status_manager.go:418] Status for pod "2bc8665e-30f5-11ea-a92d-024aeca0bafc" is up-to-date; skipping
    Jan 07 02:45:12 ip-172-21-56-148 kubelet[1568]: I0107 02:45:12.443951    1568 kuberuntime_manager.go:463] Container {Name:slotmachine Image:gt/slotmachine:develop.6590.b35a.2866 Command:[] Args:[] WorkingDir: Ports:[{Name:slotmachine HostPort:0 ContainerPort:9192 Protocol:TCP HostIP:}] EnvFrom:[{Prefix: ConfigMapRef:&ConfigMapEnvSource{LocalObjectReference:LocalObjectReference{Name:global,},Optional:nil,} SecretRef:nil}] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:200 scale:-3} d:{Dec:<nil>} s:200m Format:DecimalSI} memory:{i:{value:5 scale:9} d:{Dec:<nil>} s:5G Format:DecimalSI}]} VolumeMounts:[{Name:slotmachine-logs ReadOnly:false MountPath:/var/log/slotmachine SubPath:} {Name:default-token-9bxjf ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath:}] LivenessProbe:nil ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
    Jan 07 02:45:12 ip-172-21-56-148 kubelet[1568]: I0107 02:45:12.444070    1568 kuberuntime_manager.go:747] checking backoff for container "slotmachine" in pod "slotmachine-1688723297-sk8bn_cog-prod01(2bc8665e-30f5-11ea-a92d-024aeca0bafc)"
    Jan 07 02:45:12 ip-172-21-56-148 kubelet[1568]: I0107 02:45:12.444198    1568 kuberuntime_manager.go:757] Back-off 5m0s restarting failed container=slotmachine pod=slotmachine-1688723297-sk8bn_cog-prod01(2bc8665e-30f5-11ea-a92d-024aeca0bafc)
    Jan 07 02:45:12 ip-172-21-56-148 kubelet[1568]: E0107 02:45:12.444238    1568 pod_workers.go:182] Error syncing pod 2bc8665e-30f5-11ea-a92d-024aeca0bafc ("slotmachine-1688723297-sk8bn_cog-prod01(2bc8665e-30f5-11ea-a92d-024aeca0bafc)"), skipping: failed to "StartContainer" for "slotmachine" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=slotmachine pod=slotmachine-1688723297-sk8bn_cog-prod01(2bc8665e-30f5-11ea-a92d-024aeca0bafc)"
    Jan 07 02:45:13 ip-172-21-56-148 kubelet[1568]: I0107 02:45:13.938976    1568 qos_container_manager_linux.go:286] [ContainerManager]: Updated QoS cgroup configuration
    Jan 07 02:45:16 ip-172-21-56-148 kubelet[1568]: W0107 02:45:16.464693    1568 helpers.go:793] eviction manager: no observation found for eviction signal allocatableNodeFs.available
Note: I checked disk space, CPU, and memory on the node running this pod, and everything looks fine. According to the pod logs it cannot connect to the config service, but the other 3 pods can connect to the same service, so I cannot figure out what is going wrong here.

admin@ip-172-21-43-86:~$ kubectl logs -n  cog-prod01 slotmachine-1688723297-sk8bn


03:01:02.104 [main] INFO  org.springframework.cloud.config.client.ConfigServicePropertySourceLocator - Fetching config from server at: http://configservice:8888
03:01:05.344 [main] WARN  org.springframework.cloud.config.client.ConfigServicePropertySourceLocator - Could not locate PropertySource: I/O error on GET request for "http://configservice:8888/slotmachine/cog,cog-prod01": No route to host (Host unreachable); nested exception is java.net.NoRouteToHostException: No route to host (Host unreachable)
03:01:05.381 [main] INFO  org.springframework.boot.context.embedded.AnnotationConfigEmbeddedWebApplicationContext - Refreshing org.springframework.boot.context.embedded.AnnotationConfigEmbeddedWebApplicationContext@77eca502: startup date [Tue Jan 07 03:01:05 UTC 2020]; parent: org.springframework.context.annotation.AnnotationConfigApplicationContext@4fb0f2b9
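The `No route to host` warning above is the actual failure: the container cannot reach `configservice:8888` from this particular node, so the application exits and the kubelet restarts it into backoff. Since the pods on the other three nodes can reach the same service, this points at networking on `ip-172-21-56-148` rather than at the application. A small sketch for surfacing connection failures from the crash logs (the grep pattern is an assumption covering common Spring/JVM connection errors):

```shell
# Surface connection failures from a crashing pod's log output.
# Intended pipeline:
#   kubectl logs -n cog-prod01 slotmachine-1688723297-sk8bn --previous | grep -E '...'
# The filter itself, reading log lines on stdin:
grep -E 'No route to host|Connection refused|UnknownHostException'
```

Comparing a `curl http://configservice:8888` from a pod on the bad node against one on a healthy node, and checking kube-proxy and iptables rules on `ip-172-21-56-148`, would be the natural next step.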

There is not enough available capacity on the node, so the scheduler cannot deploy the fourth pod. You can check this with

kubectl describe nodes

For detailed instructions, please see my answer on checking whether kube-proxy is working properly on your node.


Comments:

Here is the relevant output regarding whether there is enough capacity on this node:

    Allocated resources:
      (Total limits may be over 100 percent, i.e., overcommitted.)
      CPU Requests  CPU Limits  Memory Requests  Memory Limits
      ------------  ----------  ---------------  -------------
      300m (7%)     0 (0%)      5500M (32%)      0 (0%)

I meant memory, not space, above. In my case no logs are being generated in the kube-proxy log file on any node: `$ sudo tail -f /var/log/kube-proxy.log`

Could you add some explanation? — Well, @prashant, sorry, I am not sure what explanation I should add.
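The Allocated resources figures quoted in the comments can be checked mechanically. A small sketch, assuming the line format shown above from `kubectl describe nodes` (CPU requests, CPU limits, memory requests, memory limits, so the third percentage is memory requests; the 80% threshold is an arbitrary illustration):

```shell
# Flag memory-request pressure from an "Allocated resources" line, e.g.:
#   300m (7%)  0 (0%)  5500M (32%)  0 (0%)
# Splitting on runs of ( ) % puts the memory-request percentage in field 6.
awk -F'[()%]+' 'NF >= 6 { print ($6+0 > 80 ? "memory requests high:" : "memory requests ok:"), $6 "%" }'
```

For the values in the comment this reports `memory requests ok: 32%`, which agrees with the commenter's point that the node is not short of memory.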