Kubernetes kube-system containers keep crashing


I initialized a new cluster on the master node with kubeadm init --pod-network-cidr=10.1.0.0/16 and then installed Calico. Everything was fine:

sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
[sudo] password for sysadm:
NAMESPACE     NAME                                            READY   STATUS    RESTARTS   AGE     IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2                               2/2     Running   0          4m9s    192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2                         1/1     Running   0          4m9s    10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5                         1/1     Running   0          4m9s    10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   etcd-localhost.localdomain                      1/1     Running   0          3m4s    192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   kube-apiserver-localhost.localdomain            1/1     Running   0          3m18s   192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   kube-controller-manager-localhost.localdomain   1/1     Running   0          3m23s   192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb                                1/1     Running   0          4m9s    192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   kube-scheduler-localhost.localdomain            1/1     Running   0          3m11s   192.168.0.249   localhost.localdomain   <none>           <none>
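Do you know what is going on? How can I troubleshoot this? I tried kubectl describe pod, but the pods keep crashing, and when I do manage to get some information, I see nothing that points me to where to investigate next.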

Sorry for the vague details. If you can tell me where else to look, I can post more details or at least know what to investigate next.


Thank you for your time :)

The problem is the hostname. Check the NODE column: it shows the hostname as localhost.localdomain.


Change the hostname to k8s-master or master and it should work. Each node should also have a unique hostname, such as node1, node2, node3, and so on.
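For reference, the fix described above could look roughly like this on the master. It is only a sketch under a few assumptions: the helper names is_placeholder_hostname and rename_and_reinit are made up for illustration, k8s-master is just the example name from this answer, and the host is assumed to run systemd so that hostnamectl is available. Be aware that kubeadm reset discards the existing cluster state on the node.

```shell
# Return 0 (true) when the given name is a generic placeholder hostname
# such as "localhost" or "localhost.localdomain".
is_placeholder_hostname() {
  case "$1" in
    localhost|localhost.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: give the node a real name, then rebuild the control
# plane so that the node registers under that name.
rename_and_reinit() {
  sudo hostnamectl set-hostname "$1"
  # Wipes the cluster state that kubeadm created on this node.
  sudo kubeadm reset
  # Same pod network CIDR as in the question.
  sudo kubeadm init --pod-network-cidr=10.1.0.0/16
}

# Only report; nothing is changed until rename_and_reinit is called by hand.
if is_placeholder_hostname "$(uname -n)"; then
  echo "hostname is a placeholder; consider: rename_and_reinit k8s-master"
fi
```

After the re-init, Calico has to be installed again, and the NODE column of kubectl get pods --all-namespaces -o wide should then show the new name instead of localhost.localdomain.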

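The names of the control-plane pods are very suspicious, such as etcd-localhost.localdomain in kube-system, because that host's name is not actually localhost.localdomain; but without the logs from those crashes, no one has a prayer of helping you.

@MatthewLDaniel Thank you for pointing these things out. How can I get useful logs? I am trying to work out the next step in solving this, but I am not familiar with the Kubernetes tooling.

But honestly, if you are that new to Kubernetes and Docker, I would not try to rescue this cluster, because a node that thinks its name is localhost.localdomain is very sick. Start over, and use EKS, GKE, Rancher, or some other tool to create the cluster.

@MatthewLDaniel I have installed Kubernetes clusters on ECSs and vSphere before without running into this problem. The new requirement is to use ProxMox. I have blown these nodes away many times, but the same problem comes back. Apart from kubectl get and kubectl describe, I do not know what else I can use to get logs when the pods are destroyed and recreated so quickly. Your hint about etcd-localhost.localdomain is a very useful clue, and I will look into it more deeply. If etcd-localhost.localdomain is suspicious, what should it normally look like?

@MatthewLDaniel Is etcd-localhost.localdomain something that is picked up from the network or from the network's DNS server? Sorry, I do not know the rest of the environment, so I am trying to come up with a decent enough question to ask the network people.

Thanks, I will try that first thing in the morning and report back.

Thank you for pointing out the hostname problem. That solved the issue.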
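On the question raised in the comments, how to get logs when pods are destroyed and recreated so quickly, here is a minimal sketch of what one could try beyond kubectl get and kubectl describe. The function names crashing_pods and collect_crash_logs are hypothetical, and the sketch assumes a kubeadm-style node where the kubelet runs as the systemd unit kubelet. It relies on kubectl logs --previous, which returns the output of the last terminated container of a pod, on the namespace events, and on the kubelet journal.

```shell
# Read `kubectl get pods --all-namespaces` output on stdin and print
# "NAMESPACE NAME" for every pod whose STATUS is CrashLoopBackOff or Error.
crashing_pods() {
  awk 'NR > 1 && ($4 == "CrashLoopBackOff" || $4 == "Error") { print $1, $2 }'
}

# Hypothetical helper: gather what is usually still available while pods flap.
collect_crash_logs() {
  sudo kubectl get pods --all-namespaces | crashing_pods |
  while read -r ns pod; do
    echo "== previous container logs: $ns/$pod"
    # --previous shows the logs of the container instance that just crashed.
    sudo kubectl logs --namespace "$ns" "$pod" --previous </dev/null
  done
  # Recent events in kube-system, oldest first.
  sudo kubectl get events --namespace kube-system --sort-by=.metadata.creationTimestamp
  # The kubelet journal often explains why static control-plane pods restart.
  sudo journalctl -u kubelet --no-pager --since "15 minutes ago"
}
```

Running collect_crash_logs on the master while the CoreDNS pods are in CrashLoopBackOff should give enough output to post alongside the question.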
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                      READY   STATUS             RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES   
kube-system   calico-node-ntzn2         2/2     Running            0          10m   192.168.0.182   localhost.localdomain   <none>           <none>            
kube-system   coredns-fb8b8dccf-hqmn2   0/1     CrashLoopBackOff   2          10m   10.1.0.2        localhost.localdomain   <none>           <none>            
kube-system   coredns-fb8b8dccf-nfgr5   0/1     CrashLoopBackOff   1          10m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb          1/1     Running            0          10m   192.168.0.166   localhost.localdomain   <none>           <none>
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                            READY   STATUS             RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2                               2/2     Running            0          11m   192.168.0.166   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2                         0/1     CrashLoopBackOff   2          11m   10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5                         0/1     CrashLoopBackOff   2          11m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   etcd-localhost.localdomain                      0/1     Pending            0          1s    <none>          localhost.localdomain   <none>           <none>
kube-system   kube-apiserver-localhost.localdomain            0/1     Pending            0          1s    <none>          localhost.localdomain   <none>           <none>
kube-system   kube-controller-manager-localhost.localdomain   0/1     Pending            0          1s    <none>          localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb                                1/1     Running            0          11m   192.168.0.249   localhost.localdomain   <none>           <none>
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2                      2/2     Running   0          11m   192.168.0.182   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2                0/1     Running   3          11m   10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5                0/1     Running   2          11m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb                       1/1     Running   0          11m   192.168.0.166   localhost.localdomain   <none>           <none>
kube-system   kube-scheduler-localhost.localdomain   0/1     Pending   0          0s    <none>          localhost.localdomain   <none>           <none>
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                      READY   STATUS    RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2         2/2     Running   0          11m   192.168.0.182   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2   1/1     Running   0          11m   10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5   1/1     Running   0          11m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb          1/1     Running   0          11m   192.168.0.166   localhost.localdomain   <none>           <none>
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2                      2/2     Running   0          11m   192.168.0.166   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2                0/1     Error     2          11m   10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5                1/1     Running   0          11m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb                       1/1     Running   0          11m   192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   kube-scheduler-localhost.localdomain   0/1     Pending   0          0s    <none>          localhost.localdomain   <none>           <none>
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                         READY   STATUS             RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2            2/2     Running            0          11m   192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2      0/1     CrashLoopBackOff   2          11m   10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5      1/1     Running            0          11m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   etcd-localhost.localdomain   0/1     Pending            0          1s    <none>          localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb             1/1     Running            0          11m   192.168.0.166   localhost.localdomain   <none>           <none>
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2                      2/2     Running   0          11m   192.168.0.182   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2                0/1     Error     3          11m   10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5                0/1     Error     2          11m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   kube-apiserver-localhost.localdomain   0/1     Pending   0          0s    <none>          localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb                       1/1     Running   0          11m   192.168.0.249   localhost.localdomain   <none>           <none>
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                      READY   STATUS             RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2         2/2     Running            0          11m   192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2   1/1     Running            0          11m   10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5   0/1     CrashLoopBackOff   2          11m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb          1/1     Running            0          11m   192.168.0.166   localhost.localdomain   <none>           <none>
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                      READY   STATUS             RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2         2/2     Running            0          11m   192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2   0/1     Running            3          11m   10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5   0/1     CrashLoopBackOff   2          11m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb          1/1     Running            0          11m   192.168.0.166   localhost.localdomain   <none>           <none>