Kubernetes: frequently getting "Error adding network: no IP addresses available in network: cbr0"


I set up a single-node Kubernetes cluster using kubeadm on Ubuntu 16.04 LTS, with flannel as the pod network.

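Most of the time everything works fine, but every few days the cluster gets into a state where it can't schedule new pods: the pods are stuck in the "Pending" state, and when I describe those pods I get error messages like this: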

Events:
  FirstSeen LastSeen    Count   From                SubObjectPath   Type        Reason      Message
  --------- --------    -----   ----                -------------   --------    ------      -------
  2m        2m      1   {default-scheduler }                Normal      Scheduled   Successfully assigned dex-1939802596-zt1r3 to superserver-03
  1m        2s      21  {kubelet superserver-03}            Warning     FailedSync  Error syncing pod, skipping: failed to "SetupNetwork" for "somepod-1939802596-zt1r3_somenamespace" with SetupNetworkError: "Failed to setup network for pod \"somepod-1939802596-zt1r3_somenamespace(167f8345-faeb-11e6-94f3-0cc47a9a5cf2)\" using network plugins \"cni\": no IP addresses available in network: cbr0; Skipping pod"
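I found a related report and the workaround suggested there. It does help with recovery (although it takes a few minutes), but the problem comes back after a while.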

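I also came across this issue and recovered using the workaround suggested there, but the problem came back. Besides, it is not exactly my case, and the issue was considered resolved once a workaround was found... :\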

Technical details:

kubeadm version: version.Info{Major:"1", Minor:"6+", GitVersion:"v1.6.0-alpha.0.2074+a092d8e0f95f52", GitCommit:"a092d8e0f95f5200f7ae2cba45c75ab42da36537", GitTreeState:"clean", BuildDate:"2016-12-13T17:03:18Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}

Kubernetes Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.3", GitCommit:"029c3a408176b55c30846f0faedf56aae5992e9b", GitTreeState:"clean", BuildDate:"2017-02-15T06:34:56Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
The cluster was started with the following commands:

kubeadm init --pod-network-cidr 10.244.0.0/16 --api-advertise-addresses 192.168.1.200

kubectl taint nodes --all dedicated-

kubectl -n kube-system apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Some syslog entries that may be relevant (I get a lot of these; see the log excerpt at the end of this post):

Thanks a lot.

Edit:

I was able to reproduce it. It appears to be an exhaustion of the IP addresses in the kubelet's CIDR. Findings:

  • First, the node's podCIDR (obtained via
    kubectl get node -o yaml
    ) is:
    podCIDR: 10.244.0.0/24
    (By the way, why isn't it /16, like the cluster CIDR I set in the kubeadm command?)

  • Second:

    $ sudo ls -la /var/lib/cni/networks/cbr0 | wc -l

    256
    (That is, 256 IPs are allocated, right?)

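  • But this happens even though I currently have no more than 256 Kubernetes pods and services running:

    $ kubectl get all --all-namespaces | wc -l

    180

    (Yes, this includes not only pods and services, but also jobs, deployments and replica sets.)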
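The comparison in the findings above can be made a little more precise. A /24 contains 256 addresses in total, and `ls -la | wc -l` also counts the "total" line, the "." and ".." entries, and any bookkeeping file such as last_reserved_ip, so 256 lines is somewhat more than the number of reserved IPs. The following is a minimal sketch, assuming the IPAM plugin keeps one file per reserved address, named after that address, in the network directory. The helper names `cidr_size` and `count_reserved_ips` are hypothetical and not part of kubectl or CNI.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# cidr_size CIDR
# Prints the total number of IPv4 addresses in CIDR, e.g. 10.244.0.0/24.
# Not all of them can be handed to pods (network, broadcast, gateway).
cidr_size() (
    prefix=${1#*/}
    echo $(( 1 << (32 - prefix) ))
)

# count_reserved_ips DIR
# Counts regular files in DIR whose names look like IPv4 addresses.
# Assumption: one such file exists per reserved address. Files such as
# last_reserved_ip are skipped because their names contain letters.
count_reserved_ips() (
    n=0
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        case ${f##*/} in
            *[!0-9.]*) continue ;;        # something other than digits and dots
            *.*.*.*)   n=$((n + 1)) ;;    # dotted-quad-looking name
        esac
    done
    echo "$n"
)

# Example usage on the node (needs root to read the directory):
#   cidr_size 10.244.0.0/24
#   count_reserved_ips /var/lib/cni/networks/cbr0
```

If the second number is close to the first while far fewer pods are actually running, the reservations are probably being leaked rather than legitimately used.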

So how did the IP addresses run out, and how can this be fixed? These workarounds can't be the only option.

Thanks again.

Edit (2)


Another related issue:

For now, this is the best workaround I've found:

I've set up a cron job to run this script at @reboot.
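The script itself is not reproduced in this post. Purely as an illustration of the kind of cleanup such a workaround performs, and not the author's script, here is a minimal dry-run sketch. It assumes that each file in the network directory is named after a reserved IP and that its first line holds the ID of the container owning that IP. The function name `find_stale_reservations` and the file of live container IDs are hypothetical. It only prints candidates and deletes nothing.

```shell
#!/bin/sh
# Sketch only, dry run: prints reservation files that look stale.

# find_stale_reservations DIR LIVE_IDS_FILE
# Prints the path of every IPv4-named file in DIR whose first line
# (assumed to be a container ID) is not listed in LIVE_IDS_FILE,
# which must contain one full container ID per line.
find_stale_reservations() (
    dir=$1
    live_ids=$2
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        case ${f##*/} in
            *[!0-9.]*) continue ;;   # skip bookkeeping files such as last_reserved_ip
            *.*.*.*)   ;;            # dotted-quad-looking name: inspect it
            *)         continue ;;
        esac
        id=$(head -n 1 "$f" | tr -d '[:space:]')
        [ -n "$id" ] || continue
        # Exact, whole-line match against the list of known container IDs.
        grep -qxF -- "$id" "$live_ids" || printf '%s\n' "$f"
    done
)

# Example usage on the node (assumes Docker is the container runtime):
#   docker ps -aq --no-trunc > /tmp/live_ids
#   find_stale_reservations /var/lib/cni/networks/cbr0 /tmp/live_ids
```

Review the printed paths before removing anything. If the list of live IDs is empty, for example because the Docker command failed, every reservation is reported as stale.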

It seems this problem was supposed to be solved by garbage-collecting pods when the Docker daemon restarts, but that feature is probably not enabled on my cluster.

A new, better fix was merged just a few days ago, so I hope the problem will be resolved in the upcoming Kubernetes 1.6.0 release.

Feb 23 11:07:49 server-03 kernel: [  155.480669] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Feb 23 11:07:49 server-03 dockerd[1414]: time="2017-02-23T11:07:49.735590817+02:00" level=warning msg="Couldn't run auplink before unmount /var/lib/docker/aufs/mnt/89bb7abdb946d858e175d80d6e1d2fdce0262af8c7afa9c6ad9d776f1f5028c4-init: exec: \"auplink\": executable file not found in $PATH"
Feb 23 11:07:49 server-03 kernel: [  155.496599] aufs au_opts_verify:1597:dockerd[24704]: dirperm1 breaks the protection by the permission bits on the lower branch
Feb 23 11:07:49 server-03 systemd-udevd[29313]: Could not generate persistent MAC address for vethd4d85eac: No such file or directory
Feb 23 11:07:49 server-03 kubelet[1228]: E0223 11:07:49.756976    1228 cni.go:255] Error adding network: no IP addresses available in network: cbr0
Feb 23 11:07:49 server-03 kernel: [  155.514994] IPv6: eth0: IPv6 duplicate address fe80::835:deff:fe4f:c74d detected!
Feb 23 11:07:49 server-03 kernel: [  155.515380] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Feb 23 11:07:49 server-03 kernel: [  155.515588] device vethd4d85eac entered promiscuous mode
Feb 23 11:07:49 server-03 kernel: [  155.515643] cni0: port 34(vethd4d85eac) entered forwarding state
Feb 23 11:07:49 server-03 kernel: [  155.515663] cni0: port 34(vethd4d85eac) entered forwarding state
Feb 23 11:07:49 server-03 kubelet[1228]: E0223 11:07:49.757001    1228 cni.go:209] Error while adding to cni network: no IP addresses available in network: cbr0
Feb 23 11:07:49 server-03 kubelet[1228]: E0223 11:07:49.757056    1228 docker_manager.go:2201] Failed to setup network for pod "somepod-752955044-58g59_somenamespace(5d6c28e1-f8dd-11e6-9843-0cc47a9a5cf2)" using network plugins "cni": no IP addresses available in network: cbr0; Skipping pod