Consul on Kubernetes (Ubuntu): Consul pods are running but not ready
I'm creating a 3-node cluster in an Ubuntu VM running on my Mac. The nodes look like this:
NAME STATUS ROLES AGE VERSION
kind-control-plane Ready master 20h v1.17.0
kind-worker Ready <none> 20h v1.17.0
kind-worker2 Ready <none> 20h v1.17.0
Here is the full description of the pods:
Name: busybox-6cd57fd969-9tzmf
Namespace: default
Priority: 0
Node: kind-worker2/172.17.0.4
Start Time: Tue, 14 Jan 2020 17:45:03 +0800
Labels: pod-template-hash=6cd57fd969
run=busybox
Annotations: <none>
Status: Running
IP: 10.244.2.11
IPs:
IP: 10.244.2.11
Controlled By: ReplicaSet/busybox-6cd57fd969
Containers:
busybox:
Container ID: containerd://710eba6a12607021098e3c376637476cd85faf86ac9abcf10f191126dc37026b
Image: busybox
Image ID: docker.io/library/busybox@sha256:6915be4043561d64e0ab0f8f098dc2ac48e077fe23f488ac24b665166898115a
Port: <none>
Host Port: <none>
Args:
sh
State: Running
Started: Tue, 14 Jan 2020 21:00:50 +0800
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-zszqr (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
default-token-zszqr:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-zszqr
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
Name: hashicorp-consul-hgxdr
Namespace: default
Priority: 0
Node: kind-worker2/172.17.0.4
Start Time: Tue, 14 Jan 2020 17:13:57 +0800
Labels: app=consul
chart=consul-helm
component=client
controller-revision-hash=6bc54657b6
hasDNS=true
pod-template-generation=1
release=hashicorp
Annotations: consul.hashicorp.com/connect-inject: false
Status: Running
IP: 10.244.2.10
IPs:
IP: 10.244.2.10
Controlled By: DaemonSet/hashicorp-consul
Containers:
consul:
Container ID: containerd://2209cfeaa740e3565213de6d0653dabbe9a8cbf1ffe085352a8e9d3a2d0452ec
Image: consul:1.6.2
Image ID: docker.io/library/consul@sha256:a167e7222c84687c3e7f392f13b23d9f391cac80b6b839052e58617dab714805
Ports: 8500/TCP, 8502/TCP, 8301/TCP, 8301/UDP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP
Host Ports: 8500/TCP, 8502/TCP, 0/TCP, 0/UDP, 0/TCP, 0/TCP, 0/TCP, 0/UDP
Command:
/bin/sh
-ec
CONSUL_FULLNAME="hashicorp-consul"
exec /bin/consul agent \
-node="${NODE}" \
-advertise="${ADVERTISE_IP}" \
-bind=0.0.0.0 \
-client=0.0.0.0 \
-node-meta=pod-name:${HOSTNAME} \
-hcl="ports { grpc = 8502 }" \
-config-dir=/consul/config \
-datacenter=dc1 \
-data-dir=/consul/data \
-retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-domain=consul
State: Running
Started: Tue, 14 Jan 2020 20:58:29 +0800
Ready: False
Restart Count: 0
Readiness: exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | \
grep -E '".+"'
] delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
ADVERTISE_IP: (v1:status.podIP)
NAMESPACE: default (v1:metadata.namespace)
NODE: (v1:spec.nodeName)
Mounts:
/consul/config from config (rw)
/consul/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from hashicorp-consul-client-token-4r5tv (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
data:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: hashicorp-consul-client-config
Optional: false
hashicorp-consul-client-token-4r5tv:
Type: Secret (a volume populated by a Secret)
SecretName: hashicorp-consul-client-token-4r5tv
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/pid-pressure:NoSchedule
node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unschedulable:NoSchedule
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 96s (x3206 over 14h) kubelet, kind-worker2 Readiness probe failed:
Name: hashicorp-consul-server-0
Namespace: default
Priority: 0
Node: kind-worker2/172.17.0.4
Start Time: Tue, 14 Jan 2020 17:13:57 +0800
Labels: app=consul
chart=consul-helm
component=server
controller-revision-hash=hashicorp-consul-server-98f4fc994
hasDNS=true
release=hashicorp
statefulset.kubernetes.io/pod-name=hashicorp-consul-server-0
Annotations: consul.hashicorp.com/connect-inject: false
Status: Running
IP: 10.244.2.9
IPs:
IP: 10.244.2.9
Controlled By: StatefulSet/hashicorp-consul-server
Containers:
consul:
Container ID: containerd://72b7bf0e81d3ed477f76b357743e9429325da0f38ccf741f53c9587082cdfcd0
Image: consul:1.6.2
Image ID: docker.io/library/consul@sha256:a167e7222c84687c3e7f392f13b23d9f391cac80b6b839052e58617dab714805
Ports: 8500/TCP, 8301/TCP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/UDP
Command:
/bin/sh
-ec
CONSUL_FULLNAME="hashicorp-consul"
exec /bin/consul agent \
-advertise="${POD_IP}" \
-bind=0.0.0.0 \
-bootstrap-expect=3 \
-client=0.0.0.0 \
-config-dir=/consul/config \
-datacenter=dc1 \
-data-dir=/consul/data \
-domain=consul \
-hcl="connect { enabled = true }" \
-ui \
-retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-server
State: Running
Started: Tue, 14 Jan 2020 20:58:27 +0800
Ready: False
Restart Count: 0
Readiness: exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | \
grep -E '".+"'
] delay=5s timeout=5s period=3s #success=1 #failure=2
Environment:
POD_IP: (v1:status.podIP)
NAMESPACE: default (v1:metadata.namespace)
Mounts:
/consul/config from config (rw)
/consul/data from data-default (rw)
/var/run/secrets/kubernetes.io/serviceaccount from hashicorp-consul-server-token-hhdxc (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
data-default:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-default-hashicorp-consul-server-0
ReadOnly: false
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: hashicorp-consul-server-config
Optional: false
hashicorp-consul-server-token-hhdxc:
Type: Secret (a volume populated by a Secret)
SecretName: hashicorp-consul-server-token-hhdxc
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 97s (x10686 over 14h) kubelet, kind-worker2 Readiness probe failed:
Name: hashicorp-consul-server-1
Namespace: default
Priority: 0
Node: kind-worker/172.17.0.3
Start Time: Tue, 14 Jan 2020 17:13:57 +0800
Labels: app=consul
chart=consul-helm
component=server
controller-revision-hash=hashicorp-consul-server-98f4fc994
hasDNS=true
release=hashicorp
statefulset.kubernetes.io/pod-name=hashicorp-consul-server-1
Annotations: consul.hashicorp.com/connect-inject: false
Status: Running
IP: 10.244.1.8
IPs:
IP: 10.244.1.8
Controlled By: StatefulSet/hashicorp-consul-server
Containers:
consul:
Container ID: containerd://c1f5a88e30e545c75e58a730be5003cee93c823c21ebb29b22b79cd151164a15
Image: consul:1.6.2
Image ID: docker.io/library/consul@sha256:a167e7222c84687c3e7f392f13b23d9f391cac80b6b839052e58617dab714805
Ports: 8500/TCP, 8301/TCP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/UDP
Command:
/bin/sh
-ec
CONSUL_FULLNAME="hashicorp-consul"
exec /bin/consul agent \
-advertise="${POD_IP}" \
-bind=0.0.0.0 \
-bootstrap-expect=3 \
-client=0.0.0.0 \
-config-dir=/consul/config \
-datacenter=dc1 \
-data-dir=/consul/data \
-domain=consul \
-hcl="connect { enabled = true }" \
-ui \
-retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-server
State: Running
Started: Tue, 14 Jan 2020 20:58:36 +0800
Ready: False
Restart Count: 0
Readiness: exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | \
grep -E '".+"'
] delay=5s timeout=5s period=3s #success=1 #failure=2
Environment:
POD_IP: (v1:status.podIP)
NAMESPACE: default (v1:metadata.namespace)
Mounts:
/consul/config from config (rw)
/consul/data from data-default (rw)
/var/run/secrets/kubernetes.io/serviceaccount from hashicorp-consul-server-token-hhdxc (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
data-default:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-default-hashicorp-consul-server-1
ReadOnly: false
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: hashicorp-consul-server-config
Optional: false
hashicorp-consul-server-token-hhdxc:
Type: Secret (a volume populated by a Secret)
SecretName: hashicorp-consul-server-token-hhdxc
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 95s (x10683 over 14h) kubelet, kind-worker Readiness probe failed:
Name: hashicorp-consul-server-2
Namespace: default
Priority: 0
Node: <none>
Labels: app=consul
chart=consul-helm
component=server
controller-revision-hash=hashicorp-consul-server-98f4fc994
hasDNS=true
release=hashicorp
statefulset.kubernetes.io/pod-name=hashicorp-consul-server-2
Annotations: consul.hashicorp.com/connect-inject: false
Status: Pending
IP:
IPs: <none>
Controlled By: StatefulSet/hashicorp-consul-server
Containers:
consul:
Image: consul:1.6.2
Ports: 8500/TCP, 8301/TCP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/UDP
Command:
/bin/sh
-ec
CONSUL_FULLNAME="hashicorp-consul"
exec /bin/consul agent \
-advertise="${POD_IP}" \
-bind=0.0.0.0 \
-bootstrap-expect=3 \
-client=0.0.0.0 \
-config-dir=/consul/config \
-datacenter=dc1 \
-data-dir=/consul/data \
-domain=consul \
-hcl="connect { enabled = true }" \
-ui \
-retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-server
Readiness: exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | \
grep -E '".+"'
] delay=5s timeout=5s period=3s #success=1 #failure=2
Environment:
POD_IP: (v1:status.podIP)
NAMESPACE: default (v1:metadata.namespace)
Mounts:
/consul/config from config (rw)
/consul/data from data-default (rw)
/var/run/secrets/kubernetes.io/serviceaccount from hashicorp-consul-server-token-hhdxc (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
data-default:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-default-hashicorp-consul-server-2
ReadOnly: false
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: hashicorp-consul-server-config
Optional: false
hashicorp-consul-server-token-hhdxc:
Type: Secret (a volume populated by a Secret)
SecretName: hashicorp-consul-server-token-hhdxc
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 63s (x434 over 18h) default-scheduler 0/3 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 2 node(s) didn't match pod affinity/anti-affinity.
Name: hashicorp-consul-vmsmt
Namespace: default
Priority: 0
Node: kind-worker/172.17.0.3
Start Time: Tue, 14 Jan 2020 17:13:57 +0800
Labels: app=consul
chart=consul-helm
component=client
controller-revision-hash=6bc54657b6
hasDNS=true
pod-template-generation=1
release=hashicorp
Annotations: consul.hashicorp.com/connect-inject: false
Status: Running
IP: 10.244.1.9
IPs:
IP: 10.244.1.9
Controlled By: DaemonSet/hashicorp-consul
Containers:
consul:
Container ID: containerd://d502870f3476ea074b059361bc52a2c68ced551f5743b8448926bdaa319aabb0
Image: consul:1.6.2
Image ID: docker.io/library/consul@sha256:a167e7222c84687c3e7f392f13b23d9f391cac80b6b839052e58617dab714805
Ports: 8500/TCP, 8502/TCP, 8301/TCP, 8301/UDP, 8302/TCP, 8300/TCP, 8600/TCP, 8600/UDP
Host Ports: 8500/TCP, 8502/TCP, 0/TCP, 0/UDP, 0/TCP, 0/TCP, 0/TCP, 0/UDP
Command:
/bin/sh
-ec
CONSUL_FULLNAME="hashicorp-consul"
exec /bin/consul agent \
-node="${NODE}" \
-advertise="${ADVERTISE_IP}" \
-bind=0.0.0.0 \
-client=0.0.0.0 \
-node-meta=pod-name:${HOSTNAME} \
-hcl="ports { grpc = 8502 }" \
-config-dir=/consul/config \
-datacenter=dc1 \
-data-dir=/consul/data \
-retry-join=${CONSUL_FULLNAME}-server-0.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-1.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-retry-join=${CONSUL_FULLNAME}-server-2.${CONSUL_FULLNAME}-server.${NAMESPACE}.svc \
-domain=consul
State: Running
Started: Tue, 14 Jan 2020 20:58:35 +0800
Ready: False
Restart Count: 0
Readiness: exec [/bin/sh -ec curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | \
grep -E '".+"'
] delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
ADVERTISE_IP: (v1:status.podIP)
NAMESPACE: default (v1:metadata.namespace)
NODE: (v1:spec.nodeName)
Mounts:
/consul/config from config (rw)
/consul/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from hashicorp-consul-client-token-4r5tv (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
data:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: hashicorp-consul-client-config
Optional: false
hashicorp-consul-client-token-4r5tv:
Type: Secret (a volume populated by a Secret)
SecretName: hashicorp-consul-client-token-4r5tv
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/pid-pressure:NoSchedule
node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unschedulable:NoSchedule
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 88s (x3207 over 14h) kubelet, kind-worker Readiness probe failed:
Thanks a lot for the help. I replicated your setup, created a 3-node cluster (1 master and 2 workers), deployed Consul with Helm, and I see the same thing you do: all of the pods are running except for one that stays Pending.
In the StatefulSet object you can see a podAntiAffinity rule that does not allow two or more server pods to be scheduled on the same node. That is why one pod is stuck in Pending. There are a couple of ways to make it work:
1. The master node carries the taint node-role.kubernetes.io/master:NoSchedule, which prevents any pod from being scheduled on it. You can remove the taint by running:
kubectl taint nodes kind-control-plane node-role.kubernetes.io/master:NoSchedule-
(note the trailing minus sign, which tells Kubernetes to remove the taint). The scheduler will then be able to place the remaining Consul server pod on this node.
2. Change requiredDuringSchedulingIgnoredDuringExecution to preferredDuringSchedulingIgnoredDuringExecution in the server anti-affinity, so the rule no longer has to be satisfied and is only preferred; see the sketch below.
Let me know if this helps.
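A hedged sketch of what option 2 could look like as a Helm values override (the file name is made up, and the server key layout is assumed from the values.yaml excerpts quoted further down; the chart's default affinity value is a templated string, so an override in the same form should render the same way):

relaxed-affinity-values.yaml:

server:
  # Soften the default anti-affinity: the scheduler still prefers to spread
  # server pods across nodes, but may co-locate them when it has no choice.
  affinity: |
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: {{ template "consul.name" . }}
              release: "{{ .Release.Name }}"
              component: server
          topologyKey: kubernetes.io/hostname

Apply it with something like helm upgrade hashicorp ./consul-helm -f relaxed-affinity-values.yaml (the exact command depends on your Helm version and release name).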
For Consul cluster fault tolerance, the recommended quorum size is 3 or 5 servers. The default quorum on the Helm chart is 3:
`replicas (integer: 3) - The number of server agents to run.`
`affinity (string) - This value defines the affinity for server pods. It defaults to allowing only a single pod on each node, which minimizes risk of the cluster becoming unusable if a node is lost`
If you need to run more pods per node, set this value to null.
So to install Consul with a quorum of 3 in a production-grade deployment, you need at least 3 schedulable worker nodes that satisfy the affinity requirement (a 3-server quorum tolerates the loss of one server; if you raise the value to 5, which tolerates two, you effectively need 5 worker nodes).
The chart's values.yaml clearly documents which values to use when running on systems with fewer nodes: reduce the replica count.
~/test/consul-helm$ cat values.yaml | grep -i replica
replicas: 3
bootstrapExpect: 3 # Should <= replicas count
# replicas. If you'd like a custom value, you can specify an override here.
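That is essentially the workaround mentioned in the comments below. A minimal sketch of such an override, assuming the same server block layout as the excerpt above (the file name is hypothetical):

dev-values.yaml:

server:
  # One server is enough for a dev cluster; no fault tolerance, but the
  # single pod trivially satisfies the one-server-per-node anti-affinity.
  replicas: 1
  bootstrapExpect: 1   # per the comment above, should be <= replicas

Installed with something like helm install hashicorp ./consul-helm -f dev-values.yaml (flag syntax differs between Helm 2 and Helm 3).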
What CNI did you install? What is the output of kubectl get pods -n kube-system? Is the problem limited to the Consul pods? Can you try running some other pod, such as nginx, and see whether it works?
Thanks for the reply. I have actually worked around the problem for now by reducing the number of Consul servers to 1 in the Helm values.yaml file. I will test your solution as soon as I can.
For reference, here are the affinity settings in the chart's values.yaml that produce the Pending pod:
~/test/consul-helm$ cat values.yaml | grep -i -A 8 affinity
# Affinity Settings
# Commenting out or setting as empty the affinity variable, will allow
# deployment to single node services such as Minikube
affinity: |
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
app: {{ template "consul.name" . }}
release: "{{ .Release.Name }}"
component: server
topologyKey: kubernetes.io/hostname
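As the comment in that excerpt says, clearing the affinity value lifts the one-server-pod-per-node restriction entirely, which is what allows single-node setups such as Minikube to run all the servers. A sketch of that variant (hypothetical file name):

single-node-values.yaml:

server:
  # Setting the value to null (or empty) drops the podAntiAffinity block
  # shown above, so all server pods may land on the same node.
  affinity: null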