Kubernetes multi-master backup and restore
I have the following working setup of a multi-master Kubernetes cluster with an external load balancer (6 VMs on Proxmox). The cluster works fine: the metrics server is deployed and tested with HPA, and everything behaves as expected. ksmaster1 was created with:
# kubeadm init --pod-network-cidr=10.244.0.0/16 --upload-certs --control-plane-endpoint 10.200.1.10
The second master, ksmaster2, and the two workers, ksworker1 and ksworker2, were then joined using the join commands from the init output.
The backup was created from ksmaster1 following this guide:
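The guide link did not survive; for reference, a backup matching the restore commands below would look roughly like this. The file names and the /data/backup/${HOSTNAME} layout are taken from the restore steps; everything else (endpoints, certificate paths, the way kubeadm-config.yaml was exported) is assumed:

```shell
# mkdir -p /data/backup/${HOSTNAME}/pki/etcd

# Copy the cluster CA material that the restore re-installs later
# cp /etc/kubernetes/pki/{ca.crt,ca.key,front-proxy-ca.crt,front-proxy-ca.key,sa.key,sa.pub} \
    /data/backup/${HOSTNAME}/pki/
# cp /etc/kubernetes/pki/etcd/{ca.crt,ca.key} /data/backup/${HOSTNAME}/pki/etcd/

# Snapshot etcd through the same container image used for the restore
# docker run --rm \
    -v /data/backup/${HOSTNAME}:/backup \
    -v /etc/kubernetes/pki/etcd:/etc/kubernetes/pki/etcd \
    --network host \
    --env ETCDCTL_API=3 \
    k8s.gcr.io/etcd:3.4.3-0 \
    etcdctl --endpoints=https://127.0.0.1:2379 \
      --cacert=/etc/kubernetes/pki/etcd/ca.crt \
      --cert=/etc/kubernetes/pki/etcd/server.crt \
      --key=/etc/kubernetes/pki/etcd/server.key \
      snapshot save "/backup/etcd-snapshot-latest_${HOSTNAME}.db"

# Export the kubeadm ClusterConfiguration used by the restore's init step
# kubectl -n kube-system get configmap kubeadm-config \
    -o jsonpath='{.data.ClusterConfiguration}' \
    > /data/backup/${HOSTNAME}/kubeadm-config.yaml
```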
After destroying the VMs, I tried to restore ksmaster1 with:
# echo "Copy certificates" && \
scp /data/backup/${HOSTNAME}/pki/ca.crt /etc/kubernetes/pki && \
scp /data/backup/${HOSTNAME}/pki/ca.key /etc/kubernetes/pki && \
scp /data/backup/${HOSTNAME}/pki/front-proxy-ca.crt /etc/kubernetes/pki && \
scp /data/backup/${HOSTNAME}/pki/front-proxy-ca.key /etc/kubernetes/pki && \
scp /data/backup/${HOSTNAME}/pki/sa.key /etc/kubernetes/pki && \
scp /data/backup/${HOSTNAME}/pki/sa.pub /etc/kubernetes/pki && \
scp /data/backup/${HOSTNAME}/pki/etcd/ca.crt /etc/kubernetes/pki/etcd && \
scp /data/backup/${HOSTNAME}/pki/etcd/ca.key /etc/kubernetes/pki/etcd && \
chown -R root: /etc/kubernetes
# docker run --rm \
-v /data/backup/${HOSTNAME}:/backup \
-v /var/lib/etcd:/var/lib/etcd \
--env ETCDCTL_API=3 \
k8s.gcr.io/etcd:3.4.3-0 \
/bin/sh -c "etcdctl snapshot restore '/backup/etcd-snapshot-latest_${HOSTNAME}.db' ; mv /default.etcd/member/ /var/lib/etcd/"
# kubeadm init \
--ignore-preflight-errors=DirAvailable--var-lib-etcd \
--config /data/backup/${HOSTNAME}/kubeadm-config.yaml \
--upload-certs
It went through fine. Surprisingly, even though it is a standalone node, `kubectl get nodes` and `kubectl get all -A` return the full output of the cluster as it was at backup time.
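One thing worth inspecting at this point, before joining further masters, is how the restored etcd member advertises itself. The join log below shows ksmaster1 listed with peer URL http://localhost:2380, which suggests the snapshot was restored with default member settings. A hypothetical check, assuming the default kubeadm static-pod name and certificate paths:

```shell
# List the etcd members and their peer/client URLs from inside the etcd pod
# kubectl -n kube-system exec etcd-ksmaster1 -- etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    member list -w table
```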
Using the join command from the init output I can join the worker nodes, and after installing the CNI the cluster is operational again (not yet fully tested). The problem is when I try to join the second master, ksmaster2, with the join command from the init output. It reaches the point where it tries to update the etcd endpoints and gets stuck; at the same time, etcd on ksmaster1 starts crashing because etcd on ksmaster2 is unreachable, the API server crashes as well, and the cluster becomes completely inaccessible. Here is the output of the join command on ksmaster2:
[root@ksmaster2 ~ ]# kubeadm join 10.200.1.10:6443 --token r1qek2.vh3lhmyqnlmkax1h \
> --discovery-token-ca-cert-hash sha256:78e4340940d5e028bd5a51845f542713df391e33c0d7915170ce03451baae708 \
> --control-plane --certificate-key 4a680f3410c8ef991ad4cccd1c77d5eef2b138aa968c8a8bf0892b23fe34c9e0 --v=7
[...output omitted...]
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
I1010 02:50:06.980659 6012 loader.go:375] Config loaded from file: /etc/kubernetes/kubelet.conf
I1010 02:50:06.992725 6012 cert_rotation.go:137] Starting client certificate rotation controller
I1010 02:50:06.994879 6012 loader.go:375] Config loaded from file: /etc/kubernetes/kubelet.conf
I1010 02:50:06.996987 6012 kubelet.go:194] [kubelet-start] preserving the crisocket information for the node
I1010 02:50:06.997015 6012 patchnode.go:30] [patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "ksmaster2" as an annotation
I1010 02:50:07.497324 6012 round_trippers.go:420] GET https://10.200.1.10:6443/api/v1/nodes/ksmaster2?timeout=10s
I1010 02:50:07.497363 6012 round_trippers.go:427] Request Headers:
I1010 02:50:07.497375 6012 round_trippers.go:431] Accept: application/json, */*
I1010 02:50:07.497389 6012 round_trippers.go:431] User-Agent: kubeadm/v1.18.3 (linux/amd64) kubernetes/2e7996e
I1010 02:50:07.506315 6012 round_trippers.go:446] Response Status: 200 OK in 8 milliseconds
I1010 02:50:07.508605 6012 round_trippers.go:420] PATCH https://10.200.1.10:6443/api/v1/nodes/ksmaster2?timeout=10s
I1010 02:50:07.508622 6012 round_trippers.go:427] Request Headers:
I1010 02:50:07.508631 6012 round_trippers.go:431] Accept: application/json, */*
I1010 02:50:07.508640 6012 round_trippers.go:431] Content-Type: application/strategic-merge-patch+json
I1010 02:50:07.508648 6012 round_trippers.go:431] User-Agent: kubeadm/v1.18.3 (linux/amd64) kubernetes/2e7996e
I1010 02:50:07.518719 6012 round_trippers.go:446] Response Status: 200 OK in 10 milliseconds
I1010 02:50:07.519481 6012 local.go:130] creating etcd client that connects to etcd pods
I1010 02:50:07.519499 6012 etcd.go:178] retrieving etcd endpoints from "kubeadm.kubernetes.io/etcd.advertise-client-urls" annotation in etcd Pods
I1010 02:50:07.519581 6012 round_trippers.go:420] GET https://10.200.1.10:6443/api/v1/namespaces/kube-system/pods?labelSelector=component%3Detcd%2Ctier%3Dcontrol-plane
I1010 02:50:07.519593 6012 round_trippers.go:427] Request Headers:
I1010 02:50:07.519602 6012 round_trippers.go:431] Accept: application/json, */*
I1010 02:50:07.519611 6012 round_trippers.go:431] User-Agent: kubeadm/v1.18.3 (linux/amd64) kubernetes/2e7996e
I1010 02:50:07.524888 6012 round_trippers.go:446] Response Status: 200 OK in 5 milliseconds
I1010 02:50:07.525471 6012 etcd.go:102] etcd endpoints read from pods: https://10.200.1.13:2379,https://10.200.1.14:2379
I1010 02:50:07.535610 6012 etcd.go:250] etcd endpoints read from etcd: https://10.200.1.13:2379
I1010 02:50:07.535647 6012 etcd.go:120] update etcd endpoints: https://10.200.1.13:2379
I1010 02:50:07.535664 6012 local.go:139] Adding etcd member: https://10.200.1.14:2380
[etcd] Announced new etcd member joining to the existing etcd cluster
I1010 02:50:07.563900 6012 local.go:145] Updated etcd member list: [{ksmaster1 http://localhost:2380} {ksmaster2 https://10.200.1.14:2380}]
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
I1010 02:50:07.564818 6012 etcd.go:509] [etcd] attempting to see if all cluster endpoints ([https://10.200.1.13:2379 https://10.200.1.14:2379]) are available 1/8
[kubelet-check] Initial timeout of 40s passed.
I1010 02:50:47.585360 6012 etcd.go:489] Failed to get etcd status for https://10.200.1.14:2379: failed to dial endpoint https://10.200.1.14:2379 with maintenance client: context deadline exceeded
I1010 02:51:27.639903 6012 etcd.go:489] Failed to get etcd status for https://10.200.1.14:2379: failed to dial endpoint https://10.200.1.14:2379 with maintenance client: context deadline exceeded
I1010 02:52:07.744758 6012 etcd.go:489] Failed to get etcd status for https://10.200.1.14:2379: failed to dial endpoint https://10.200.1.14:2379 with maintenance client: context deadline exceeded
I1010 02:52:47.953938 6012 etcd.go:489] Failed to get etcd status for https://10.200.1.14:2379: failed to dial endpoint https://10.200.1.14:2379 with maintenance client: context deadline exceeded
I1010 02:53:28.381918 6012 etcd.go:489] Failed to get etcd status for https://10.200.1.14:2379: failed to dial endpoint https://10.200.1.14:2379 with maintenance client: context deadline exceeded
I1010 02:54:09.187647 6012 etcd.go:489] Failed to get etcd status for https://10.200.1.14:2379: failed to dial endpoint https://10.200.1.14:2379 with maintenance client: context deadline exceeded
I1010 02:54:50.813114 6012 etcd.go:489] Failed to get etcd status for https://10.200.1.14:2379: failed to dial endpoint https://10.200.1.14:2379 with maintenance client: context deadline exceeded
I1010 02:55:34.044593 6012 etcd.go:489] Failed to get etcd status for https://10.200.1.14:2379: failed to dial endpoint https://10.200.1.14:2379 with maintenance client: context deadline exceeded
I1010 02:56:20.637561 6012 etcd.go:489] Failed to get etcd status for https://10.200.1.14:2379: failed to dial endpoint https://10.200.1.14:2379 with maintenance client: context deadline exceeded
I1010 02:57:14.097474 6012 etcd.go:489] Failed to get etcd status for https://10.200.1.14:2379: failed to dial endpoint https://10.200.1.14:2379 with maintenance client: context deadline exceeded
I1010 02:58:21.781379 6012 etcd.go:489] Failed to get etcd status for https://10.200.1.14:2379: failed to dial endpoint https://10.200.1.14:2379 with maintenance client: context deadline exceeded
I1010 02:58:21.781429 6012 etcd.go:516] [etcd] Attempt failed with error: failed to dial endpoint https://10.200.1.14:2379 with maintenance client: context deadline exceeded
I1010 02:58:21.781443 6012 etcd.go:506] [etcd] Waiting 5s until next retry
I1010 02:58:26.781670 6012 etcd.go:509] [etcd] attempting to see if all cluster endpoints ([https://10.200.1.13:2379 https://10.200.1.14:2379]) are available 2/8
I1010 02:59:06.782141 6012 etcd.go:514] [etcd] Attempt timed out
I1010 02:59:06.782170 6012 etcd.go:506] [etcd] Waiting 5s until next retry
I1010 02:59:11.782367 6012 etcd.go:509] [etcd] attempting to see if all cluster endpoints ([https://10.200.1.13:2379 https://10.200.1.14:2379]) are available 3/8
I1010 02:59:51.782793 6012 etcd.go:514] [etcd] Attempt timed out
I1010 02:59:51.782820 6012 etcd.go:506] [etcd] Waiting 5s until next retry
I1010 02:59:56.786003 6012 etcd.go:509] [etcd] attempting to see if all cluster endpoints ([https://10.200.1.13:2379 https://10.200.1.14:2379]) are available 4/8
I1010 03:00:36.786567 6012 etcd.go:514] [etcd] Attempt timed out
I1010 03:00:36.786622 6012 etcd.go:506] [etcd] Waiting 5s until next retry
I1010 03:00:41.786864 6012 etcd.go:509] [etcd] attempting to see if all cluster endpoints ([https://10.200.1.13:2379 https://10.200.1.14:2379]) are available 5/8
I1010 03:01:21.787390 6012 etcd.go:514] [etcd] Attempt timed out
I1010 03:01:21.787422 6012 etcd.go:506] [etcd] Waiting 5s until next retry
I1010 03:01:26.787558 6012 etcd.go:509] [etcd] attempting to see if all cluster endpoints ([https://10.200.1.13:2379 https://10.200.1.14:2379]) are available 6/8
I1010 03:02:06.787983 6012 etcd.go:514] [etcd] Attempt timed out
I1010 03:02:06.788010 6012 etcd.go:506] [etcd] Waiting 5s until next retry
I1010 03:02:11.789873 6012 etcd.go:509] [etcd] attempting to see if all cluster endpoints ([https://10.200.1.13:2379 https://10.200.1.14:2379]) are available 7/8
I1010 03:02:51.790301 6012 etcd.go:514] [etcd] Attempt timed out
I1010 03:02:51.790327 6012 etcd.go:506] [etcd] Waiting 5s until next retry
I1010 03:02:56.790533 6012 etcd.go:509] [etcd] attempting to see if all cluster endpoints ([https://10.200.1.13:2379 https://10.200.1.14:2379]) are available 8/8
I1010 03:03:36.790989 6012 etcd.go:514] [etcd] Attempt timed out
timeout waiting for etcd cluster to be available
k8s.io/kubernetes/cmd/kubeadm/app/util/etcd.(*Client).WaitForClusterAvailable
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/util/etcd/etcd.go:522
k8s.io/kubernetes/cmd/kubeadm/app/phases/etcd.CreateStackedEtcdStaticPodManifestFile
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/phases/etcd/local.go:167
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/join.runEtcdPhase
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/join/controlplanejoin.go:143
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:234
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:422
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:207
k8s.io/kubernetes/cmd/kubeadm/app/cmd.NewCmdJoin.func1
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/join.go:170
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:826
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:914
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:864
k8s.io/kubernetes/cmd/kubeadm/app.Run
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:50
main.main
_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:203
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1357
error creating local etcd static pod manifest file
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/join.runEtcdPhase
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/join/controlplanejoin.go:144
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:234
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:422
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:207
k8s.io/kubernetes/cmd/kubeadm/app/cmd.NewCmdJoin.func1
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/join.go:170
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:826
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:914
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:864
k8s.io/kubernetes/cmd/kubeadm/app.Run
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:50
main.main
_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:203
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1357
error execution phase control-plane-join/etcd
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:235
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:422
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:207
k8s.io/kubernetes/cmd/kubeadm/app/cmd.NewCmdJoin.func1
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/join.go:170
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:826
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:914
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:864
k8s.io/kubernetes/cmd/kubeadm/app.Run
/workspace/anago-v1.18.3-beta.0.58+d6e40f410ca91c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:50
main.main
_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:203
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1357
[root@ksmaster2 ~ ]#
After restoring ksmaster1 I tried removing all the other nodes and re-adding them (i.e. adding them as if creating the cluster from scratch), and tried changing some parameters in the exported kubeadm-config.yaml, but it always gets stuck at the same place. The most successful attempt was joining ksmaster2 as a worker node and then converting it to a master, but even then something seems wrong with etcd: the first master sees all the masters and workers with all services available, while the second master sees only itself and the two workers, plus some limited system services (as if a second cluster had been initialized).
So at this point I am out of ideas. Can anyone help with hints or ideas about what is wrong with the backup and restore of this cluster?