Cluster Architecture
etcd Backup & Restore
Backing up and restoring etcd from snapshots, locating its configuration, and checking its health.
Overview
- etcd is the key-value store that holds all cluster data (pods, services, secrets, configmaps, etc.)
- Runs as a static pod on control plane nodes — managed by kubelet, not the API server
- Manifest location: /etc/kubernetes/manifests/etcd.yaml
- Data directory: /var/lib/etcd (default)
- Losing etcd = losing the entire cluster state
etcd Configuration
- All connection details can be found in the etcd static pod manifest
Find etcd endpoints, certs, and keys
# check the etcd pod manifest
cat /etc/kubernetes/manifests/etcd.yaml
Key flags to look for:
- --listen-client-urls: etcd endpoint (e.g., https://127.0.0.1:2379)
- --cert-file: etcd server certificate
- --key-file: etcd server key
- --trusted-ca-file: CA certificate
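Rather than reading the whole manifest by eye, the flags can be pulled out with a small helper. This is a minimal sketch: the function name `etcd_flag` is hypothetical, and it assumes each flag sits on its own line in the `- --flag=value` form that kubeadm-generated manifests typically use.

```shell
# Print the value of one etcd flag from a static pod manifest.
# Usage: etcd_flag <flag-name-without-dashes> [manifest-path]
# Assumes one "- --flag=value" entry per line (hypothetical helper, not part of etcd).
etcd_flag() {
  local flag="$1"
  local manifest="${2:-/etc/kubernetes/manifests/etcd.yaml}"
  # Strip everything up to and including "--<flag>=" and print the rest of the line.
  # Anchoring on the list dash keeps --cert-file from matching --peer-cert-file.
  sed -n "s/^[[:space:]]*-[[:space:]]*--${flag}=//p" "$manifest" | head -n 1
}
```

For example, `etcd_flag trusted-ca-file` prints the CA path to pass to `--cacert`, and `etcd_flag listen-client-urls` prints the endpoint list.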
Check etcd health
ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
endpoint health
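The connection flags in the health check can also be supplied through environment variables, which etcdctl reads using the `ETCDCTL_` prefix plus the upper-cased flag name. The sketch below assumes the kubeadm default paths used in this page; substitute the values from your own manifest. Some etcdctl versions complain when the same setting is given both as a variable and as a flag, so pick one style per shell session.

```shell
# Assumed kubeadm default paths; replace with the values from your etcd manifest.
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key

# With the variables exported, the health check shortens to:
#   etcdctl endpoint health
```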
Backup
Take a snapshot
ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
snapshot save /opt/etcd-backup.db
Verify the snapshot
ETCDCTL_API=3 etcdctl snapshot status /opt/etcd-backup.db --write-out=table
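For repeated backups it can help to avoid overwriting the previous snapshot. The helpers below are a sketch only: the names `snapshot_path` and `snapshot_nonempty` and the timestamped naming scheme are illustrative assumptions, not etcd conventions.

```shell
# Build a timestamped snapshot path, e.g. /opt/etcd-backup-20240131-235959.db
# (the naming scheme is an illustrative assumption).
snapshot_path() {
  local dir="${1:-/opt}"
  printf '%s/etcd-backup-%s.db\n' "$dir" "$(date +%Y%m%d-%H%M%S)"
}

# Cheap sanity check: the snapshot file exists and is not empty.
# This does not replace "snapshot status", which inspects the contents.
snapshot_nonempty() {
  [ -s "$1" ]
}

# Usage, with the same connection flags as the snapshot save command:
#   out="$(snapshot_path /opt)"
#   ETCDCTL_API=3 etcdctl <connection flags> snapshot save "$out" && snapshot_nonempty "$out"
```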
Restore
1. Restore the snapshot to a new data directory
ETCDCTL_API=3 etcdctl \
--data-dir=/var/lib/etcd-restored \
snapshot restore /opt/etcd-backup.db
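Because the restore should go into a fresh directory, a quick pre-flight check on the target path can catch mistakes before any data is written. This is a minimal sketch; `restore_target_is_free` is a hypothetical helper name.

```shell
# Succeed only if the restore target does not exist yet, or is an empty directory.
# Hypothetical helper; run it before "snapshot restore".
restore_target_is_free() {
  local dir="$1"
  # A path that does not exist is safe to restore into.
  [ ! -e "$dir" ] && return 0
  # An existing path is acceptable only if it is a directory with no entries.
  [ -d "$dir" ] && [ -z "$(ls -A "$dir")" ]
}

# Usage:
#   restore_target_is_free /var/lib/etcd-restored || echo "choose another --data-dir"
```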
2. Update the etcd static pod manifest
Edit /etc/kubernetes/manifests/etcd.yaml to point to the new data directory:
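- Change --data-dir=/var/lib/etcd to --data-dir=/var/lib/etcd-restored
- Update the etcd-data volume hostPath from /var/lib/etcd to /var/lib/etcd-restored
- Update the matching volumeMounts mountPath to /var/lib/etcd-restored as well, so the path given to --data-dir is actually mounted inside the container (alternatively, change only the hostPath and leave --data-dir and mountPath at /var/lib/etcd)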
# in the etcd pod spec
volumes:
- name: etcd-data
  hostPath:
    path: /var/lib/etcd-restored # updated path
    type: DirectoryOrCreate
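The manifest edit can also be scripted. The sketch below rewrites every occurrence of the old data directory (the --data-dir flag, the mountPath, and the hostPath) and prints the result to stdout, so nothing changes until you review it. `retarget_data_dir` is a hypothetical helper, and the old path is treated as a regular expression, so it suits plain paths like the defaults here.

```shell
# Print the manifest with the old data directory replaced by the new one.
# Usage: retarget_data_dir <old-dir> <new-dir> <manifest>
# Hypothetical helper; <old-dir> is used as a regex, so avoid special characters.
retarget_data_dir() {
  local old="$1" new="$2" manifest="$3"
  # Match the old path only when it ends the value (end of line, whitespace,
  # or a quote), so /var/lib/etcd-restored is not rewritten a second time.
  sed -E "s#${old}([[:space:]\"']|\$)#${new}\1#g" "$manifest"
}

# Usage: write to a temporary file outside the manifests directory, review, then replace.
#   retarget_data_dir /var/lib/etcd /var/lib/etcd-restored \
#     /etc/kubernetes/manifests/etcd.yaml > /tmp/etcd.yaml
#   diff /etc/kubernetes/manifests/etcd.yaml /tmp/etcd.yaml
#   mv /tmp/etcd.yaml /etc/kubernetes/manifests/etcd.yaml
```

Keeping the scratch copy outside /etc/kubernetes/manifests/ is deliberate: kubelet may try to treat extra files in that directory as static pod manifests.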
3. Wait for etcd to restart
- kubelet watches /etc/kubernetes/manifests/ and will automatically restart the etcd pod
- It may take a minute for the API server to come back
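Instead of retrying by hand while the control plane comes back, a small polling loop can wait for it. This is a sketch with a hypothetical helper name, `wait_until`; the two-second interval and the choice of `kubectl get nodes` as the probe are assumptions.

```shell
# Re-run a command every 2 seconds until it succeeds or the timeout (seconds) expires.
# Usage: wait_until <timeout-seconds> <command> [args...]
# Hypothetical helper; returns 0 on success, 1 on timeout.
wait_until() {
  local timeout_s="$1"
  shift
  local waited=0
  until "$@" >/dev/null 2>&1; do
    # Give up once the time budget is used.
    [ "$waited" -ge "$timeout_s" ] && return 1
    sleep 2
    waited=$((waited + 2))
  done
}

# Usage: wait up to two minutes for the API server to answer again.
#   wait_until 120 kubectl get nodes && echo "API server is back"
```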
4. Verify the restore
kubectl get pods -A
Useful Commands
# check etcd pod is running
kubectl get pods -n kube-system | grep etcd
# check etcd version
kubectl describe pod etcd-<control-plane> -n kube-system | grep Image
# list etcd members
ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
member list
# check if etcdctl is installed
etcdctl version
Notes
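- Always use ETCDCTL_API=3: the version 2 API is deprecated. etcdctl 3.4 and later already default to v3, so there the variable is redundant but harmless.
- The cert paths may vary depending on your cluster setup, so always check the etcd pod manifest first.
- When restoring, use a new --data-dir to avoid corrupting the existing data.
- On etcd 3.5 and later, the snapshot status and snapshot restore subcommands of etcdctl are deprecated in favor of the etcdutl tool; check which one your version provides.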