Cluster Architecture

HA Control Plane

High-availability control plane topologies, load balancer setup, and multi-master kubeadm configuration.

Overview

  • A single control plane node is a single point of failure — if it goes down, existing workloads keep running, but the cluster can no longer be managed
  • HA control plane runs multiple control plane nodes (minimum 3) behind a load balancer
  • etcd requires an odd number of members (3 or 5) for quorum — majority must agree to commit writes
  • Quorum formula: floor(n/2) + 1 — a 3-member cluster tolerates 1 failure, a 5-member cluster tolerates 2
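The quorum arithmetic can be checked with a quick shell loop (shell integer division gives the floor):

```shell
# quorum = floor(n/2) + 1; tolerated failures = n - quorum
for n in 1 3 5 7; do
  quorum=$(( n / 2 + 1 ))
  echo "members=$n quorum=$quorum tolerates=$(( n - quorum ))"
done
# members=3 quorum=2 tolerates=1
# members=5 quorum=3 tolerates=2
```

This also shows why even sizes are avoided: a 4-member cluster needs quorum 3, so it still tolerates only 1 failure while adding another node that can fail.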

HA Topologies

|               | Stacked etcd                                        | External etcd                                      |
|---------------|-----------------------------------------------------|----------------------------------------------------|
| Description   | etcd runs on the same nodes as control plane components | etcd runs on dedicated separate nodes          |
| Minimum nodes | 3 control plane nodes                               | 3 control plane + 3 etcd nodes (6 total)           |
| Pros          | Simpler to set up and manage; fewer nodes needed    | etcd failures don't directly impact control plane nodes; etcd can be scaled independently |
| Cons          | Losing a node loses both a control plane member and an etcd member | More infrastructure required; more complex setup |
| Use case      | Most clusters; the default kubeadm HA setup         | Large production clusters requiring maximum resilience |

Load Balancer Requirements

  • All API server instances must be fronted by a load balancer on port 6443
  • The LB address becomes the --control-plane-endpoint used by all nodes to reach the API
  • Health check endpoint: https://<api-server>:6443/healthz
  • Must be a TCP (layer 4) load balancer or an HTTPS load balancer that passes TLS through — plain HTTP won't work, since the API server speaks TLS on 6443

Common options:

| Option   | Type        | Notes                                                            |
|----------|-------------|------------------------------------------------------------------|
| HAProxy  | External LB | Traditional, well-documented for k8s HA                          |
| kube-vip | Virtual IP  | Runs as static pod on CP nodes, no external infra needed         |
| Cloud LB | External LB | AWS NLB/ALB, GCP LB, Azure LB — managed by cloud provider        |

Setting Up HA with kubeadm

1. Set up the load balancer

Configure your load balancer to point to all control plane node IPs on port 6443.
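As one option, a minimal HAProxy configuration for this might look like the following sketch — TCP passthrough on 6443 with basic health checks; the backend IPs and server names are placeholders for your control plane nodes:

```
# /etc/haproxy/haproxy.cfg (fragment)
frontend kube-apiserver
    bind *:6443
    mode tcp
    option tcplog
    default_backend kube-apiserver

backend kube-apiserver
    mode tcp
    option tcp-check          # layer-4 health check against port 6443
    balance roundrobin
    server cp1 10.0.0.11:6443 check
    server cp2 10.0.0.12:6443 check
    server cp3 10.0.0.13:6443 check
```

Because this is TCP passthrough, the API server's own TLS certificates are presented to clients unchanged, so no certificate handling is needed on the load balancer.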

2. Initialize the first control plane node

sudo kubeadm init \
  --control-plane-endpoint "LOAD_BALANCER_IP:6443" \
  --upload-certs \
  --pod-network-cidr=10.244.0.0/16

  • --control-plane-endpoint — stable address (LB IP/DNS) all nodes use to reach the API server
  • --upload-certs — encrypts and uploads certificates to a kubeadm-certs Secret so other CP nodes can retrieve them

Save the output — it contains the join commands for both control plane and worker nodes.

3. Install a CNI plugin

# example: Calico
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/calico.yaml

4. Join additional control plane nodes

sudo kubeadm join LOAD_BALANCER_IP:6443 \
  --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane \
  --certificate-key <certificate-key>

  • --control-plane — tells kubeadm this node joins as a control plane member (not a worker)
  • --certificate-key — key to decrypt the certificates uploaded in step 2

5. Join worker nodes

sudo kubeadm join LOAD_BALANCER_IP:6443 \
  --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>

Verifying HA Setup

# all control plane nodes should show role control-plane and be Ready
kubectl get nodes

# check etcd cluster health (run on a control plane node)
sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  member list -w table

# check etcd endpoint status (leader, DB size, etc.)
sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint status -w table

Useful Commands

# regenerate join token (expires after 24h)
kubeadm token create --print-join-command

# regenerate certificate key for adding new control plane nodes (uploaded certs expire after 2 hours)
sudo kubeadm init phase upload-certs --upload-certs

# check certificate expiration
sudo kubeadm certs check-expiration

# verify all control plane pods are running
kubectl get pods -n kube-system -l tier=control-plane

# check kube-apiserver endpoints
kubectl get endpoints kubernetes -o yaml