Cluster Architecture
HA Control Plane
High-availability control plane topologies, load balancer setup, and multi-master kubeadm configuration.
Overview
- A single control plane node is a single point of failure — if it goes down, the cluster cannot be managed
- HA control plane runs multiple control plane nodes (minimum 3) behind a load balancer
- etcd requires an odd number of members (3 or 5) for quorum — majority must agree to commit writes
- Quorum formula: floor(n/2) + 1 — a 3-member cluster tolerates 1 failure, a 5-member cluster tolerates 2
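The quorum arithmetic above can be checked with a quick shell loop (integer division gives the floor):

```shell
# quorum = floor(n/2) + 1; fault tolerance = n - quorum
for n in 1 3 5 7; do
  quorum=$(( n / 2 + 1 ))
  echo "members=$n quorum=$quorum tolerates=$(( n - quorum )) failures"
done
```

Note that even member counts buy nothing: 4 members still only tolerate 1 failure (quorum is 3), which is why 3 or 5 is the recommendation.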
HA Topologies
| | Stacked etcd | External etcd |
|---|---|---|
| Description | etcd runs on the same nodes as control plane components | etcd runs on dedicated separate nodes |
| Minimum nodes | 3 control plane nodes | 3 control plane + 3 etcd nodes (6 total) |
| Pros | Simpler to set up and manage, fewer nodes needed | etcd failures don't directly impact control plane nodes, can scale etcd independently |
| Cons | Losing a node loses both a CP member and an etcd member | More infrastructure required, more complex setup |
| Use case | Most clusters, default kubeadm HA setup | Large production clusters requiring maximum resilience |
Load Balancer Requirements
- All API server instances must be fronted by a load balancer on port 6443
- The LB address becomes the `--control-plane-endpoint` used by all nodes to reach the API
- Health check endpoint: `https://<api-server>:6443/healthz`
- Must be a TCP or HTTPS (TLS passthrough) load balancer, not a plain HTTP one
Common options:
| Option | Type | Notes |
|---|---|---|
| HAProxy | External LB | Traditional, well-documented for k8s HA |
| kube-vip | Virtual IP | Runs as static pod on CP nodes, no external infra needed |
| Cloud LB | External LB | AWS NLB/ALB, GCP LB, Azure LB — managed by cloud provider |
Setting Up HA with kubeadm
1. Set up the load balancer
Configure your load balancer to point to all control plane node IPs on port 6443.
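As a sketch, a minimal HAProxy configuration doing TCP passthrough to three control plane nodes (the `10.0.0.x` addresses and server names are placeholders for your environment):

```haproxy
frontend kube-apiserver
    bind *:6443
    mode tcp
    option tcplog
    default_backend kube-apiserver-backend

backend kube-apiserver-backend
    mode tcp
    balance roundrobin
    option tcp-check
    server cp1 10.0.0.11:6443 check
    server cp2 10.0.0.12:6443 check
    server cp3 10.0.0.13:6443 check
```

TCP mode keeps TLS intact end-to-end, so the API server's own certificate (which must include the LB address in its SANs) is presented to clients unchanged.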
2. Initialize the first control plane node
sudo kubeadm init \
--control-plane-endpoint "LOAD_BALANCER_IP:6443" \
--upload-certs \
--pod-network-cidr=10.244.0.0/16
- `--control-plane-endpoint` — stable address (LB IP/DNS) all nodes use to reach the API server
- `--upload-certs` — encrypts and uploads certificates to a kubeadm-certs Secret so other CP nodes can retrieve them
Save the output — it contains the join commands for both control plane and worker nodes.
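The same init flags can also be expressed declaratively. A sketch of an equivalent kubeadm config file (using kubeadm's v1beta3 API):

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "LOAD_BALANCER_IP:6443"
networking:
  podSubnet: "10.244.0.0/16"
```

Then run `sudo kubeadm init --config kubeadm-config.yaml --upload-certs` (note `--upload-certs` stays on the command line; it is a flag, not a ClusterConfiguration field).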
3. Install a CNI plugin
# example: Calico
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/calico.yaml
4. Join additional control plane nodes
sudo kubeadm join LOAD_BALANCER_IP:6443 \
--token <token> \
--discovery-token-ca-cert-hash sha256:<hash> \
--control-plane \
--certificate-key <certificate-key>
- `--control-plane` — tells kubeadm this node joins as a control plane member (not a worker)
- `--certificate-key` — key to decrypt the certificates uploaded in step 2
5. Join worker nodes
sudo kubeadm join LOAD_BALANCER_IP:6443 \
--token <token> \
--discovery-token-ca-cert-hash sha256:<hash>
Verifying HA Setup
# all control plane nodes should show role control-plane and be Ready
kubectl get nodes
# check etcd cluster health (run on a control plane node)
sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
member list -w table
# check etcd endpoint status (leader, DB size, etc.)
sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
endpoint status -w table
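The repeated TLS flags can be wrapped in a small helper (a convenience sketch; `etcdctl_ha` is a name invented here, and the cert paths are kubeadm's defaults):

```shell
# wrapper that supplies kubeadm's default etcd endpoint and cert paths
# (run as root on a control plane node)
etcdctl_ha() {
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    "$@"
}
```

Usage: `etcdctl_ha member list -w table` or `etcdctl_ha endpoint health`.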
Useful Commands
# regenerate join token (expires after 24h)
kubeadm token create --print-join-command
# regenerate certificate key for adding new control plane nodes
sudo kubeadm init phase upload-certs --upload-certs
# check certificate expiration
sudo kubeadm certs check-expiration
# verify all control plane pods are running
kubectl get pods -n kube-system -l tier=control-plane
# check kube-apiserver endpoints
kubectl get endpoints kubernetes -o yaml