Cluster Architecture
Monitoring
metrics-server installation, kubectl top for node and pod resource usage.
metrics-server
Overview
- Cluster-wide aggregator of resource usage data (CPU and memory)
- Collects metrics from kubelets on each node
- Required for kubectl top and Horizontal Pod Autoscaler (HPA) to work
- Does not store historical data; provides current point-in-time metrics only
Installing metrics-server
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
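In test/lab environments, you may need to add --kubelet-insecure-tls to the metrics-server deployment args. This is typically needed when the kubelet serving certificates are self-signed, so metrics-server cannot verify them. Do not use this flag in production.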
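One way to add --kubelet-insecure-tls without editing the manifest by hand is a JSON patch against the Deployment. This is a minimal sketch, not the only approach: the function name is hypothetical, and it assumes the upstream components.yaml layout, where the Deployment is called metrics-server in kube-system and its first container already has an args list.

```shell
# Hypothetical helper: append --kubelet-insecure-tls to the metrics-server args.
# Assumes the Deployment is named metrics-server in kube-system and that
# containers[0] already has an args list (as in the upstream components.yaml).
add_kubelet_insecure_tls() {
  kubectl patch deployment metrics-server -n kube-system --type=json \
    -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'
}

# Usage (lab clusters only, since this skips kubelet certificate verification):
#   add_kubelet_insecure_tls
#   kubectl rollout status deployment/metrics-server -n kube-system
```

The patch triggers a new rollout, so metrics may take a minute or so to appear afterwards.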
Verifying metrics-server
# check if metrics-server pod is running
kubectl get pods -n kube-system | grep metrics-server
# check if the API is available
kubectl get apiservices | grep metrics
# should show v1beta1.metrics.k8s.io with AVAILABLE=True
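Instead of polling with grep, the same two checks can be made to block until they succeed. The sketch below is an assumption-laden convenience wrapper (the function name and the default timeout are made up for illustration); it relies on kubectl rollout status and on kubectl wait with the APIService Available condition.

```shell
# Hypothetical helper: block until metrics-server is rolled out and its
# APIService reports the Available condition, or fail after the timeout.
wait_for_metrics_api() {
  _timeout="${1:-120s}"   # illustrative default, adjust as needed
  kubectl rollout status deployment/metrics-server -n kube-system --timeout="$_timeout" &&
    kubectl wait --for=condition=Available apiservice/v1beta1.metrics.k8s.io --timeout="$_timeout"
}

# Usage:
#   wait_for_metrics_api 60s && kubectl top nodes
```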
kubectl top
Node metrics
# show CPU and memory usage for all nodes
kubectl top nodes
# sort by CPU usage
kubectl top nodes --sort-by=cpu
# sort by memory usage
kubectl top nodes --sort-by=memory
Pod metrics
# show CPU and memory usage for pods in current namespace
kubectl top pods
# show metrics for all namespaces
kubectl top pods -A
# show metrics for a specific pod
kubectl top pod <pod-name>
# show container-level metrics
kubectl top pod <pod-name> --containers
# sort by CPU usage
kubectl top pods --sort-by=cpu
# sort by memory usage
kubectl top pods --sort-by=memory
# show metrics for pods with a specific label
kubectl top pods -l app=nginx
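The output of kubectl top is plain columns, so it can be post-processed with standard tools. As a sketch, the hypothetical helper below totals CPU per namespace. It assumes the input comes from kubectl top pods -A --no-headers, with columns NAMESPACE, NAME, CPU(cores), MEMORY(bytes), and that CPU is printed in millicores with an "m" suffix.

```shell
# Hypothetical helper: total CPU (millicores) per namespace.
# Reads `kubectl top pods -A --no-headers` output on stdin. Assumes column 1
# is the namespace and column 3 is CPU in millicores, for example "12m".
cpu_by_namespace() {
  awk '{ sub(/m$/, "", $3); total[$1] += $3 }
       END { for (ns in total) printf "%s %dm\n", ns, total[ns] }' | sort
}

# Usage:
#   kubectl top pods -A --no-headers | cpu_by_namespace
```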
Resource Metrics Pipeline
How it works
kubelet (cAdvisor) ──> metrics-server ──> Metrics API (metrics.k8s.io)
│
kubectl top / HPA
- Each kubelet has a built-in cAdvisor that collects CPU and memory usage for containers
- metrics-server scrapes the kubelet /metrics/resource endpoint on every node
- metrics-server exposes the aggregated data through the Metrics API (metrics.k8s.io)
- kubectl top and HPA both consume the Metrics API
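To see what metrics-server itself scrapes, the kubelet /metrics/resource endpoint can be read through the API server's node proxy. This is a sketch with a hypothetical function name; it assumes your user is allowed to access the nodes/proxy subresource.

```shell
# Hypothetical helper: fetch the raw kubelet resource metrics for one node
# through the API server's node proxy (needs access to nodes/proxy).
kubelet_resource_metrics() {
  _node="$1"
  kubectl get --raw "/api/v1/nodes/${_node}/proxy/metrics/resource"
}

# Usage (output is Prometheus text format):
#   kubelet_resource_metrics <node-name> | grep cpu_usage_seconds_total
```

If this call returns data but kubectl top does not, the problem is more likely between metrics-server and the kubelet (for example TLS verification) than in the kubelet itself.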
Querying the Metrics API directly
# node metrics
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes | jq .
# pod metrics across all namespaces
kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods | jq .
# pod metrics for a specific namespace
kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods | jq .
# check if the Metrics API is available
kubectl api-resources | grep metrics
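The raw responses are verbose, so a jq filter helps reduce them to the fields of interest. The sketch below uses a hypothetical helper name and assumes the node list response has items[].metadata.name and items[].usage.cpu / usage.memory. The values are Kubernetes quantity strings, so they may use different units from the kubectl top output.

```shell
# Hypothetical helper: print "name cpu memory" for each node from a
# NodeMetricsList JSON document read on stdin. Requires jq.
node_usage_table() {
  jq -r '.items[] | "\(.metadata.name) \(.usage.cpu) \(.usage.memory)"'
}

# Usage:
#   kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes | node_usage_table
```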
Container Logging
How logging works
- Containers write to stdout and stderr
- The container runtime captures these streams and writes them to log files on the node
- kubectl logs reads these files via the kubelet (the request goes through the API server to the kubelet on the pod's node)
Log file locations on the node
| Log | Location |
|---|---|
| Container logs (pods) | /var/log/pods/<namespace>_<pod>_<uid>/<container>/ |
| Container logs (symlinks) | /var/log/containers/<pod>_<namespace>_<container>-<id>.log |
| kubelet | /var/log/kubelet.log or journalctl -u kubelet |
| kube-proxy | /var/log/kube-proxy.log or journalctl -u kube-proxy |
| containerd | journalctl -u containerd |
| kube-apiserver | /var/log/kube-apiserver.log (or pod logs if kubeadm) |
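When reading the container log files directly on a node, each line carries a prefix added by the runtime. The sketch below assumes the CRI log format, where a line is a timestamp, the stream (stdout or stderr), a tag (F for a full line, P for a partial one), and then the message. The helper name is hypothetical.

```shell
# Hypothetical helper: print only the message part of CRI-format log lines
# read from stdin, optionally keeping a single stream (stdout or stderr).
# Assumed line layout: <timestamp> <stream> <tag> <message>
cri_log_messages() {
  _stream="${1:-}"
  awk -v want="$_stream" '
    want == "" || $2 == want {
      # Drop the first three space-separated fields, keep the rest verbatim.
      line = $0
      for (i = 0; i < 3; i++) sub(/^[^ ]* /, "", line)
      print line
    }'
}

# Usage:
#   sudo cat /var/log/containers/<pod>_<namespace>_<container>-<id>.log | cri_log_messages stderr
```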
kubectl logs
# basic usage
kubectl logs <pod-name>
# follow (stream) logs
kubectl logs <pod-name> -f
# show only the last N lines
kubectl logs <pod-name> --tail=100
# show logs since a duration
kubectl logs <pod-name> --since=1h
# show logs from a previous (crashed) container
kubectl logs <pod-name> --previous
# show logs from a specific container in a multi-container pod
kubectl logs <pod-name> -c <container-name>
# show logs from all containers in a pod
kubectl logs <pod-name> --all-containers=true
# show logs for all pods matching a label (with a selector, only the last 10 lines per pod are shown unless --tail is set)
kubectl logs -l app=nginx
# show logs for all pods matching a label with timestamps
kubectl logs -l app=nginx --timestamps=true
Log rotation
- kubelet manages container log rotation via two settings:
  - containerLogMaxSize: max size per log file (default 10Mi)
  - containerLogMaxFiles: max number of log files per container (default 5)
- These can be set in the kubelet config file (/var/lib/kubelet/config.yaml)
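To check whether a node overrides the rotation defaults, it is enough to look for the two keys in the kubelet config file. A minimal sketch, with a hypothetical helper name and the config path above as the default argument:

```shell
# Hypothetical helper: print the log rotation settings found in a kubelet
# config file, or note that the defaults apply when neither key is set.
show_log_rotation() {
  _config="${1:-/var/lib/kubelet/config.yaml}"
  grep -E '^[[:space:]]*containerLogMax(Size|Files):' "$_config" ||
    echo "not set in $_config (defaults apply)"
}

# Usage (on the node):
#   show_log_rotation
```

After changing either value, the kubelet has to be restarted (for example with sudo systemctl restart kubelet) for the new setting to take effect.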
crictl logs
On the node, you can also view container logs directly with crictl:
# view logs for a container
sudo crictl logs <container-id>
# follow logs
sudo crictl logs -f <container-id>
# show last N lines
sudo crictl logs --tail=50 <container-id>
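crictl logs needs a container ID, which usually means running crictl ps first. The sketch below chains the two steps. The helper name and the default of 50 lines are illustrative, and it assumes crictl ps supports the --name filter and -q (IDs only) options.

```shell
# Hypothetical helper: show the last lines of logs for every running
# container whose name matches the given pattern.
crictl_logs_by_name() {
  _name="$1"
  _lines="${2:-50}"
  for _id in $(sudo crictl ps --name "$_name" -q); do
    echo "== $_id"
    sudo crictl logs --tail="$_lines" "$_id"
  done
}

# Usage (on the node):
#   crictl_logs_by_name nginx 100
```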
Useful Commands
# quick check — is metrics-server working?
kubectl top nodes
kubectl top pods -A
# check metrics-server logs if not working
kubectl logs -n kube-system deployment/metrics-server
# query Metrics API directly
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes | jq .
# view pod logs
kubectl logs <pod-name> -f --tail=100
# view previous container logs (after a crash)
kubectl logs <pod-name> --previous
# view logs for all pods with a label
kubectl logs -l app=nginx --all-containers=true
# check container logs on node
sudo crictl logs <container-id>
# check kubelet logs on node
journalctl -u kubelet --no-pager -n 100