2026-03-11 Cluster Architecture

Monitoring

metrics-server installation, kubectl top for node and pod resource usage, and container logging with kubectl logs and crictl.

metrics-server

Overview

Cluster-wide aggregator of resource usage data (CPU and memory)
Collects metrics from kubelets on each node
Required for kubectl top and for Horizontal Pod Autoscaler (HPA) scaling on CPU or memory
Does not store historical data; it serves only the most recent point-in-time metrics

Installing metrics-server

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

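In test/lab environments the kubelet serving certificates are often self-signed, so metrics-server cannot verify them and never becomes ready. In that case you may need to add --kubelet-insecure-tls to the metrics-server deployment args.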
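One way to add the flag without editing the manifest by hand is a JSON Patch against the deployment. The sketch below assumes the upstream components.yaml layout, where metrics-server is the first container and already has an args list; the variable and function names are only illustrative.

```shell
# JSON Patch: append the flag to the args of container 0.
# Assumption: metrics-server is the first container and already defines args,
# as in the upstream components.yaml.
PATCH='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'

# Hypothetical helper; it only wraps the kubectl call.
allow_insecure_kubelet_tls() {
  kubectl -n kube-system patch deployment metrics-server --type=json -p "$PATCH"
}
```

Calling allow_insecure_kubelet_tls triggers a new rollout of the deployment. The flag disables verification of the kubelet serving certificate, so it is best kept to lab clusters.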

Verifying metrics-server

# check if metrics-server pod is running
kubectl get pods -n kube-system | grep metrics-server

# check if the API is available
# (should show v1beta1.metrics.k8s.io with AVAILABLE=True)
kubectl get apiservices | grep metrics
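Right after installation the pod and the API can take a short while to become ready. Instead of re-running the checks by hand, a script can block until both are up. This is a minimal sketch; the function name and the 120s timeouts are arbitrary choices, not values from metrics-server.

```shell
# Wait for the deployment rollout, then for the aggregated API to report
# the Available condition. Returns non-zero if either step times out.
wait_for_metrics_api() {
  kubectl -n kube-system rollout status deployment/metrics-server --timeout=120s &&
    kubectl wait --for=condition=Available apiservice/v1beta1.metrics.k8s.io --timeout=120s
}
```

Once wait_for_metrics_api returns successfully, kubectl top should start returning data after the first scrape completes.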

kubectl top

Node metrics

# show CPU and memory usage for all nodes
kubectl top nodes

# sort by CPU usage
kubectl top nodes --sort-by=cpu

# sort by memory usage
kubectl top nodes --sort-by=memory

Pod metrics

# show CPU and memory usage for pods in current namespace
kubectl top pods

# show metrics for all namespaces
kubectl top pods -A

# show metrics for a specific pod
kubectl top pod <pod-name>

# show container-level metrics
kubectl top pod <pod-name> --containers

# sort by CPU usage
kubectl top pods --sort-by=cpu

# sort by memory usage
kubectl top pods --sort-by=memory

# show metrics for pods with a specific label
kubectl top pods -l app=nginx
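The table output of kubectl top is easy to post-process. The sketch below filters pods at or above a CPU threshold; it assumes the four-column layout of kubectl top pods -A --no-headers (namespace, name, CPU, memory), and the function name is made up for illustration.

```shell
# Print pods whose CPU usage is at or above a threshold given in millicores.
# Reads `kubectl top pods -A --no-headers` output on stdin:
#   NAMESPACE  NAME  CPU(cores)  MEMORY(bytes)
pods_over_cpu() {
  awk -v limit="$1" '
    {
      cpu = $3
      if (cpu ~ /m$/) { sub(/m$/, "", cpu) }   # "250m" -> 250
      else            { cpu = cpu * 1000 }      # whole cores -> millicores
      if (cpu + 0 >= limit + 0) print $1 "/" $2, $3
    }'
}
```

Example usage: kubectl top pods -A --no-headers | pods_over_cpu 500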

Resource Metrics Pipeline

How it works

kubelet (cAdvisor) ──> metrics-server ──> Metrics API (metrics.k8s.io)
                                              │
                                    kubectl top / HPA

Each kubelet has a built-in cAdvisor that collects CPU and memory usage for containers
metrics-server scrapes the kubelet /metrics/resource endpoint on every node
metrics-server exposes the aggregated data through the Metrics API (metrics.k8s.io)
kubectl top and HPA both consume the Metrics API

Querying the Metrics API directly

# node metrics
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes | jq .

# pod metrics across all namespaces
kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods | jq .

# pod metrics for a specific namespace
kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods | jq .

# check if the Metrics API is available
kubectl api-resources | grep metrics
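The raw responses are verbose, so a jq filter helps when only the usage figures are needed. The following sketch flattens a pod metrics list into one line per container; the function name is hypothetical, and it assumes the response follows the PodMetricsList shape (items, metadata, containers, usage).

```shell
# Reads a PodMetricsList (JSON) on stdin and prints one line per container:
#   <namespace>/<pod> <container> <cpu> <memory>
container_usage() {
  jq -r '.items[] | . as $pod | .containers[]
         | "\($pod.metadata.namespace)/\($pod.metadata.name) \(.name) \(.usage.cpu) \(.usage.memory)"'
}
```

Example usage: kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods | container_usage. The values are printed exactly as the API returns them, as Kubernetes quantity strings, so they usually look different from the rounded figures kubectl top shows.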

Container Logging

How logging works

Containers write to stdout and stderr
The container runtime captures these streams and writes them to log files on the node
kubectl logs reads these files via the kubelet

Log file locations on the node

Log	Location
Container logs (pods)	`/var/log/pods/<namespace>_<pod>_<uid>/<container>/`
Container logs (symlinks)	`/var/log/containers/<pod>_<namespace>_<container>-<id>.log`
kubelet	`/var/log/kubelet.log` or `journalctl -u kubelet`
kube-proxy	`/var/log/kube-proxy.log` or `journalctl -u kube-proxy`
containerd	`journalctl -u containerd`
kube-apiserver	`/var/log/kube-apiserver.log` (on kubeadm clusters it runs as a static pod, so read it with `kubectl logs -n kube-system` or under `/var/log/pods/`)
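When browsing /var/log/containers on a node, it helps to map a symlink name back to its pod. The sketch below splits a file name using the pattern from the table above; the function name is illustrative. It relies on pod and namespace names never containing underscores and on the container ID never containing a hyphen.

```shell
# Split /var/log/containers/<pod>_<namespace>_<container>-<id>.log into parts.
parse_container_log_name() {
  local base rest pod ns container id
  base="${1##*/}"           # drop the directory
  base="${base%.log}"       # drop the extension
  pod="${base%%_*}"         # text before the first underscore
  rest="${base#*_}"
  ns="${rest%%_*}"          # text between the first and second underscore
  rest="${rest#*_}"
  container="${rest%-*}"    # everything before the last hyphen
  id="${rest##*-}"          # container ID after the last hyphen
  printf 'pod=%s namespace=%s container=%s id=%s\n' "$pod" "$ns" "$container" "$id"
}
```

Example usage: parse_container_log_name /var/log/containers/<pod>_<namespace>_<container>-<id>.log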

kubectl logs

# basic usage
kubectl logs <pod-name>

# follow (stream) logs
kubectl logs <pod-name> -f

# show only the last N lines
kubectl logs <pod-name> --tail=100

# show logs since a duration
kubectl logs <pod-name> --since=1h

# show logs from a previous (crashed) container
kubectl logs <pod-name> --previous

# show logs from a specific container in a multi-container pod
kubectl logs <pod-name> -c <container-name>

# show logs from all containers in a pod
kubectl logs <pod-name> --all-containers=true

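# show logs for all pods matching a label
# (with a selector only the last 10 lines per pod are shown by default; add --tail=-1 for everything)
kubectl logs -l app=nginx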

# show logs for all pods matching a label with timestamps
kubectl logs -l app=nginx --timestamps=true
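When logs from several pods are combined, it is hard to tell which pod wrote a given line. The --prefix flag tags each line with its pod and container, which makes a simple grep across a whole label selector workable. A minimal sketch follows; the function name, the one-hour window, and the search term are arbitrary choices.

```shell
# Print lines containing "error" (case-insensitive) from the last hour,
# across all pods matching the label selector in $1.
# --prefix tags each line with its source pod and container;
# --tail=-1 lifts the 10-line default that applies when a selector is used.
recent_errors() {
  kubectl logs -l "$1" --all-containers=true --prefix --since=1h --tail=-1 |
    grep -i 'error'
}
```

Example usage: recent_errors app=nginx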

Log rotation

The kubelet manages container log rotation via two settings:
- containerLogMaxSize: max size of a log file before it is rotated (default 10Mi)
- containerLogMaxFiles: max number of log files kept per container (default 5)
These can be set in the kubelet config file (/var/lib/kubelet/config.yaml on kubeadm clusters)
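To see whether a node overrides the defaults, the two keys can be looked up in the kubelet config file. This is a sketch only: the function name is hypothetical, and it assumes the keys appear as top-level entries in the file, as they do in a KubeletConfiguration.

```shell
# Show the log rotation settings from a kubelet config file.
# $1 is the config path; defaults to the usual kubeadm location.
show_log_rotation() {
  local cfg="${1:-/var/lib/kubelet/config.yaml}"
  grep -E '^containerLogMax(Size|Files):' "$cfg" \
    || echo "not set in $cfg; kubelet defaults apply (10Mi, 5 files)"
}
```

With the defaults, a container keeps roughly 5 x 10Mi = 50Mi of logs on the node at most. After changing either value, restart the kubelet (sudo systemctl restart kubelet) so the new settings take effect.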

crictl logs

On the node, you can also view container logs directly with crictl:

# view logs for a container
sudo crictl logs <container-id>

# follow logs
sudo crictl logs -f <container-id>

# show last N lines
sudo crictl logs --tail=50 <container-id>
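The commands above need a container ID, which crictl ps can look up by name. A small sketch, with a made-up function name; -a includes exited containers, which is useful after a crash.

```shell
# Print the IDs of containers (running or exited) whose name matches $1.
container_ids() {
  sudo crictl ps -a --name "$1" -q
}
```

Example usage: sudo crictl logs --tail=50 "$(container_ids nginx | head -n 1)"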

Useful Commands

# quick check — is metrics-server working?
kubectl top nodes
kubectl top pods -A

# check metrics-server logs if not working
kubectl logs -n kube-system deployment/metrics-server

# query Metrics API directly
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes | jq .

# view pod logs
kubectl logs <pod-name> -f --tail=100

# view previous container logs (after a crash)
kubectl logs <pod-name> --previous

# view logs for all pods with a label
kubectl logs -l app=nginx --all-containers=true

# check container logs on node
sudo crictl logs <container-id>

# check kubelet logs on node
journalctl -u kubelet --no-pager -n 100