Introduction
This article is part of an ongoing series designed to help you prepare for the Certified Kubernetes Application Developer (CKAD) exam through small, focused labs.
This article continues our exploration of the “Application Observability and Maintenance” domain. We’re covering the requirement:
Use built-in CLI tools to monitor Kubernetes applications
To keep things brief and focused, this article covers monitoring tools specifically. Container logs and debugging techniques will be covered in separate dedicated articles.
As usual, you can start from the beginning of the series here: CKAD Preparation — What is Kubernetes.
Also, a quick personal note: I’ve been quite busy lately with work deadlines and commitments, so publishing has slowed down quite a bit, but I’m planning to get back on track and share new labs very soon!
Prerequisites
A running Kubernetes cluster (Minikube, Docker Desktop, or one of the KillerCoda Kubernetes Playgrounds) and basic familiarity with Pods, Deployments, and Services.
Also, for this lab you’ll need the metrics-server installed in your cluster.
If you’re using a KillerCoda playground:
k apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
If you’re using Minikube, enable the metrics-server addon:
minikube addons enable metrics-server
For other clusters, follow the metrics-server installation guide.
Once the metrics-server is installed, verify it’s working:
k top no
If you get the error “Metrics API not available”, inspect the logs of the metrics-server:
k logs -n kube-system deployment/metrics-server
If you have a TLS certificate validation error like this:
x509: cannot validate certificate for 172.30.1.2 because it doesn't contain any IP SANs" node="controlplane"
Depending on the environment, you may need to skip TLS verification. You can patch the metrics-server deployment to add the --kubelet-insecure-tls argument:
k patch deployment metrics-server -n kube-system --type='json' -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'
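For context, the patch path /spec/template/spec/containers/0/args/- uses the JSON Patch “append to array” syntax: the trailing - means “add at the end of the args list” of the first container. Conceptually, the container spec ends up looking like this (a sketch; the pre-existing args vary by metrics-server release):

```yaml
containers:
  - name: metrics-server
    args:
      # ...existing metrics-server args (vary by release)...
      - --kubelet-insecure-tls   # appended by the "/args/-" patch path
```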
After a few moments, try k top no again. You should see resource usage metrics.
Getting the Resources
Clone the lab repository and navigate to this article’s folder:
git clone https://github.com/SupaaHiro/schwifty-lab.git
cd schwifty-lab/blog-posts/20261231-ckad
Understanding Kubernetes Built-in CLI Monitoring Tools
Kubernetes provides several built-in kubectl commands for monitoring and observability:
- kubectl get: List and inspect resources with various output formats
- kubectl describe: Get detailed information about resources and their events
- kubectl get events: Monitor cluster-wide or namespace events
- kubectl top: View resource utilization (CPU/Memory) for nodes and pods
- kubectl port-forward: Access applications locally for monitoring and metrics
Let’s explore these tools with hands-on examples that demonstrate real-world monitoring scenarios.
Hands-On Challenge: Mastering Kubernetes CLI Monitoring Tools
Step 1: Kubectl Get - Resource Status and Listing
The kubectl get command is your primary tool for quickly viewing the status of resources in your cluster. Let’s explore its various capabilities.
Create a sample application with multiple resources:
Create the file manifests/01-sample-app.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  labels:
    app: web-app
    env: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
        env: production
        version: v1
    spec:
      containers:
        - name: nginx
          image: nginx:1.27
          ports:
            - containerPort: 80
          resources:
            limits:
              memory: 128Mi
              cpu: 100m
            requests:
              memory: 64Mi
              cpu: 50m
---
apiVersion: v1
kind: Service
metadata:
  name: web-app-svc
  labels:
    app: web-app
spec:
  selector:
    app: web-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-app
  labels:
    app: api-app
    env: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api-app
  template:
    metadata:
      labels:
        app: api-app
        env: production
        version: v2
    spec:
      containers:
        - name: api
          image: httpd:2.4
          ports:
            - containerPort: 80
          resources:
            limits:
              memory: 128Mi
              cpu: 100m
            requests:
              memory: 64Mi
              cpu: 50m
---
apiVersion: v1
kind: Service
metadata:
  name: api-app-svc
  labels:
    app: api-app
spec:
  selector:
    app: api-app
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 80
  type: ClusterIP
Apply the manifest:
k apply -f manifests/01-sample-app.yaml
Wait for the deployments to be ready:
k rollout status deployment/web-app
k rollout status deployment/api-app
Basic kubectl get commands:
View all pods:
k get po
View pods with more details:
k get po -o wide
This shows additional columns like IP address, node, nominated node, and readiness gates.
Filter by labels:
k get po -l app=web-app
k get po -l env=production
k get po -l 'app in (web-app,api-app)'
View multiple resource types:
k get deployments,services,pods
Watch resources in real-time:
k get po --watch
Open another terminal and scale a deployment to see changes:
k scale deployment/web-app --replicas=5
Press Ctrl+C to stop watching.
Custom output formats:
JSON format (useful for automation):
k get po -o json | head -n 30
YAML format:
k get po <pod-name> -o yaml
Custom columns:
k get po -o custom-columns=NAME:.metadata.name,STATUS:.status.phase,NODE:.spec.nodeName,IP:.status.podIP
Extract specific fields with jsonpath:
k get po -o jsonpath='{.items[*].metadata.name}'
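Note that {.items[*].metadata.name} prints all values space-separated on a single line. A common trick is piping the result through tr to get one name per line, shown here on simulated output (the pod names below are hypothetical; yours will differ):

```shell
# Simulated output of: k get po -o jsonpath='{.items[*].metadata.name}'
names="web-app-7d4f8-abc12 web-app-7d4f8-def34 api-app-66b9c-xyz56"

# Turn the space-separated list into one name per line
echo "$names" | tr ' ' '\n'
```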
Get pod names and their container images:
k get po -o custom-columns=POD:.metadata.name,CONTAINERS:.spec.containers[*].name,IMAGES:.spec.containers[*].image
Sort output:
k get po --sort-by=.metadata.creationTimestamp
k get po --sort-by=.status.startTime
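The --sort-by flag takes a jsonpath expression and sorts client-side. Timestamp fields like .metadata.creationTimestamp sort chronologically because their RFC 3339 format (fixed-width, UTC) also sorts lexicographically, which we can simulate with plain sort:

```shell
# RFC 3339 timestamps (the format kubectl stores in .metadata.creationTimestamp)
# sort chronologically when sorted as plain strings:
printf '%s\n' "2025-03-02T10:00:00Z" "2025-03-01T09:30:00Z" "2025-03-01T18:45:00Z" | sort
```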
Show labels:
k get po --show-labels
k get po -L app,env,version
Step 2: Kubectl Describe - Deep Resource Inspection
The describe command provides comprehensive information about resources, including configuration, status, and events.
Describe a deployment:
k describe deploy web-app
Look for these important sections:
- Replicas: Desired, current, updated, available
- StrategyType: Deployment strategy (RollingUpdate, Recreate)
- Pod Template: Container specifications
- Conditions: Deployment health status
- Events: Recent operations
Describe a pod:
POD_NAME=$(k get po -l app=web-app -o jsonpath='{.items[0].metadata.name}')
k describe po $POD_NAME
Key sections to examine:
- Labels and Annotations: Metadata attached to the pod
- Status: Current pod phase (Pending, Running, Succeeded, Failed)
- IP: Pod’s assigned IP address
- Controlled By: Parent resource (ReplicaSet, DaemonSet, etc.)
- Containers: State, image, ports, resource limits/requests
- Conditions: PodScheduled, Initialized, ContainersReady, Ready
- Events: Chronological list of operations
Describe a service:
k describe svc web-app-svc
Important information:
- Type: ClusterIP, NodePort, LoadBalancer
- IP: Cluster IP address
- Port: Port mappings
- Endpoints: Pod IPs that the service routes to
- Session Affinity: Whether client sessions stick to the same pod
Let’s create a scenario with issues to see how describe helps troubleshoot:
Create the file manifests/02-problematic-app.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: problematic-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: problematic-app
  template:
    metadata:
      labels:
        app: problematic-app
    spec:
      containers:
        - name: app
          image: nginx:invalid-tag-9999
          ports:
            - containerPort: 80
          resources:
            limits:
              memory: 128Mi
              cpu: 100m
            requests:
              memory: 64Mi
              cpu: 50m
Apply it:
k apply -f manifests/02-problematic-app.yaml
Check the pods:
k get po -l app=problematic-app
You’ll see ImagePullBackOff or ErrImagePull status. Describe a failing pod:
k describe pod -l app=problematic-app | grep -A 10 Events
The events clearly show “Failed to pull image” and the specific error, making troubleshooting straightforward.
Delete the problematic deployment:
k delete -f manifests/02-problematic-app.yaml
Step 3: Kubectl Get Events - Cluster-Wide Event Monitoring
Events provide a timeline of what’s happening in your cluster. They’re invaluable for understanding the sequence of operations and identifying issues.
View all events in the current namespace:
k get events
Sort events by timestamp (oldest first, most recent last):
k get events --sort-by='.lastTimestamp'
Filter events by object type:
k get events --field-selector involvedObject.kind=Pod
k get events --field-selector involvedObject.kind=Deployment
Filter events by specific object:
k get events --field-selector involvedObject.name=web-app
Filter by event type:
k get events --field-selector type=Warning
k get events --field-selector type=Normal
Watch events in real-time:
k get events --watch
In another terminal, perform some operations:
k scale deployment/web-app --replicas=1
k scale deployment/web-app --replicas=3
You’ll see events like:
- ScalingReplicaSet
- SuccessfulCreate
- Started
Combine filters:
k get events --field-selector type=Warning,involvedObject.kind=Pod --sort-by='.lastTimestamp'
Let’s create a scenario that generates various events:
Create the file manifests/03-resource-constraints.yaml:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: namespace-quota
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 2Gi
    limits.cpu: "4"
    limits.memory: 4Gi
    pods: "10"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: resource-intensive
spec:
  replicas: 8
  selector:
    matchLabels:
      app: resource-intensive
  template:
    metadata:
      labels:
        app: resource-intensive
    spec:
      containers:
        - name: app
          image: nginx:1.27
          resources:
            requests:
              memory: 512Mi
              cpu: 500m
            limits:
              memory: 1Gi
              cpu: 1000m
Apply it:
k apply -f manifests/03-resource-constraints.yaml
Check the events:
k get events --sort-by='.lastTimestamp' | grep -i quota
You’ll see events about exceeding resource quotas. Check how many pods were created:
k get po -l app=resource-intensive
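You can predict the outcome with quota arithmetic: each replica requests 500m CPU and 512Mi memory, while the quota caps total requests at 2 CPUs and 2Gi. A quick check, using the values from the manifest above (and ignoring the web-app and api-app pods from Step 1, which also count against the same namespace quota and may reduce the number further):

```shell
# CPU: quota of 2 CPUs = 2000m; each pod requests 500m
echo $(( 2000 / 500 ))   # -> 4 pods fit under the CPU request quota

# Memory: quota of 2Gi = 2048Mi; each pod requests 512Mi
echo $(( 2048 / 512 ))   # -> 4 pods fit under the memory request quota
```

So at most 4 of the 8 requested replicas can be created; the remaining creations fail with quota events.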
Describe the ReplicaSet to see why some pods weren’t created:
k describe rs -l app=resource-intensive
Delete the resource-intensive deployment and quota:
k delete -f manifests/03-resource-constraints.yaml
Step 4: Kubectl Top - Resource Utilization Monitoring
The top command shows real-time CPU and memory usage. It requires metrics-server to be installed.
Create pods with varying resource usage patterns:
Create the file manifests/04-resource-demo.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: idle-pod
  labels:
    app: resource-demo
    load: idle
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "while true; do sleep 30; done"]
      resources:
        limits:
          memory: 64Mi
          cpu: 50m
        requests:
          memory: 32Mi
          cpu: 25m
---
apiVersion: v1
kind: Pod
metadata:
  name: cpu-intensive
  labels:
    app: resource-demo
    load: high-cpu
spec:
  containers:
    - name: app
      image: busybox:1.36
      command:
        - sh
        - -c
        - |
          while true; do
            echo "Computing..." > /dev/null
          done
      resources:
        limits:
          memory: 64Mi
          cpu: 200m
        requests:
          memory: 32Mi
          cpu: 100m
---
apiVersion: v1
kind: Pod
metadata:
  name: memory-intensive
  labels:
    app: resource-demo
    load: high-memory
spec:
  containers:
    - name: app
      image: busybox:1.36
      command:
        - sh
        - -c
        - |
          dd if=/dev/zero of=/tmp/data bs=2M count=50
          while true; do sleep 10; done
      resources:
        limits:
          memory: 128Mi
          cpu: 50m
        requests:
          memory: 64Mi
          cpu: 25m
---
apiVersion: v1
kind: Pod
metadata:
  name: balanced-load
  labels:
    app: resource-demo
    load: balanced
spec:
  containers:
    - name: app
      image: nginx:1.27
      resources:
        limits:
          memory: 128Mi
          cpu: 100m
        requests:
          memory: 64Mi
          cpu: 50m
Apply the manifest:
k apply -f manifests/04-resource-demo.yaml
Wait for all pods to be running:
k get po -l app=resource-demo
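A note on what these pods do: cpu-intensive spins in a tight shell loop, so it climbs toward its 200m CPU limit, while memory-intensive uses dd to write bs=2M × count=50 = 100MB into /tmp. Roughly speaking, that write is charged to the container's memory cgroup via the page cache, so it shows up under the pod's 128Mi limit:

```shell
# Size written by the memory-intensive pod's dd command (in MB):
# bs=2M (2 MB per block) * count=50 blocks
echo $(( 2 * 50 ))   # -> 100
```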
View node resource usage:
k top no
This shows:
- CPU usage (cores and percentage)
- Memory usage (bytes and percentage)
View pod resource usage:
k top po
Filter by labels:
k top po -l app=resource-demo
Sort by CPU usage:
k top po --sort-by=cpu
Sort by memory usage:
k top po --sort-by=memory
Show resource usage for all containers:
k top po --containers
Compare actual usage with requests and limits:
Create a simple script to compare:
echo "=== Resource Usage vs Requests/Limits ==="
printf "%-30s %-12s %-12s %-15s %-15s %-15s %-15s\n" \
"POD" "CPU_USED" "MEM_USED" "CPU_REQUEST" "MEM_REQUEST" "CPU_LIMIT" "MEM_LIMIT"
printf "%-30s %-12s %-12s %-15s %-15s %-15s %-15s\n" \
"----------------------------" "----------" "----------" \
"------------" "------------" "----------" "----------"
k top po -l app=resource-demo --no-headers | while read pod cpu mem; do
  req_cpu=$(k get po "$pod" -o jsonpath='{.spec.containers[0].resources.requests.cpu}')
  req_mem=$(k get po "$pod" -o jsonpath='{.spec.containers[0].resources.requests.memory}')
  lim_cpu=$(k get po "$pod" -o jsonpath='{.spec.containers[0].resources.limits.cpu}')
  lim_mem=$(k get po "$pod" -o jsonpath='{.spec.containers[0].resources.limits.memory}')
  printf "%-30s %-12s %-12s %-15s %-15s %-15s %-15s\n" \
    "$pod" "$cpu" "$mem" "$req_cpu" "$req_mem" "$lim_cpu" "$lim_mem"
done
It will output something like this (the usage numbers will vary on your cluster):
=== Resource Usage vs Requests/Limits ===
POD                            CPU_USED     MEM_USED     CPU_REQUEST     MEM_REQUEST     CPU_LIMIT       MEM_LIMIT
----------------------------   ----------   ----------   ------------    ------------    ----------      ----------
balanced-load                  1m           3Mi          50m             64Mi            100m            128Mi
cpu-intensive                  199m         1Mi          100m            32Mi            200m            64Mi
idle-pod                       0m           0Mi          25m             32Mi            50m             64Mi
memory-intensive               1m           103Mi        25m             64Mi            50m             128Mi
This helps identify potential issues, like pods approaching their limits (OOM or throttling) or being over/under-provisioned.
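To reason about “approaching the limit”, it helps to normalize kubectl’s quantity strings into plain numbers. A small sketch with hypothetical helper functions, assuming CPU values are either millicores like 90m or whole cores like 1, and memory values use the Mi suffix (the formats used in this lab):

```shell
# Convert a CPU quantity to millicores ("250m" -> 250, "1" -> 1000)
cpu_to_m() { case "$1" in *m) echo "${1%m}" ;; *) echo $(( $1 * 1000 )) ;; esac; }

# Strip the Mi suffix from a memory quantity ("48Mi" -> 48)
mem_to_mi() { echo "${1%Mi}"; }

# Integer percentage of usage vs limit
pct() { echo $(( $1 * 100 / $2 )); }

# Example: a pod using 90m CPU against a 100m limit is at 90% (throttling territory)
pct "$(cpu_to_m 90m)" "$(cpu_to_m 100m)"      # -> 90
pct "$(mem_to_mi 96Mi)" "$(mem_to_mi 128Mi)"  # -> 75
```

Feeding the CPU_USED/CPU_LIMIT columns from the script above through pct quickly flags pods running hot.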
Monitor over time:
watch -n 2 'kubectl top po -l app=resource-demo'
Press Ctrl+C to stop watching.
Wrapping Up: What We’ve Covered
In this article, we explored Kubernetes built-in CLI monitoring tools as part of the “Application Observability and Maintenance” domain for CKAD preparation.
We focused specifically on monitoring capabilities, covering both theoretical concepts and extensive practical implementations:
- kubectl get: Resource listing and status checking with custom output formats, label filtering, and real-time watching
- kubectl describe: Deep resource inspection including configuration details, status conditions, and event history
- kubectl get events: Cluster-wide event monitoring with filtering by type, object, and timestamp for understanding operational sequences
- kubectl top: Real-time resource utilization monitoring for nodes and pods, essential for capacity planning and performance analysis
In the next article, we will dive into container logs analysis and debugging techniques using kubectl logs, kubectl exec, and kubectl debug.
Final Cleanup
To clean up all resources created in this lab:
k delete -f manifests/
If you enabled metrics-server on Minikube specifically for this lab, you can disable it:
minikube addons disable metrics-server