What is Kubernetes?
The Problem Kubernetes Solves
Docker Compose runs containers on a single machine. If that machine dies, everything on it goes down. If traffic spikes, you can only scale as far as that one machine allows. And deploying a new version usually means stopping the old containers first, so there is a window of downtime.
Kubernetes solves this by running containers across a cluster of machines and managing them automatically:
- If a container crashes, K8s restarts it
- If a node (machine) fails, K8s reschedules the containers on other nodes
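- When you deploy a new version, K8s replaces containers gradually, which allows zero-downtime deploys as long as health checks are configured
- When traffic grows, K8s can add replicas; when it drops, it can remove them (automatically, if you configure autoscaling)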
Architecture
A Kubernetes cluster has two types of machines:
Control Plane (formerly called "Master")
The brain of the cluster. It doesn't run your app — it manages the cluster:
| Component | Role |
|---|---|
| API Server | The entry point for all K8s commands (kubectl talks to this) |
| etcd | Distributed key-value store holding all cluster state |
| Scheduler | Decides which node to place a new pod on |
| Controller Manager | Watches cluster state and reconciles it toward the desired state |
Worker Nodes
The machines that actually run your containers:
| Component | Role |
|---|---|
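| kubelet | Agent on each node; receives pod specs from the control plane and makes sure those containers are running |
| kube-proxy | Maintains network rules on the node so that traffic sent to a Service reaches its pods |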
| Container runtime | Runs containers (containerd, CRI-O) |
Core Objects
Pod
The smallest deployable unit in Kubernetes: one or more containers that share a network namespace (a single IP address and localhost) and can share storage volumes. Most pods run a single container.
apiVersion: v1
kind: Pod
metadata:
  name: my-api
spec:
  containers:
    - name: api
      image: myapp:latest
      ports:
        - containerPort: 3000
Pods are ephemeral, so you rarely create them directly. Instead, you let a controller such as a Deployment create and replace them for you.
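To make the "shared network and storage" point concrete, here is a minimal sketch of a two-container pod. The pod name, the `log-tailer` sidecar, the `busybox:1.36` image, and the log paths are all assumptions for illustration; it also assumes the app writes its log to `/var/log/app/app.log`.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-api-with-sidecar        # hypothetical name
spec:
  volumes:
    - name: shared-logs
      emptyDir: {}                 # scratch volume that lives as long as the pod
  containers:
    - name: api
      image: myapp:v2.1.0
      ports:
        - containerPort: 3000
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/app  # assumed location of the app's log file
    - name: log-tailer             # hypothetical sidecar container
      image: busybox:1.36
      # Create the file if it does not exist yet, then stream it to stdout.
      command: ["sh", "-c", "touch /logs/app.log && tail -f /logs/app.log"]
      volumeMounts:
        - name: shared-logs
          mountPath: /logs         # same volume, mounted at a different path
```

Both containers see the same files through the `shared-logs` volume, and because they share one network namespace, the sidecar could also reach the API at `localhost:3000`.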
Deployment
A Deployment declares the desired state: run 3 replicas of this container image. K8s ensures that state is always met — if a pod dies, a new one is created.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-api
  template:
    metadata:
      labels:
        app: my-api
    spec:
      containers:
        - name: api
          image: myapp:v2.1.0
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "128Mi"
              cpu: "250m"
            limits:
              memory: "256Mi"
              cpu: "500m"
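Gradual replacement of pods is controlled by the Deployment's update strategy, and it only avoids downtime if Kubernetes can tell when a new pod is ready for traffic. The fragment below is a sketch of the fields that could be added to the Deployment above. The `/healthz` path is an assumption (the app must actually serve such an endpoint), and the timing values are arbitrary starting points.

```yaml
# Fragment of a Deployment spec, not a complete manifest.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra pod above the replica count during a rollout
      maxUnavailable: 0    # never drop below the desired number of ready pods
  template:
    spec:
      containers:
        - name: api
          image: myapp:v2.1.0
          readinessProbe:
            httpGet:
              path: /healthz   # hypothetical health endpoint
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
```

With `maxUnavailable: 0`, an old pod is only terminated after a new one passes its readiness probe, at the cost of briefly running one pod more than the replica count.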
Service
Pods get a new IP address whenever they are recreated. A Service provides a stable IP and DNS name that routes traffic to whatever pods match its selector.
apiVersion: v1
kind: Service
metadata:
  name: my-api
spec:
  selector:
    app: my-api  # routes to pods with this label
  ports:
    - port: 80
      targetPort: 3000
  type: ClusterIP  # internal only (use LoadBalancer to expose externally)
Service types:
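- ClusterIP: internal cluster traffic only (the default)
- NodePort: exposes the Service on a static port on every node
- LoadBalancer: provisions a cloud load balancer (on EKS, typically a Classic or Network Load Balancer; an ALB is created through an Ingress instead)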
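As a sketch of how the type changes the manifest, here is the same Service exposed as a NodePort. The name `my-api-nodeport` and the port number `30080` are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-api-nodeport   # hypothetical name
spec:
  type: NodePort
  selector:
    app: my-api           # same label selector as the ClusterIP example
  ports:
    - port: 80            # port on the Service's cluster IP
      targetPort: 3000    # port the container listens on
      nodePort: 30080     # must fall in the node port range (30000-32767 by default)
```

If `nodePort` is omitted, Kubernetes picks a free port from that range. Switching `type` to `LoadBalancer` (and dropping `nodePort`) would instead ask the cloud provider for an external load balancer.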
kubectl — The K8s CLI
kubectl get pods # list pods
kubectl get deployments # list deployments
kubectl describe pod my-api-abc123 # detailed info about a pod
kubectl logs my-api-abc123 # pod logs
kubectl exec -it my-api-abc123 -- sh # shell into a pod
kubectl apply -f deployment.yaml # create/update resources
kubectl delete -f deployment.yaml # delete resources
kubectl scale deployment my-api --replicas=5 # scale up
kubectl rollout status deployment my-api # watch a rolling update
kubectl rollout undo deployment my-api # roll back to previous version
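`kubectl scale` sets the replica count by hand. Scaling with traffic can instead be declared with a HorizontalPodAutoscaler. The sketch below assumes a metrics source such as metrics-server is installed in the cluster. The replica bounds and the 70% target are arbitrary example values.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-api
spec:
  scaleTargetRef:          # the workload whose replica count is adjusted
    apiVersion: apps/v1
    kind: Deployment
    name: my-api
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the pods' CPU request
```

Utilization is measured against each container's CPU request, so this only works for pods that set `resources.requests.cpu`, as the Deployment above does.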
Desired State Reconciliation
The most important concept in Kubernetes is desired state. You declare what you want (for example, 3 replicas of myapp:v2.1.0) and K8s continuously reconciles the actual state toward it.
If a pod crashes → K8s creates a new one. If a node dies → K8s reschedules those pods on other nodes. If you change the image → K8s rolls out new pods and terminates old ones gradually.
This is fundamentally different from imperative systems where you issue commands and hope they stick.
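One way to see the two sides of reconciliation is in the object itself: `spec` holds what you asked for, and `status` holds what the controllers currently observe. The excerpt below is hand-written for illustration, with made-up values; it mirrors the shape of what `kubectl get deployment my-api -o yaml` returns, trimmed to the relevant fields.

```yaml
# Illustrative excerpt of a Deployment object; the values are invented.
spec:
  replicas: 3            # desired state: what you declared
status:
  replicas: 3            # pods that currently exist for this Deployment
  updatedReplicas: 3     # pods running the current pod template
  readyReplicas: 2       # observed state: one pod is not ready yet
  availableReplicas: 2
```

The controllers keep working until the status fields match the spec, and they resume whenever the two drift apart again.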
Getting Started Locally
The easiest way to run K8s locally:
# Docker Desktop — enable Kubernetes in settings
# or
minikube start
# Verify
kubectl get nodes
# NAME       STATUS   ROLES           AGE   VERSION
# minikube   Ready    control-plane   10s   v1.28.0