Kubernetes Basics
Kubernetes manages containers at scale. You tell it what you want -- three replicas of this container, a load balancer in front of them, restart if they crash -- and Kubernetes makes it happen. The mental model is declarative: you describe the desired state in a manifest file, submit it to the cluster, and the control plane continuously works to reconcile reality with your declaration.
Why Kubernetes Exists
Running one container on one server is simple. Running hundreds of containers across dozens of servers is not. You need to answer questions that Docker alone cannot:
- Which server has enough CPU and memory to run this container?
- What happens when a server dies? Who restarts the containers?
- How do containers find each other across servers?
- How do you update a service without downtime?
- How do you scale up when traffic spikes and scale down when it drops?
Kubernetes answers all of these. It was originally developed at Google, drawing on their internal system Borg, open-sourced in 2014, and donated to the Cloud Native Computing Foundation (CNCF) in 2015. It is now the industry standard for container orchestration.
The Control Plane
A Kubernetes cluster has two parts: the control plane and the worker nodes.
Control Plane:
+-------------------+
| API Server        |  <-- All communication goes through here
| etcd              |  <-- Distributed key-value store (cluster state)
| Scheduler         |  <-- Assigns pods to nodes
| Controller Manager|  <-- Runs control loops (reconciliation)
+-------------------+
Worker Nodes:
+-------------------+   +-------------------+
| kubelet           |   | kubelet           |
| Container Runtime |   | Container Runtime |
| kube-proxy        |   | kube-proxy        |
| [Pod] [Pod] [Pod] |   | [Pod] [Pod]       |
+-------------------+   +-------------------+
The API Server is the single entry point. Every kubectl command, every internal component, and every controller talks to the API server. It validates requests and writes state to etcd.
etcd is the source of truth. It stores the entire cluster state: what pods should be running, their current status, configuration data, and secrets.
The Scheduler watches for newly created pods with no assigned node, evaluates which node has enough resources, and binds the pod to a node.
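As a rough illustration of what binding means, the trimmed Pod below shows the two fields involved: the resource requests the scheduler uses when filtering nodes, and spec.nodeName, which is filled in once a node has been chosen. The pod name demo and the node name worker-1 are hypothetical, and real objects carry many more fields.

```yaml
# Illustrative, trimmed view of a Pod after scheduling.
# The names "demo" and "worker-1" are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  nodeName: worker-1        # empty until the scheduler binds the pod to a node
  containers:
    - name: demo
      image: ghcr.io/org/my-app:v1.0.0
      resources:
        requests:           # only nodes with this much unreserved capacity are candidates
          cpu: "100m"
          memory: "128Mi"
```

A pod that stays in the Pending phase with an empty nodeName usually means no node satisfied these requests.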
The Controller Manager runs control loops. Each controller watches a specific resource type and takes action to move the current state toward the desired state. The Deployment controller, for example, ensures the right number of pod replicas are running.
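One way to picture reconciliation is to compare the spec and status halves of a single object. The sketch below is a trimmed, made-up snapshot in the style of the output of kubectl get deployment api-server -o yaml, taken at a moment when one of three pods is not ready; the numbers are illustrative, not captured from a real cluster.

```yaml
# Illustrative snapshot only; the status values are invented for the example.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3               # desired state, written by you
status:
  replicas: 3               # observed state, written by the controller
  readyReplicas: 2
  availableReplicas: 2
  unavailableReplicas: 1    # the gap the control loop keeps working to close
```

The controllers keep acting until the observed counts in status match what spec asks for.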
Core Resources
Pods
A pod is the smallest deployable unit in Kubernetes. It is one or more containers that share a network namespace (same IP address) and storage volumes. In practice, most pods contain a single container.
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  labels:
    app: my-app
spec:
  containers:
    - name: my-app
      image: ghcr.io/org/my-app:v1.0.0
      ports:
        - containerPort: 8080
You almost never create pods directly. You create higher-level resources (Deployments) that manage pods for you.
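To make the shared network namespace and shared volumes concrete, here is a minimal sketch of a two-container pod. The log-shipper container, its image, and the mount paths are hypothetical and exist only to show the mechanics; emptyDir is a scratch volume that lives as long as the pod does.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app-with-sidecar   # hypothetical name
spec:
  volumes:
    - name: shared-logs
      emptyDir: {}            # scratch volume shared by both containers
  containers:
    - name: my-app
      image: ghcr.io/org/my-app:v1.0.0
      ports:
        - containerPort: 8080
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/app
    - name: log-shipper
      image: ghcr.io/org/log-shipper:v0.1.0   # hypothetical image
      volumeMounts:
        - name: shared-logs
          mountPath: /logs
          readOnly: true
```

Because both containers share one IP address, the log-shipper container could also reach my-app at localhost:8080.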
Deployments
A Deployment declares the desired state for a set of pods: which image to run, how many replicas, and how to update them. The Deployment controller creates a ReplicaSet, which in turn creates the pods.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
  labels:
    app: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api-server
          image: ghcr.io/org/api-server:v1.2.3
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
Deployment
└── ReplicaSet
    ├── Pod (api-server-abc12)
    ├── Pod (api-server-def34)
    └── Pod (api-server-ghi56)
When you update the Deployment (change the image tag, for example), Kubernetes creates a new ReplicaSet, gradually scales it up, and scales down the old one. This is a rolling update.
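How gradually the rollout proceeds is tunable on the Deployment itself. The fragment below is a sketch of the relevant fields; the values are illustrative rather than recommendations. If the strategy block is omitted, Kubernetes uses RollingUpdate with 25% for both settings.

```yaml
# Fragment of a Deployment spec; values are illustrative.
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most 1 pod above the desired replica count during the update
      maxUnavailable: 0    # never drop below the desired number of available pods
```

With these values and three replicas, a fourth pod is started first, and an old pod is removed only once the new one is available.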
ReplicaSets
A ReplicaSet ensures a specified number of pod replicas are running at any time. If a pod crashes, the ReplicaSet creates a new one. You rarely interact with ReplicaSets directly -- Deployments manage them.
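For reference, a ReplicaSet manifest looks much like the Deployment that owns it. The sketch below approximates what the Deployment controller generates; the name suffix is hypothetical, and a real generated object also carries a pod-template-hash label and an owner reference that are omitted here.

```yaml
# Approximation of a controller-generated ReplicaSet, shown for illustration only.
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: api-server-7d9f8c6b5   # hypothetical; real names are <deployment-name>-<pod-template-hash>
  labels:
    app: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api-server
          image: ghcr.io/org/api-server:v1.2.3
```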
Services
A Service provides a stable network endpoint for a set of pods. Pods are ephemeral: when a pod is replaced, the new pod gets a different IP address. A Service gives the set a stable virtual IP and DNS name and load-balances traffic across the pods that are ready.
apiVersion: v1
kind: Service
metadata:
  name: api-server
spec:
  selector:
    app: api-server
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP
Service types:
ClusterIP (default): Internal only. Other pods reach it via api-server.default.svc.cluster.local
NodePort: Exposes the service on each node's IP at a static port (30000-32767)
LoadBalancer: Provisions a cloud load balancer (AWS ELB/NLB, GCP LB, etc.)
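Inside the cluster, any pod in the same namespace can reach the api-server Service at http://api-server:80; pods in other namespaces use the namespace-qualified name, for example http://api-server.default:80. The Service routes each connection to one of the three pods in the Deployment.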
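For comparison with the ClusterIP manifest, here is a sketch of the same Service exposed as a NodePort. The Service name and the chosen port number are hypothetical; if nodePort is left out, Kubernetes picks a free port from the range for you.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: api-server-nodeport   # hypothetical name
spec:
  type: NodePort
  selector:
    app: api-server
  ports:
    - port: 80                # port on the Service's cluster IP
      targetPort: 8080        # port the container listens on
      nodePort: 30080         # must fall in the NodePort range (30000-32767 by default)
```

Traffic sent to port 30080 on any node's IP is then forwarded to one of the matching pods.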
Declarative vs Imperative
Kubernetes supports both styles, but declarative is the standard for production.
# Imperative: tell Kubernetes what to do step by step
kubectl create deployment api-server --image=ghcr.io/org/api-server:v1.2.3
kubectl scale deployment api-server --replicas=3
kubectl expose deployment api-server --port=80 --target-port=8080
# Declarative: describe the desired state, apply it
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
The declarative approach is better because:
- Manifest files are version-controlled (Git)
- Changes are reviewable in pull requests
- The cluster state is reproducible -- apply the same manifests to a new cluster and get the same result
- Drift correction: if someone manually changes a field that the manifest sets, re-applying the manifest restores the declared value
The Manifest File
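A Kubernetes manifest is a YAML file. Most resources have four top-level fields: apiVersion, kind, metadata, and spec (a few kinds, such as ConfigMap, carry data in place of spec):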
apiVersion: apps/v1          # API group and version
kind: Deployment             # Resource type
metadata:                    # Name, labels, annotations
  name: api-server
  namespace: production
  labels:
    app: api-server
    team: backend
spec:                        # Desired state (varies by resource type)
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api-server
          image: ghcr.io/org/api-server:v1.2.3
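The manifest above targets the production namespace, which has to exist before the Deployment can be applied. A minimal sketch of the Namespace manifest, assuming nothing beyond the name is needed:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
```

Applying this file first (or placing it earlier in the same multi-document file) avoids a "namespace not found" error on the Deployment.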
Multiple resources can be defined in a single file, separated by ---:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  # ... deployment spec ...
---
apiVersion: v1
kind: Service
metadata:
  name: api-server
spec:
  # ... service spec ...
kubectl Basics
# Cluster info
kubectl cluster-info
kubectl get nodes
# Working with resources
kubectl get pods # List pods in default namespace
kubectl get pods -n production # List pods in a specific namespace
kubectl get pods -o wide # More detail (node, IP)
kubectl get all # Pods, services, deployments, replicasets
# Apply and delete
kubectl apply -f deployment.yaml # Create or update resources
kubectl delete -f deployment.yaml # Delete resources defined in file
# Debugging
kubectl describe pod api-server-abc12 # Detailed info, events, conditions
kubectl logs api-server-abc12 # Container logs
kubectl logs -f api-server-abc12 # Stream logs
kubectl exec -it api-server-abc12 -- /bin/sh # Shell into a container
# Quick checks
kubectl top pods # CPU and memory usage (requires metrics-server in the cluster)
kubectl get events --sort-by=.lastTimestamp
A Minimal Working Example
Deploy a simple web application with three replicas and a service:
# app.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello-app
  template:
    metadata:
      labels:
        app: hello-app
    spec:
      containers:
        - name: hello-app
          image: gcr.io/google-samples/hello-app:2.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "50m"
              memory: "64Mi"
            limits:
              cpu: "100m"
              memory: "128Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: hello-app
spec:
  selector:
    app: hello-app
  ports:
    - port: 80
      targetPort: 8080
  type: LoadBalancer
kubectl apply -f app.yaml
kubectl get pods -w # Watch pods come up
kubectl get service hello-app # Get the external IP
NAME        TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)
hello-app   LoadBalancer   10.96.45.123   34.120.67.89   80:31234/TCP
The application is now running on three pods, behind a load balancer, accessible at the external IP.
Common Pitfalls
- Not setting resource requests and limits -- Without them, a single pod can consume all resources on a node, starving other pods. Always set both requests (guaranteed) and limits (maximum).
- Using latest image tags -- Nodes cache images, and pushing a new image under the same latest tag does not change the manifest, so no rollout is triggered and pods may keep running the old image. Use specific version tags.
- Ignoring pod disruption budgets -- During node maintenance, Kubernetes evicts pods. Without a PodDisruptionBudget, all replicas of a service can be evicted simultaneously.
- Imperative management in production -- Running kubectl edit or kubectl scale directly means your cluster state diverges from your Git manifests. Always use kubectl apply with version-controlled files.
- Skipping namespaces -- Putting everything in the default namespace makes it hard to manage resources, set quotas, or control access as the cluster grows.
- Over-engineering for a single service -- If you have one service with predictable load, Kubernetes adds operational complexity without proportional benefit. Consider simpler alternatives first.
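Two of the pitfalls above, missing disruption budgets and unmanaged namespaces, have short manifest-level remedies. The sketch below assumes the api-server Deployment runs in a production namespace; the object names and every number are hypothetical and would need tuning for a real workload.

```yaml
# Keep at least 2 api-server pods running during voluntary disruptions such as node drains.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-server-pdb        # hypothetical name
  namespace: production
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: api-server
---
# Cap the total resources that all pods in the namespace may request or be limited to.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota      # hypothetical name
  namespace: production
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
```

Note that once a quota on CPU or memory is active in a namespace, pods that omit the corresponding requests or limits are rejected, which also enforces the first pitfall's advice.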
Key Takeaways
- Kubernetes is a declarative system: you describe what you want, and the control plane makes it happen
- The core resources are Pods (running containers), Deployments (managing pod replicas), and Services (stable network endpoints)
- Deployments manage ReplicaSets, which manage Pods -- you almost never create Pods directly
- Manifests are YAML files checked into Git. Use kubectl apply for all changes.
- Always set resource requests and limits, use specific image tags, and organize resources into namespaces
- The control plane (API server, etcd, scheduler, controller manager) continuously reconciles desired state with actual state