Real-World Kubernetes
The basics -- Pods, Deployments, Services -- get you started. Running Kubernetes in production requires a dozen more patterns that the tutorials skip. This file covers the resources and configurations you will actually use when operating services at scale: namespaces for isolation, ConfigMaps and Secrets for configuration, Ingress for routing, resource management, health probes, autoscaling, rolling updates, and persistent storage.
Namespaces
Namespaces partition a cluster into virtual sub-clusters. They provide isolation for resources, access control boundaries, and resource quota scopes.
kubectl create namespace production
kubectl create namespace staging
kubectl get namespaces
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    environment: production
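Because a namespace is also the scope for resource quotas, a quota is often one of the first objects created in it. The manifest below is a minimal sketch under stated assumptions: the name production-quota and every number in it are illustrative, not recommendations.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota      # hypothetical name
  namespace: production
spec:
  hard:
    requests.cpu: "20"        # sum of CPU requests across all pods (assumed value)
    requests.memory: 40Gi     # sum of memory requests (assumed value)
    limits.cpu: "40"          # sum of CPU limits (assumed value)
    limits.memory: 80Gi       # sum of memory limits (assumed value)
    pods: "50"                # maximum number of pods (assumed value)
```

Once a quota covers CPU or memory requests and limits, pods in that namespace that do not declare those values are rejected, so quotas work best alongside default requests and limits.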
Common namespace strategies:
- By environment: production, staging, development
- By team: team-payments, team-users, team-search
- By application: app-frontend, app-backend, app-data
- Combined: production-payments, staging-payments
Set a default namespace for your kubectl context so you do not have to type -n production on every command:
kubectl config set-context --current --namespace=production
ConfigMaps
ConfigMaps store non-sensitive configuration data as key-value pairs. They decouple configuration from container images, so you can change config without rebuilding.
apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
  namespace: production
data:
  LOG_LEVEL: "info"
  MAX_CONNECTIONS: "100"
  FEATURE_NEW_UI: "true"
Use ConfigMaps as environment variables or as mounted files:
# As environment variables
spec:
  containers:
    - name: api-server
      envFrom:
        - configMapRef:
            name: api-config

# As a mounted file
spec:
  containers:
    - name: api-server
      volumeMounts:
        - name: config-volume
          mountPath: /etc/config
  volumes:
    - name: config-volume
      configMap:
        name: api-config
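When a ConfigMap is mounted as a volume, the kubelet refreshes the mounted files after the ConfigMap changes, without restarting the pod (eventually consistent, typically within a minute). The application still has to re-read the files to pick up the change. Keys consumed as environment variables are read only when the container starts, so those changes require a pod restart.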
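When a container needs only one or two keys rather than the whole ConfigMap, a single key can be mapped to a named environment variable. A small sketch that reuses the api-config ConfigMap and the LOG_LEVEL key from above:

```yaml
spec:
  containers:
    - name: api-server
      env:
        - name: LOG_LEVEL            # variable name seen by the process
          valueFrom:
            configMapKeyRef:
              name: api-config       # ConfigMap defined earlier
              key: LOG_LEVEL         # key inside that ConfigMap
```

This keeps unrelated keys out of the container's environment, at the cost of listing each variable explicitly.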
Secrets
Secrets store sensitive data: API keys, database passwords, TLS certificates. They are base64-encoded (not encrypted) by default. Always enable encryption at rest in your cluster.
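On clusters where you operate the control plane yourself, encryption at rest is configured through a file that the API server reads via its --encryption-provider-config flag. The following is a minimal sketch, assuming the aescbc provider, a hypothetical key name key1, and a placeholder where the key material goes. Managed Kubernetes services typically expose this as a KMS setting instead, so treat the file as illustrative.

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1                                # hypothetical key name
              secret: <BASE64_ENCODED_32_BYTE_KEY>      # placeholder, supply your own key
      - identity: {}                                    # still allows reading Secrets written before encryption was enabled
```

Providers are tried in order: the first one is used for writes, and the trailing identity entry only keeps older unencrypted data readable.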
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
  namespace: production
type: Opaque
data:
  username: YXBw       # base64 of "app"
  password: czNjcjN0   # base64 of "s3cr3t"
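If encoding values by hand is error-prone, the same Secret can be written with the stringData field, which accepts plain text that the API server encodes on write. A sketch of the equivalent manifest, using the same illustrative credentials:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
  namespace: production
type: Opaque
stringData:
  username: app        # stored as base64 under data.username
  password: s3cr3t     # stored as base64 under data.password
```

stringData is write-only: reading the Secret back shows the values under data, base64-encoded. The plain-text form makes it even more important to keep such manifests out of version control.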
# Create a secret from the command line
kubectl create secret generic db-credentials \
  --from-literal=username=app \
  --from-literal=password=s3cr3t \
  -n production

# Use in a pod
spec:
  containers:
    - name: api-server
      env:
        - name: DB_USERNAME
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: username
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
In practice, many teams use the External Secrets Operator to sync secrets from HashiCorp Vault, AWS Secrets Manager, or similar tools into Kubernetes Secrets. This avoids storing secrets in manifests.
Ingress
A Service of type LoadBalancer provisions a separate cloud load balancer for each service. At scale, this gets expensive. Ingress provides HTTP routing for multiple services behind a single load balancer.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.example.com
        - app.example.com
      secretName: tls-cert
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-server
                port:
                  number: 80
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend
                port:
                  number: 80
An Ingress resource does nothing on its own. You need an Ingress Controller running in the cluster. The most common options are:
- ingress-nginx (the NGINX Ingress Controller): Most widely used, reliable, well-documented
- Traefik: Auto-configures from service annotations, good for simpler setups
- AWS Load Balancer Controller (formerly AWS ALB Ingress Controller): Native AWS integration, provisions ALBs from Ingress resources
- Istio Gateway: Part of the Istio service mesh, more features but more complexity
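Besides routing by host, as app-ingress does above, an Ingress can fan out by path on a single host. The sketch below assumes a hypothetical host shop.example.com and two hypothetical Services, catalog and checkout, each listening on port 80.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: shop-ingress              # hypothetical name
  namespace: production
spec:
  ingressClassName: nginx
  rules:
    - host: shop.example.com      # hypothetical host
      http:
        paths:
          - path: /catalog
            pathType: Prefix
            backend:
              service:
                name: catalog     # hypothetical Service
                port:
                  number: 80
          - path: /checkout
            pathType: Prefix
            backend:
              service:
                name: checkout    # hypothetical Service
                port:
                  number: 80
```

With pathType: Prefix, a request for /catalog/items matches the first rule. Whether the prefix is stripped before the request reaches the backend depends on the controller and its annotations.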
Resource Limits & Requests
Requests are the resources a container is guaranteed. Limits are the maximum it can use. The scheduler uses the sum of a pod's container requests to decide which node the pod runs on. On the node, limits are enforced per container: CPU use above the limit is throttled, and a container that exceeds its memory limit is killed.
spec:
  containers:
    - name: api-server
      resources:
        requests:
          cpu: "100m"       # 100 millicores = 0.1 CPU core
          memory: "128Mi"   # 128 mebibytes
        limits:
          cpu: "500m"       # Can burst up to 0.5 CPU core
          memory: "512Mi"   # Killed (OOMKilled) if exceeds 512Mi
What happens when:
- CPU exceeds limit: Pod is throttled (slowed down, not killed)
- Memory exceeds limit: Pod is OOMKilled and restarted
- No requests set: Pod can be scheduled on any node, may starve others
- No limits set: Pod can consume all node resources
Set requests based on typical usage and limits based on peak usage. Monitor actual consumption and adjust.
# Check actual resource usage
kubectl top pods -n production
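To guard against the "no requests set" and "no limits set" cases listed above, a namespace can supply defaults for containers that omit them. This is a minimal sketch using a LimitRange; the name is hypothetical, and the values simply mirror the earlier example rather than measured usage.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-container-resources   # hypothetical name
  namespace: production
spec:
  limits:
    - type: Container
      defaultRequest:      # applied when a container sets no requests
        cpu: 100m
        memory: 128Mi
      default:             # applied when a container sets no limits
        cpu: 500m
        memory: 512Mi
```

Defaults are applied when a pod is created, so existing pods keep whatever they were started with until they are recreated.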
Liveness & Readiness Probes
Probes tell Kubernetes whether a pod is healthy and whether it should receive traffic.
Liveness probe: Is the container still running correctly? If it fails, Kubernetes restarts the container. Use this to recover from deadlocks or corrupted state.
Readiness probe: Is the container ready to serve traffic? If it fails, Kubernetes removes the pod from the Service's endpoint list. Use this during startup (loading caches, warming connections) and when the pod is temporarily overloaded.
spec:
  containers:
    - name: api-server
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 15
        periodSeconds: 10
        failureThreshold: 3
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 5
        failureThreshold: 3
Startup:
  t=0s:  Container starts
  t=5s:  Readiness probe begins. Fails (app still loading).
         Pod not added to Service.
  t=10s: Readiness probe succeeds. Pod receives traffic.
  t=15s: Liveness probe begins. Succeeds.

Runtime:
  Readiness fails 3x -> Pod removed from Service (no new traffic)
  Liveness fails 3x  -> Container restarted
There is also a startup probe for containers that take a long time to start. It disables liveness and readiness checks until it succeeds.
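A startup probe for a slow-starting container might look like the sketch below. It reuses the /healthz path and port 8080 from the earlier example; the period and threshold are assumptions that together allow up to 300 seconds for startup.

```yaml
spec:
  containers:
    - name: api-server
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10
        failureThreshold: 30   # 30 attempts x 10s = up to 300s before the container is restarted
```

Once the startup probe succeeds it does not run again, and the liveness and readiness probes take over with their own, usually tighter, settings.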
Horizontal Pod Autoscaler
The HPA automatically adjusts the number of pod replicas based on observed metrics.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
kubectl get hpa -n production
NAME             REFERENCE               TARGETS            MINPODS   MAXPODS   REPLICAS
api-server-hpa   Deployment/api-server   45%/70%, 60%/80%   2         20        3
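For CPU and memory metrics, the HPA requires the Metrics Server to be installed in the cluster. Utilization targets are percentages of each container's resource requests, so the target pods must set requests. The HPA checks metrics every 15 seconds by default and adjusts replicas to keep the target metric at the specified value.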
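The adjustment follows the scaling rule given in the Kubernetes documentation. As a worked example, it is applied here to the sample output above (3 replicas, CPU at 45% against a 70% target, memory at 60% against an 80% target):

```latex
\[
\text{desiredReplicas} =
\left\lceil \text{currentReplicas} \times
\frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
\]
\[
\text{CPU: } \left\lceil 3 \times \tfrac{45}{70} \right\rceil = \lceil 1.93 \rceil = 2,
\qquad
\text{memory: } \left\lceil 3 \times \tfrac{60}{80} \right\rceil = \lceil 2.25 \rceil = 3
\]
```

When several metrics are configured, the HPA uses the largest result, so the Deployment stays at 3 replicas, which matches the REPLICAS column in the sample output.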
Rolling Updates & Rollbacks
Deployments perform rolling updates by default. When you change the pod template (update the image tag, change environment variables), Kubernetes gradually replaces old pods with new ones.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1         # At most 1 extra pod during update
      maxUnavailable: 0   # Never reduce below desired replicas
# Update the image
kubectl set image deployment/api-server api-server=ghcr.io/org/api-server:v1.3.0 -n production
# Watch the rollout
kubectl rollout status deployment/api-server -n production
# View rollout history
kubectl rollout history deployment/api-server -n production
# Roll back to the previous version
kubectl rollout undo deployment/api-server -n production
# Roll back to a specific revision
kubectl rollout undo deployment/api-server --to-revision=3 -n production
Rolling update with maxSurge=1, maxUnavailable=0, replicas=3:
Step 1: Create 1 new pod (v1.3.0). Total: 3 old + 1 new = 4 pods.
Step 2: New pod ready. Terminate 1 old pod. Total: 2 old + 1 new = 3 pods.
Step 3: Create 1 new pod. Total: 2 old + 2 new = 4 pods.
Step 4: New pod ready. Terminate 1 old pod. Total: 1 old + 2 new = 3 pods.
Step 5: Create 1 new pod. Total: 1 old + 3 new = 4 pods.
Step 6: New pod ready. Terminate last old pod. Total: 3 new = 3 pods.
Zero downtime throughout the process.
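A few other Deployment fields shape how the steps above play out and how far back a rollback can go. The fragment below is a sketch in which every value is an assumption for illustration, not a recommended setting.

```yaml
spec:
  replicas: 3
  revisionHistoryLimit: 5        # old ReplicaSets kept, which bounds how far "rollout undo" can go back
  minReadySeconds: 10            # a new pod must stay ready this long before it counts as available
  progressDeadlineSeconds: 300   # rollout is reported as failed after this long without progress
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
```

Exceeding progressDeadlineSeconds only marks the rollout as failed in the Deployment's status. It does not roll back automatically, so kubectl rollout undo is still a manual step or one for your pipeline.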
Persistent Volumes
Containers are ephemeral. When a pod restarts, its filesystem is reset. Persistent Volumes (PVs) and Persistent Volume Claims (PVCs) provide storage that survives pod restarts.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp3
  resources:
    requests:
      storage: 50Gi
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: production
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16-alpine
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: postgres-data
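Use StatefulSets (not Deployments) for stateful workloads that need stable network identities and persistent storage. When a StatefulSet declares volumeClaimTemplates, each replica gets its own PVC. The manifest above instead mounts one pre-created claim, which is workable only because it runs a single replica.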
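A per-replica variant of the StatefulSet replaces the shared claim with volumeClaimTemplates. This is a sketch under assumptions: it reuses the gp3 storage class and 50Gi size from above, takes the password from the db-credentials Secret shown earlier, and covers storage only. Three Postgres pods created this way are independent databases unless replication is configured separately.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: production
spec:
  serviceName: postgres
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16-alpine
          env:
            - name: POSTGRES_PASSWORD        # read by the postgres image on first start
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: password
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data                           # must match the volumeMounts name
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: gp3
        resources:
          requests:
            storage: 50Gi
```

The controller creates one claim per pod, named data-postgres-0, data-postgres-1, and data-postgres-2, and each pod is reattached to its own claim when it is rescheduled.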
Common Pitfalls
- Missing readiness probes -- Without readiness probes, Kubernetes sends traffic to pods that are still starting up, causing errors during deployments
- Liveness probe too aggressive -- Setting initialDelaySeconds too low or failureThreshold to 1 causes Kubernetes to restart healthy pods that are just slow to respond
- No resource limits -- A memory leak in one pod can OOMKill every pod on the node. Always set memory limits.
- Storing state in Deployments -- Deployments are for stateless workloads. Use StatefulSets for databases and other stateful services.
- Secrets in Git -- Base64 is not encryption. Never commit Kubernetes Secret manifests with real credentials. Use External Secrets Operator or sealed-secrets.
- Ignoring Pod Disruption Budgets -- During node upgrades, all pods on a node are evicted. Without a PDB, your service can go down entirely.
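For the last pitfall, a Pod Disruption Budget can be as small as the sketch below. It assumes the api-server pods carry the label app: api-server, which this document never shows, and that keeping two pods available is enough for the service. Both are assumptions to adapt.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-server-pdb        # hypothetical name
  namespace: production
spec:
  minAvailable: 2             # assumed floor; evictions that would go below it are refused
  selector:
    matchLabels:
      app: api-server         # assumed pod label
```

A PDB applies only to voluntary disruptions such as node drains. It does not protect against node crashes, and a minAvailable equal to the replica count blocks drains entirely.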
Key Takeaways
- Namespaces provide isolation, access control, and resource quota boundaries
- ConfigMaps and Secrets separate configuration from code; use External Secrets Operator for real secret management
- Ingress routes HTTP traffic to multiple services through a single load balancer
- Always set resource requests and limits; requests determine scheduling, limits prevent resource starvation
- Readiness probes control traffic routing; liveness probes trigger restarts; both are essential for zero-downtime deployments
- The HPA scales replicas based on CPU, memory, or custom metrics
- Rolling updates with maxUnavailable: 0 ensure zero-downtime deployments
- Use StatefulSets and PersistentVolumeClaims for stateful workloads