Kubernetes Basics

Kubernetes manages containers at scale. You tell it what you want -- three replicas of this container, a load balancer in front of them, restart if they crash -- and Kubernetes makes it happen. The mental model is declarative: you describe the desired state in a manifest file, submit it to the cluster, and the control plane continuously works to reconcile reality with your declaration.

Why Kubernetes Exists

Running one container on one server is simple. Running hundreds of containers across dozens of servers is not. You need to answer questions that Docker alone cannot:

Which server has enough CPU and memory to run this container?
What happens when a server dies? Who restarts the containers?
How do containers find each other across servers?
How do you update a service without downtime?
How do you scale up when traffic spikes and scale down when it drops?

Kubernetes answers all of these. It was originally developed at Google, drawing on lessons from its internal cluster manager Borg, and open-sourced in 2014. Google donated it to the Cloud Native Computing Foundation (CNCF) in 2015, and it is now the de facto standard for container orchestration.

The Control Plane

A Kubernetes cluster has two parts: the control plane and the worker nodes.

Control Plane:
  +-------------------+
  | API Server        |  <-- All communication goes through here
  | etcd              |  <-- Distributed key-value store (cluster state)
  | Scheduler         |  <-- Assigns pods to nodes
  | Controller Manager|  <-- Runs control loops (reconciliation)
  +-------------------+

Worker Nodes:
  +-------------------+  +-------------------+
  | kubelet           |  | kubelet           |
  | Container Runtime |  | Container Runtime |
  | kube-proxy        |  | kube-proxy        |
  | [Pod] [Pod] [Pod] |  | [Pod] [Pod]       |
  +-------------------+  +-------------------+

The API Server is the single entry point. Every kubectl command, every internal component, and every controller talks to the API server. It authenticates and validates requests, and it is the only component that reads from or writes to etcd directly.

etcd is the source of truth. It stores the entire cluster state: what pods should be running, their current status, configuration data, and secrets.

The Scheduler watches for newly created pods with no assigned node, filters out nodes that cannot satisfy the pod's resource requests and placement constraints, ranks the remaining nodes, and binds the pod to the best candidate.

The Controller Manager runs control loops. Each controller watches a specific resource type and takes action to move the current state toward the desired state. The Deployment controller, for example, ensures the right number of pod replicas are running.

Core Resources

Pods

A pod is the smallest deployable unit in Kubernetes. It is one or more containers that share a network namespace (the same IP address and port space) and can mount the same storage volumes. In practice, most pods contain a single container.

apiVersion: v1
kind: Pod
metadata:
  name: my-app
  labels:
    app: my-app
spec:
  containers:
    - name: my-app
      image: ghcr.io/org/my-app:v1.0.0
      ports:
        - containerPort: 8080

You almost never create pods directly, because a bare pod is not rescheduled if its node fails. Instead, you create higher-level resources such as Deployments that manage pods for you.
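
The shared network namespace and shared volumes are easiest to see in a pod with two containers. The following is a minimal sketch rather than part of the running example: the pod name, the container names, and the nginx and busybox image tags are assumptions chosen for illustration.

```yaml
# Hypothetical two-container pod: both containers share one IP address
# and mount the same emptyDir volume.
apiVersion: v1
kind: Pod
metadata:
  name: web-with-log-tailer
spec:
  volumes:
    - name: logs
      emptyDir: {}                    # scratch volume that lives as long as the pod
  containers:
    - name: web
      image: nginx:1.27
      ports:
        - containerPort: 80
      volumeMounts:
        - name: logs
          mountPath: /var/log/nginx   # nginx writes access.log here
    - name: log-tailer
      image: busybox:1.36
      # Reads the file the web container writes, via the shared volume.
      command: ["sh", "-c", "tail -F /var/log/nginx/access.log"]
      volumeMounts:
        - name: logs
          mountPath: /var/log/nginx
```

Because both containers share the pod's network namespace, the log-tailer container could also reach nginx at localhost:80.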

Deployments

A Deployment declares the desired state for a set of pods: which image to run, how many replicas, and how to update them. The Deployment controller creates a ReplicaSet, which in turn creates the pods.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
  labels:
    app: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api-server
          image: ghcr.io/org/api-server:v1.2.3
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"

Deployment
  └── ReplicaSet
        ├── Pod (api-server-abc12)
        ├── Pod (api-server-def34)
        └── Pod (api-server-ghi56)

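When you change the Deployment's pod template (a new image tag, for example), Kubernetes creates a new ReplicaSet, gradually scales it up, and scales the old one down. This is a rolling update. Changing only the replica count scales the existing ReplicaSet instead of creating a new one.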
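
How aggressively a rollout proceeds can be tuned with the Deployment's strategy field. The sketch below assumes the api-server Deployment from above; the maxSurge and maxUnavailable values and the v1.2.4 tag are illustrative assumptions, not recommendations.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  strategy:
    type: RollingUpdate       # the default strategy for Deployments
    rollingUpdate:
      maxSurge: 1             # at most 1 pod above the replica count during a rollout
      maxUnavailable: 0       # never drop below 3 available pods
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api-server
          image: ghcr.io/org/api-server:v1.2.4   # hypothetical new tag
          ports:
            - containerPort: 8080
```

With these values, Kubernetes starts one new pod, waits for it to become ready, removes one old pod, and repeats until all three pods run the new image.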

ReplicaSets

A ReplicaSet ensures a specified number of pod replicas are running at any time. If a pod crashes, the ReplicaSet creates a new one. You rarely interact with ReplicaSets directly -- Deployments manage them.

Services

A Service provides a stable network endpoint for a set of pods. Pods are ephemeral: when a pod is replaced, its replacement gets a new IP address. A Service gives the set a stable virtual IP and DNS name, and load-balances traffic across the pods that are ready.

apiVersion: v1
kind: Service
metadata:
  name: api-server
spec:
  selector:
    app: api-server
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP

Service types:

ClusterIP (default):  Internal only. Other pods reach it via api-server.default.svc.cluster.local
NodePort:             Exposes the service on each node's IP at a static port (30000-32767 by default)
LoadBalancer:         Provisions a cloud load balancer (AWS ELB/NLB, GCP LB, etc.)

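Inside the cluster, any pod in the same namespace can reach the api-server Service at http://api-server:80. Pods in other namespaces use the namespace-qualified name, for example http://api-server.default. The Service routes each connection to one of the three pods in the Deployment.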
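
For comparison with the ClusterIP manifest above, a NodePort variant might look like the following sketch. The Service name and the nodePort value 30080 are assumptions; if nodePort is omitted, Kubernetes picks a free port from the range itself.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: api-server-nodeport   # hypothetical name
spec:
  type: NodePort
  selector:
    app: api-server           # same pods as the ClusterIP Service
  ports:
    - port: 80                # port on the Service's cluster IP
      targetPort: 8080        # port the container listens on
      nodePort: 30080         # port opened on every node (default range 30000-32767)
```

Traffic sent to port 30080 on any node's IP is forwarded to one of the api-server pods on port 8080.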

Declarative vs Imperative

Kubernetes supports both styles, but declarative is the standard for production.

# Imperative: tell Kubernetes what to do step by step
kubectl create deployment api-server --image=ghcr.io/org/api-server:v1.2.3
kubectl scale deployment api-server --replicas=3
kubectl expose deployment api-server --port=80 --target-port=8080

# Declarative: describe the desired state, apply it
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml

The declarative approach is better because:

Manifest files are version-controlled (Git)
Changes are reviewable in pull requests
The cluster state is reproducible -- apply the same manifests to a new cluster and get the same result
Drift detection and correction: kubectl diff -f shows where the live cluster differs from a manifest, and re-applying the manifest restores any field it sets that someone changed manually

The Manifest File

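A Kubernetes manifest is a YAML (or JSON) file. Most resources have four top-level fields: apiVersion, kind, metadata, and spec. A few, such as ConfigMap and Secret, carry data instead of spec.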

apiVersion: apps/v1        # API group and version
kind: Deployment            # Resource type
metadata:                   # Name, labels, annotations
  name: api-server
  namespace: production
  labels:
    app: api-server
    team: backend
spec:                       # Desired state (varies by resource type)
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api-server
          image: ghcr.io/org/api-server:v1.2.3

Multiple resources can be defined in a single file, separated by ---:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  # ... deployment spec ...
---
apiVersion: v1
kind: Service
metadata:
  name: api-server
spec:
  # ... service spec ...

kubectl Basics

# Cluster info
kubectl cluster-info
kubectl get nodes

# Working with resources
kubectl get pods                          # List pods in default namespace
kubectl get pods -n production            # List pods in a specific namespace
kubectl get pods -o wide                  # More detail (node, IP)
kubectl get all                           # Pods, services, deployments, replicasets (not ConfigMaps, Secrets, etc.)

# Apply and delete
kubectl apply -f deployment.yaml          # Create or update resources
kubectl delete -f deployment.yaml         # Delete resources defined in file

# Debugging
kubectl describe pod api-server-abc12     # Detailed info, events, conditions
kubectl logs api-server-abc12             # Container logs
kubectl logs -f api-server-abc12          # Stream logs
kubectl exec -it api-server-abc12 -- /bin/sh  # Shell into a container

# Quick checks
kubectl top pods                          # CPU and memory usage (requires metrics-server)
kubectl get events --sort-by=.lastTimestamp

A Minimal Working Example

Deploy a simple web application with three replicas and a service:

# app.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello-app
  template:
    metadata:
      labels:
        app: hello-app
    spec:
      containers:
        - name: hello-app
          image: gcr.io/google-samples/hello-app:2.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "50m"
              memory: "64Mi"
            limits:
              cpu: "100m"
              memory: "128Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: hello-app
spec:
  selector:
    app: hello-app
  ports:
    - port: 80
      targetPort: 8080
  type: LoadBalancer

kubectl apply -f app.yaml
kubectl get pods -w              # Watch pods come up
kubectl get service hello-app    # Get the external IP

NAME        TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)
hello-app   LoadBalancer   10.96.45.123   34.120.67.89    80:31234/TCP

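The application is now running on three pods behind a load balancer and is reachable at the external IP. On clusters without a load balancer integration, which includes many local setups, EXTERNAL-IP stays at <pending>.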

Common Pitfalls

Not setting resource requests and limits -- Without them, a single pod can consume all resources on a node, starving other pods. Always set both requests (what the scheduler reserves for the pod) and limits (the ceiling enforced at runtime).
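Using latest image tags -- A mutable tag such as latest hides which build a pod is actually running. Pushing a new image under the same tag does not trigger a rollout, because the manifest has not changed, so pods started at different times can end up on different builds and a rollback has no earlier version to return to. Use specific version tags.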
Ignoring pod disruption budgets -- During node maintenance, draining a node (kubectl drain) evicts its pods. Without a PodDisruptionBudget, nothing prevents all replicas of a service from being evicted at the same time.
Imperative management in production -- Running kubectl edit or kubectl scale directly means your cluster state diverges from your Git manifests. Always use kubectl apply with version-controlled files.
Skipping namespaces -- Putting everything in the default namespace makes it hard to manage resources, set quotas, or control access as the cluster grows.
Over-engineering for a single service -- If you have one service with predictable load, Kubernetes adds operational complexity without proportional benefit. Consider simpler alternatives first.
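
Two of the pitfalls above have small declarative fixes. The following sketch assumes the hello-app Deployment from the minimal example and a namespace called production; the resource names, the minAvailable value, and the quota numbers are illustrative assumptions.

```yaml
# Keep at least 2 hello-app pods running during voluntary disruptions
# such as a node drain.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: hello-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: hello-app
---
# A dedicated namespace instead of default.
apiVersion: v1
kind: Namespace
metadata:
  name: production
---
# Cap the total resources that pods in the namespace may claim.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    pods: "20"
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
```

A PodDisruptionBudget only covers pods in its own namespace, so it has to be created alongside the pods it protects. Once a compute quota like this exists, pods in that namespace that omit the corresponding requests or limits are rejected unless a LimitRange supplies defaults, which also enforces the first pitfall's advice.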

Key Takeaways

Kubernetes is a declarative system: you describe what you want, and the control plane makes it happen
The core resources are Pods (running containers), Deployments (managing pod replicas), and Services (stable network endpoints)
Deployments manage ReplicaSets, which manage Pods -- you almost never create Pods directly
Manifests are YAML files checked into Git. Use kubectl apply for all changes.
Always set resource requests and limits, use specific image tags, and organize resources into namespaces
The control plane (API server, etcd, scheduler, controller manager) continuously reconciles desired state with actual state