Deployment Strategies

Deploying code is the moment theory meets reality. The strategy you choose determines how much risk you take on, how fast you can roll back, and how many users are affected if something goes wrong. There is no single best strategy -- the right choice depends on your infrastructure, your traffic, and your tolerance for failure.

Blue-Green Deployment

Maintain two identical environments: blue and green. One serves production traffic while the other sits idle. Deploy to the idle environment, verify it works, then switch traffic.

Before deploy:
  Load Balancer -> Blue (v1.0, serving traffic)
                   Green (idle)

After deploy:
  Load Balancer -> Blue (v1.0, idle)
                   Green (v1.1, serving traffic)

Implementation

# Kubernetes blue-green with service selector
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    version: green  # Switch this to blue for rollback
  ports:
    - port: 80
      targetPort: 8080

The switch is a single operation -- update the service selector or change the load balancer target. Rollback is equally simple: point traffic back to the previous environment.
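
The selector switch can be scripted so that cutover and rollback are the same call with a different argument. The sketch below only builds a `kubectl patch` command and prints it; it does not run anything. The helper name `selector_patch_command` is an assumption made for illustration, while the service name and the `version` label come from the manifest above.

```python
import json
import shlex


def selector_patch_command(service: str, color: str) -> str:
    """Build a kubectl command that points the Service at one environment."""
    if color not in ("blue", "green"):
        raise ValueError(f"unknown environment: {color!r}")
    # Only the `version` key of the selector changes; `app: myapp` is kept.
    patch = {"spec": {"selector": {"version": color}}}
    return " ".join([
        "kubectl", "patch", "service", service,
        "-p", shlex.quote(json.dumps(patch)),
    ])


print(selector_patch_command("myapp", "green"))  # cut over to v1.1
print(selector_patch_command("myapp", "blue"))   # roll back to v1.0
```

Keeping both directions in one function means the rollback path is exercised every time you deploy, not only during an incident.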

When to Use Blue-Green

  • You need instant rollback capability
  • Your application cannot handle two versions running simultaneously
  • Database migrations are backward-compatible (both versions must work with the same schema)
  • You can afford to run two full environments

The Catch

You need double the infrastructure, at least during the transition. For large deployments, that cost is real. Also, database migrations complicate things -- if v1.1 changes the schema, rolling back to blue means v1.0 must still work with the modified database.

Canary Deployment

Route a small percentage of traffic to the new version. Monitor it. If metrics look good, increase the percentage. If something breaks, the blast radius is small.

Step 1:  1% traffic -> v1.1    99% traffic -> v1.0
Step 2: 10% traffic -> v1.1    90% traffic -> v1.0
Step 3: 50% traffic -> v1.1    50% traffic -> v1.0
Step 4: 100% traffic -> v1.1

Implementation with Nginx

# Nginx weighted upstream
upstream myapp {
    server 10.0.1.10:8080 weight=99;  # v1.0
    server 10.0.1.20:8080 weight=1;   # v1.1 canary
}
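
To see what a 99-to-1 weighting means for individual requests, here is a minimal sketch of smooth weighted round-robin, a common technique for weighted upstreams. It illustrates the idea and is not Nginx's actual code; the `make_picker` helper and the backend labels are assumptions.

```python
from collections import Counter


def make_picker(weights: dict[str, int]):
    """Return a function that picks backends in proportion to their weights."""
    current = {name: 0 for name in weights}
    total = sum(weights.values())

    def pick() -> str:
        # Every backend earns credit equal to its weight on each request...
        for name, weight in weights.items():
            current[name] += weight
        # ...and the backend with the most credit serves it, then pays `total`.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        return chosen

    return pick


pick = make_picker({"v1.0": 99, "v1.1-canary": 1})
counts = Counter(pick() for _ in range(100))
print(counts)  # 99 requests go to v1.0 and 1 goes to the canary
```

Because the split is per request, not per user, the same user can land on either version from one request to the next. If that matters, you need sticky routing on top of the weights.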

Implementation with Kubernetes

# Using Argo Rollouts for canary (excerpt: selector and pod template omitted)
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp
spec:
  strategy:
    canary:
      steps:
        - setWeight: 1
        - pause: { duration: 5m }
        - setWeight: 10
        - pause: { duration: 10m }
        - setWeight: 50
        - pause: { duration: 15m }
        - setWeight: 100
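      analysis:  # background analysis; a failed run aborts the rollout
        templates:
          - templateName: error-rate
---
# The metric itself lives in a separate AnalysisTemplate
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: error-rate
spec:
  metrics:
    - name: error-rate
      interval: 1m
      successCondition: result[0] < 0.01
      provider:
        prometheus:
          address: http://prometheus:9090
          query: |
            sum(rate(http_errors_total{app="myapp",version="canary"}[5m]))
            /
            sum(rate(http_requests_total{app="myapp",version="canary"}[5m]))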

When to Use Canary

  • You serve high traffic and need to limit blast radius
  • You have good observability (metrics, logging, alerting) to detect issues during rollout
  • Your application can tolerate two versions running simultaneously
  • You want automated rollback based on metrics

The Catch

Canary requires solid monitoring. If you cannot detect that the canary is failing, you will promote a broken version. It also requires traffic splitting infrastructure -- a load balancer or service mesh that supports weighted routing.
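
The promotion logic itself is small; the hard part is feeding it trustworthy numbers. The following sketch walks a list of traffic weights and stops at the first step whose error rate reaches the threshold. The `run_canary` function and the two fake metric sources are assumptions for illustration; in a real pipeline `error_rate_at` would query your metrics store.

```python
from typing import Callable


def run_canary(
    weights: list[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float = 0.01,
) -> tuple[str, int]:
    """Walk the weight steps and stop at the first unhealthy one.

    Returns ("promoted", final_weight) or ("aborted", weight_at_failure).
    """
    for weight in weights:
        observed = error_rate_at(weight)  # stand-in for a metrics query
        if observed >= max_error_rate:
            return ("aborted", weight)
    return ("promoted", weights[-1])


steps = [1, 10, 50, 100]
healthy = lambda weight: 0.002
breaks_under_load = lambda weight: 0.002 if weight < 50 else 0.04

print(run_canary(steps, healthy))            # ('promoted', 100)
print(run_canary(steps, breaks_under_load))  # ('aborted', 50)
```

Note that if `error_rate_at` silently returns zero because no metrics are being collected, this gate promotes every release, which is exactly the failure mode described above.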

Rolling Deployment

Replace instances one at a time (or in small batches). At any point during the deploy, some instances run the old version and some run the new version.

4 instances, rolling update:
  Step 1: [v1.1] [v1.0] [v1.0] [v1.0]
  Step 2: [v1.1] [v1.1] [v1.0] [v1.0]
  Step 3: [v1.1] [v1.1] [v1.1] [v1.0]
  Step 4: [v1.1] [v1.1] [v1.1] [v1.1]

Implementation with Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 4
  selector:
    matchLabels:
      app: myapp
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: myapp  # must match spec.selector
    spec:
      containers:
        - name: myapp
          image: myapp:1.1
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5

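maxUnavailable: 1 means at most one of the four desired pods may be unavailable at any point during the update. maxSurge: 1 means at most one pod above the desired count may exist at a time. The readiness probe keeps traffic away from a new pod until it responds successfully on /health.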
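
Both settings also accept percentages. As a quick way to reason about capacity during a rollout, this sketch computes the resulting bounds, assuming the Kubernetes rounding rules (maxUnavailable rounds down, maxSurge rounds up). The helper names `resolve` and `rolling_update_bounds` are hypothetical.

```python
import math


def resolve(value, replicas: int, round_up: bool) -> int:
    """Turn an absolute count or a percentage string into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        exact = replicas * int(value[:-1]) / 100
        return math.ceil(exact) if round_up else math.floor(exact)
    return int(value)


def rolling_update_bounds(replicas: int, max_unavailable=1, max_surge=1) -> dict:
    """Return the fewest available pods and the most total pods during a rollout."""
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    surge = resolve(max_surge, replicas, round_up=True)
    return {
        "min_available": replicas - unavailable,
        "max_total": replicas + surge,
    }


print(rolling_update_bounds(4, 1, 1))
# {'min_available': 3, 'max_total': 5}
print(rolling_update_bounds(10, "25%", "25%"))
# {'min_available': 8, 'max_total': 13}
```

If `min_available` pods cannot carry peak traffic, lower maxUnavailable; if the cluster has no room for `max_total` pods, lower maxSurge.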

When to Use Rolling

  • You run multiple replicas of a stateless service
  • Your application handles mixed-version traffic gracefully
  • You want a simple, well-supported deployment model (this is the Kubernetes default)
  • You do not need instant rollback

The Catch

During the rollout, users hit different versions. If the API changes between versions, some requests may fail. Rollback means doing another rolling update back to the old version, which is not instant.

Feature Flags

Separate deployment from release. Deploy the code to production with the feature hidden behind a flag. Enable the flag for specific users, percentages, or segments. This is not a deployment strategy per se -- it is a release strategy that works alongside any deployment method.

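# Using a feature flag library. `featureflags` is a placeholder for your
# flag SDK; new_search and legacy_search are defined elsewhere in the app.
from featureflags import client

def get_search_results(query, user):
    """Serve the new algorithm only to users the flag is enabled for."""
    if client.is_enabled("new-search-algorithm", user=user):
        return new_search(query)
    return legacy_search(query)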

Flag Configuration

# Feature flag configuration
flags:
  new-search-algorithm:
    enabled: true
    rules:
      - segments: [internal-users]
        percentage: 100
      - segments: [beta-users]
        percentage: 50
      - segments: [all-users]
        percentage: 0
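
How a flag service turns this configuration into a yes-or-no answer varies by vendor. One plausible evaluation is sketched below, under two assumptions that are not stated in the configuration itself: the first rule whose segment matches the user wins, and each user is hashed into a stable bucket so their answer does not flip between requests. The function names are hypothetical.

```python
import hashlib

FLAGS = {
    "new-search-algorithm": {
        "enabled": True,
        "rules": [
            {"segments": ["internal-users"], "percentage": 100},
            {"segments": ["beta-users"], "percentage": 50},
            {"segments": ["all-users"], "percentage": 0},
        ],
    }
}


def bucket(flag: str, user_id: str) -> int:
    """Map a user to a stable bucket in [0, 100) for this flag."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_id: str, user_segments: set[str]) -> bool:
    """Evaluate the flag for one user; the first matching rule decides."""
    config = FLAGS.get(flag)
    if config is None or not config["enabled"]:
        return False  # the top-level switch acts as the kill switch
    for rule in config["rules"]:
        if user_segments & set(rule["segments"]):
            return bucket(flag, user_id) < rule["percentage"]
    return False


print(is_enabled("new-search-algorithm", "u-1", {"internal-users", "all-users"}))
print(is_enabled("new-search-algorithm", "u-2", {"all-users"}))
```

Hashing the flag name together with the user ID keeps the same users from always being the first to receive every new feature.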

When to Use Feature Flags

  • You want to decouple deploy from release
  • You need to test features with specific user segments
  • You want an instant kill switch that does not require a deployment
  • You are running A/B tests or gradual rollouts of product features

The Catch

Feature flags accumulate. Every flag is a branch in your code. If you do not clean up old flags, you end up with a codebase full of dead paths. Establish a process: when a flag reaches 100%, remove it within two weeks.
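
A cleanup rule like this is easier to keep when a script enforces it. As a minimal sketch, assuming you can export the date each flag reached 100%, the function below lists flags that are past the two-week window. The flag names other than `new-search-algorithm` and all dates are made-up sample data.

```python
from datetime import date, timedelta

CLEANUP_WINDOW = timedelta(days=14)


def stale_flags(fully_rolled_out: dict[str, date], today: date) -> list[str]:
    """Return flags that reached 100% more than two weeks ago, oldest first."""
    overdue = [
        (since, name)
        for name, since in fully_rolled_out.items()
        if today - since > CLEANUP_WINDOW
    ]
    return [name for since, name in sorted(overdue)]


flags = {
    "new-search-algorithm": date(2024, 3, 1),
    "dark-mode": date(2024, 3, 20),
    "checkout-v2": date(2024, 1, 15),
}
print(stale_flags(flags, today=date(2024, 3, 25)))
# ['checkout-v2', 'new-search-algorithm']
```

Running a check like this in CI, and failing the build when the list is non-empty, turns the two-week rule from a convention into a constraint.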

Rollback

Regardless of your deployment strategy, you need a fast path back to the last known good state. Rollback should be one command, not a 15-step procedure.

# Kubernetes rollback
kubectl rollout undo deployment/myapp

# Image tag-based rollback (the cluster pulls the image itself; no local docker pull is needed)
kubectl set image deployment/myapp myapp=registry.example.com/myapp:v1.0

# Argo Rollouts abort (automatic rollback during canary)
kubectl argo rollouts abort myapp

Rollback Requirements

  • Immutable artifacts. Every version is a tagged image or binary. You never rebuild for rollback.
  • Backward-compatible migrations. The database must work with both the current and previous version.
  • Tested rollback path. If you have never rolled back, your rollback process does not work. Test it.
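
A backward-compatible migration usually follows the expand-and-contract pattern: add the new structure first, and remove the old one only in a later release. The sketch below uses an in-memory SQLite database to show the expand step; the `users` table and its columns are hypothetical examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('Ada Lovelace')")

# Expand: add the new column as nullable, then backfill it from the old one.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
conn.execute("UPDATE users SET display_name = name WHERE display_name IS NULL")

# The previous version still reads the old column; the new one reads the new column.
old_version_row = conn.execute("SELECT id, name FROM users").fetchone()
new_version_row = conn.execute("SELECT id, display_name FROM users").fetchone()
print(old_version_row, new_version_row)

# Contract: runs in a later release, once rolling back to the old version
# is no longer on the table.
# conn.execute("ALTER TABLE users DROP COLUMN name")
```

Between the two steps the new version should write to both columns, so that rows it creates remain readable after a rollback.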

Choosing a Strategy

Factor                  Blue-Green      Canary                Rolling         Feature Flags
Rollback speed          Instant         Fast (reduce weight)  Slow (re-roll)  Instant (toggle)
Infrastructure cost     2x              1x + small canary     1x + surge      1x
Complexity              Low             Medium                Low             Medium
Observability required  Low             High                  Medium          Medium
Mixed-version traffic   No              Yes                   Yes             Yes (in-code)
Blast radius control    All-or-nothing  Precise               Limited         Precise

For most teams starting out: rolling deployments with feature flags. You get Kubernetes-native simplicity with the ability to control feature exposure independently from deployment.

For high-traffic production systems: canary with automated metric-based promotion. The observability investment pays for itself on the first prevented outage.

Common Pitfalls

  • No rollback plan. If your only rollback is "fix forward," you will have a long outage someday.
  • Database migrations that break rollback. Dropping a column means the old version cannot run. Use expand-and-contract migrations instead.
  • Canary without monitoring. A canary with no one watching it is just a slow rollout.
  • Feature flag sprawl. Hundreds of stale flags make the codebase unpredictable. Set expiration dates.
  • Testing only the happy path. Test what happens when the deploy fails halfway. Test what happens during rollback.
  • Manual deployment steps. If your deployment requires someone to remember to run a script, it will eventually be forgotten. Automate everything.

Key Takeaways

  • Blue-green gives you instant rollback at the cost of double infrastructure.
  • Canary limits blast radius by gradually shifting traffic, but requires strong observability.
  • Rolling deployments are the simplest model for stateless services behind a load balancer.
  • Feature flags separate deployment from release, giving you precise control over who sees what.
  • Rollback must be one command. Test it before you need it.
  • No strategy eliminates risk entirely. The goal is to make failures small, detectable, and reversible.