Cloud-Native Architecture

The Twelve-Factor App

The twelve-factor methodology defines best practices for building cloud-native applications that are portable, scalable, and maintainable.

| Factor | Principle | Cloud Application |
|--------|-----------|-------------------|
| I. Codebase | One codebase, many deploys | Git repo with CI/CD pipelines |
| II. Dependencies | Explicitly declare and isolate | Package managers, container images |
| III. Config | Store config in environment | Env vars, Parameter Store, Secrets Manager |
| IV. Backing Services | Treat as attached resources | RDS, ElastiCache, SQS as swappable URLs |
| V. Build, Release, Run | Strictly separate stages | CI builds image, CD deploys to env |
| VI. Processes | Execute as stateless processes | No sticky sessions, external session store |
| VII. Port Binding | Export services via port binding | Container exposes port, LB routes traffic |
| VIII. Concurrency | Scale out via the process model | Horizontal scaling, not vertical |
| IX. Disposability | Fast startup and graceful shutdown | SIGTERM handling, health checks |
| X. Dev/Prod Parity | Keep environments similar | IaC ensures identical infrastructure |
| XI. Logs | Treat logs as event streams | Write to stdout, aggregate externally |
| XII. Admin Processes | Run admin tasks as one-off processes | Kubernetes Jobs, Lambda invocations |
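
As an illustration of Factor III, the sketch below reads configuration from environment variables instead of hard-coding it. The variable names (`DATABASE_URL`, `PORT`, `DEBUG`) and the `Settings` type are assumptions chosen for the example, not part of the methodology.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    port: int
    debug: bool


def load_settings(env=os.environ) -> Settings:
    """Build settings from environment variables (names are illustrative)."""
    return Settings(
        database_url=env["DATABASE_URL"],                   # required: fail fast if missing
        port=int(env.get("PORT", "8080")),                  # optional, with a default
        debug=env.get("DEBUG", "false").lower() == "true",  # optional flag
    )
```

Passing the environment in as a parameter keeps the function easy to test, and the same image can run in every environment with only its variables changed.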

Microservices in the Cloud

Decomposition Strategies

Monolith                          Microservices
┌─────────────────┐               ┌──────┐ ┌──────┐ ┌──────┐
│  User Module    │               │ User │ │Order │ │ Pay  │
│  Order Module   │    ──────►    │ Svc  │ │ Svc  │ │ Svc  │
│  Payment Module │               └──┬───┘ └──┬───┘ └──┬───┘
│  Inventory Mod  │                  │        │        │
└─────────────────┘               ┌──┴───┐ ┌──┴───┐ ┌──┴───┐
 Single DB                        │ DB   │ │ DB   │ │ DB   │
                                  └──────┘ └──────┘ └──────┘
                                  Database per service

Communication Patterns

Synchronous (request-response):

  • REST over HTTP/HTTPS
  • gRPC for high-performance internal communication
  • GraphQL for flexible client queries

Asynchronous (event-driven):

  • Message queues (SQS, Cloud Tasks) for point-to-point
  • Pub/sub topics (SNS, Pub/Sub) for fan-out
  • Event streaming (Kinesis, Kafka) for ordered, replayable events
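
The difference between point-to-point and fan-out delivery can be sketched with two in-memory classes. They are toy stand-ins written for this example, not the API of SQS, SNS, or any other managed service.

```python
from collections import deque


class Queue:
    """Point-to-point: each message is handed to exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Returns None when the queue is empty.
        return self._messages.popleft() if self._messages else None


class Topic:
    """Fan-out: every subscriber gets its own copy of each message."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def publish(self, message):
        for handler in self._subscribers:
            handler(message)
```

A message taken from the `Queue` is gone for every other consumer, while a message published to the `Topic` reaches all subscribers.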

Service Discovery

| Approach | Implementation | Pros/Cons |
|----------|---------------|-----------|
| DNS-based | Route 53, Cloud DNS | Simple; TTL caching delays |
| Service registry | Consul, Eureka | Rich metadata; added complexity |
| Platform-native | K8s Services, Cloud Map | Integrated; platform-specific |
| Service mesh | Envoy sidecar | Transparent; resource overhead |

Service Mesh

A service mesh is a dedicated infrastructure layer that manages service-to-service communication, usually through a proxy deployed next to each service.

Architecture

┌─────────────────────────────────────────────┐
│                 Control Plane                │
│  (Istio/Linkerd: config, certs, policies)   │
└──────┬──────────────┬──────────────┬────────┘
       │              │              │
┌──────▼──────┐ ┌─────▼──────┐ ┌────▼───────┐
│ ┌─────────┐ │ │ ┌────────┐ │ │ ┌────────┐ │
│ │ Service │ │ │ │Service │ │ │ │Service │ │
│ │    A    │ │ │ │   B    │ │ │ │   C    │ │
│ └────┬────┘ │ │ └───┬────┘ │ │ └───┬────┘ │
│ ┌────▼────┐ │ │ ┌───▼────┐ │ │ ┌───▼────┐ │
│ │ Envoy   │◄├─┤►│ Envoy  │◄├─┤►│ Envoy  │ │
│ │ Sidecar │ │ │ │Sidecar │ │ │ │Sidecar │ │
│ └─────────┘ │ │ └────────┘ │ │ └────────┘ │
└─────────────┘ └────────────┘ └────────────┘
      Data Plane (proxies handle all traffic)

Service Mesh Capabilities

  • mTLS: Automatic mutual TLS between all services
  • Traffic management: Canary deployments, traffic splitting, retries
  • Observability: Distributed tracing, metrics, access logs without code changes
  • Resilience: Circuit breaking, rate limiting, timeouts, fault injection
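
Traffic splitting for a canary release comes down to weighted selection among backends. The function below is a minimal sketch of that idea, assuming hypothetical backend names `v1` and `v2`; in a real mesh the proxy applies the weights from control-plane configuration rather than application code.

```python
import random


def pick_backend(weights, rng=random):
    """Choose a backend name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
```

Calling `pick_backend({"v1": 90, "v2": 10})` sends roughly one request in ten to the canary. Shifting the weights step by step moves traffic without redeploying either version.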

Mesh Options

| Mesh | Proxy | Complexity | Cloud Integration |
|------|-------|-----------|-------------------|
| Istio | Envoy | High | Managed option on GKE; runs on any Kubernetes |
| Linkerd | linkerd2-proxy | Medium | Lightweight, Rust-based |
| Consul Connect | Envoy | Medium | HashiCorp ecosystem |
| AWS App Mesh | Envoy | Low | Deep AWS service integration |

Event-Driven Architecture

Amazon EventBridge

Event Sources              EventBridge              Targets
┌─────────┐               ┌──────────┐            ┌─────────┐
│ AWS Svc │──────────────►│          │───────────►│ Lambda  │
│ (S3,EC2)│               │  Event   │            └─────────┘
└─────────┘               │   Bus    │            ┌─────────┐
┌─────────┐               │          │───────────►│  SQS    │
│ SaaS    │──────────────►│  Rules   │            └─────────┘
│(Stripe) │               │  match   │            ┌─────────┐
└─────────┘               │  and     │───────────►│Step Fn  │
┌─────────┐               │  route   │            └─────────┘
│ Custom  │──────────────►│          │            ┌─────────┐
│  App    │               │          │───────────►│  API    │
└─────────┘               └──────────┘            └─────────┘

  • Event buses: Default, custom, and SaaS partner buses
  • Rules: Match events by pattern (source, detail-type, fields)
  • Schema registry: Discover and validate event schemas
  • Archive and replay: Store events and replay for debugging
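
Rule matching can be pictured as checking each field of a pattern against the event. The matcher below is a simplified sketch, not EventBridge's implementation: it supports only exact-value lists and nested objects, and the source name `app.orders` is hypothetical.

```python
def matches(pattern, event):
    """Return True if every field in the pattern matches the event.

    Simplified: leaf values are lists of allowed exact values, and nested
    dicts are matched recursively. Operators such as prefix or numeric
    matching are omitted here.
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True
```

Fields that the pattern does not mention are ignored, so producers can add new fields to an event without breaking existing rules.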

Google Cloud Pub/Sub

  • Fully managed messaging with at-least-once delivery
  • Push and pull subscription modes
  • Dead-letter topics for failed message handling
  • Ordering keys for in-order delivery of messages that share the same key
  • BigQuery subscriptions for direct analytics ingestion

Event Patterns

| Pattern | Description | Example |
|---------|-------------|---------|
| Event notification | Inform consumers of state change | Order placed, user signed up |
| Event-carried state | Include full state in event | Avoid callback to source |
| Event sourcing | Store state as sequence of events | Audit trail, temporal queries |
| CQRS | Separate read and write models | Scale reads independently |
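
Event sourcing can be sketched as a fold over an append-only log. The account example, the event kinds, and the `upto` parameter below are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrawn" (hypothetical event types)
    amount: int


def apply(balance, event):
    """Return the new balance after applying one event."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events, upto=None):
    """Rebuild state from the log; `upto` limits replay to the first N events."""
    balance = 0
    for event in events[:upto]:
        balance = apply(balance, event)
    return balance
```

Because the log is never overwritten, replaying a prefix of it answers "what was the state after the first N events", which is the basis for audit trails and temporal queries.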

Stateless Design

Principles

All application instances must be interchangeable. State belongs in external stores.

Stateful (avoid)                  Stateless (prefer)
┌─────────┐                       ┌─────────┐
│ Server  │                       │ Server  │──► Redis (sessions)
│ sessions│                       │ (no     │──► S3 (uploads)
│ uploads │                       │  local  │──► RDS (data)
│ cache   │                       │  state) │──► ElastiCache
└─────────┘                       └─────────┘
  ✗ Can't scale horizontally       ✓ Any instance handles any request
  ✗ Sticky sessions required       ✓ Load balancer distributes freely
  ✗ Instance failure = data loss   ✓ Instance failure = no data loss

Externalized State Stores

| State Type | Service | Access Pattern |
|-----------|---------|----------------|
| Session data | Redis, Memcached | Low-latency key-value |
| File uploads | S3, GCS | Presigned URLs |
| Configuration | Parameter Store, Consul | Key-value with versioning |
| Feature flags | LaunchDarkly, ConfigCat | Real-time evaluation |
| Distributed locks | Redis (Redlock), DynamoDB | Conditional writes |
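
The sketch below shows why externalized session state makes instances interchangeable. `SessionStore` is an in-memory stand-in for an external store such as Redis, and both class names are hypothetical.

```python
class SessionStore:
    """In-memory stand-in for an external store such as Redis (assumption)."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id, {})

    def put(self, session_id, session):
        self._data[session_id] = dict(session)


class AppInstance:
    """Holds no per-user state; every read and write goes through the store."""

    def __init__(self, name, store):
        self.name = name
        self._store = store

    def add_to_cart(self, session_id, item):
        session = self._store.get(session_id)
        cart = list(session.get("cart", []))  # copy so stored data is not mutated
        cart.append(item)
        self._store.put(session_id, {**session, "cart": cart})
        return cart
```

Two instances that share one store see the same cart, so a load balancer can send each request to whichever instance is free.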

Health Checks and Readiness

Health Check Types

# Kubernetes health probes
livenessProbe:          # Is the process alive?
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 10
  failureThreshold: 3   # Restart after 3 consecutive failures

readinessProbe:         # Can it serve traffic?
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5

startupProbe:           # Has it finished starting?
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30
  periodSeconds: 10     # Up to 300s to start
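
On the application side, the probes above need endpoints to call. The following is a minimal sketch using Python's standard library; the `state` flag and the `serve` helper are assumptions for the example, and a real service would flip readiness once its dependencies are reachable.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical readiness flag, set to True once startup work is done.
state = {"ready": False}


def probe_response(path, ready):
    """Map a probe path to an HTTP status code."""
    if path == "/healthz":
        return 200                    # process is up and can answer
    if path == "/ready":
        return 200 if ready else 503  # a failing status keeps traffic away
    return 404


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(probe_response(self.path, state["ready"]))
        self.end_headers()


def serve(port=8080):
    """Serve probe endpoints on the port the probes are configured for."""
    HTTPServer(("", port), ProbeHandler).serve_forever()
```

Keeping liveness separate from readiness matters: a pod that is alive but not yet ready should be left out of rotation, not restarted.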

Load Balancer Health Checks

  • ALB checks target health before routing traffic
  • Unhealthy targets are removed from rotation
  • Grace period allows time for initialization before checking
  • Deregistration delay drains connections before removal

Cloud-Native Patterns

Sidecar Pattern

A sidecar is a helper container deployed alongside the main application container in the same pod.

Pod
┌──────────────────────────────┐
│  ┌──────────┐  ┌──────────┐ │
│  │   App    │  │ Sidecar  │ │
│  │Container │  │(log agent│ │
│  │          │  │ proxy,   │ │
│  │          │  │ auth)    │ │
│  └──────────┘  └──────────┘ │
│     shared volumes/network   │
└──────────────────────────────┘

Use cases: Log collection (Fluentd), service mesh proxy (Envoy), secrets injection (Vault Agent).

Ambassador Pattern

An ambassador is a proxy that handles outbound connections on the application's behalf.

  • Handles connection pooling, retries, and circuit breaking
  • Centralizes client-side logic outside the application
  • Implemented as a sidecar proxy (e.g., Envoy with custom config)
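
The retry behavior an ambassador takes over from the application can be sketched as a wrapper with exponential backoff. The function name and its default values are assumptions for illustration; a real proxy would normally add jitter and retry only on errors known to be transient.

```python
import time


def call_with_retries(func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call func, retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise                         # out of attempts: surface the error
            sleep(base_delay * 2 ** attempt)  # wait 0.1s, 0.2s, 0.4s, ...
```

Moving this logic into a proxy means every service gets the same retry policy without linking a client library into each codebase.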

Additional Cloud-Native Patterns

| Pattern | Purpose | Implementation |
|---------|---------|----------------|
| Strangler Fig | Incremental migration from monolith | API Gateway routes to old/new |
| Bulkhead | Isolate failures between components | Separate thread pools, services |
| Circuit Breaker | Prevent cascading failures | Envoy, Hystrix, resilience4j |
| Saga | Distributed transactions | Step Functions, Temporal |
| Outbox | Reliable event publishing | DB + CDC (Debezium) |
| Backend for Frontend | Tailored APIs per client type | Separate BFF services |
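
To make the circuit breaker row concrete, here is a minimal sketch of the state machine. It is not how Envoy or any listed library implements it; the class name, thresholds, and injectable `clock` are assumptions for the example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; call rejected")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        self._failures = 0       # success closes the circuit
        self._opened_at = None
        return result
```

While the circuit is open, callers fail immediately instead of waiting on a dependency that is already struggling, which is how the pattern limits cascading failures.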

Key Takeaways

  • The twelve-factor methodology provides a blueprint for cloud-native application design
  • Microservices enable independent deployment and scaling but add distributed systems complexity
  • Service meshes handle cross-cutting concerns (mTLS, retries, observability) transparently
  • Event-driven architecture decouples producers and consumers for better scalability
  • Stateless design is foundational; all persistent state must live in external stores
  • Cloud-native patterns (sidecar, circuit breaker, saga) solve recurring distributed problems