Cloud Fundamentals

Cloud computing is renting someone else's computers instead of buying your own. That is the one-sentence version. The longer version involves understanding the different service models, the infrastructure primitives, and the shared responsibility model that determines what you manage versus what the cloud provider manages.

Service Models

Cloud services are organized into three layers, each abstracting more infrastructure from you.

IaaS (Infrastructure as a Service)

IaaS gives you virtual machines, storage, and networking. You manage everything from the operating system up. The cloud provider manages the physical hardware, power, cooling, and network connectivity.

IaaS examples:
  Compute:    AWS EC2, Google Compute Engine, Azure VMs
  Storage:    AWS EBS, Google Persistent Disk, Azure Managed Disks
  Networking: AWS VPC, Google VPC, Azure VNet

You manage: OS, runtime, application, data, middleware, scaling
Provider manages: Hardware, power, cooling, physical network, hypervisor

IaaS is the most flexible and the most work. You get full control over the operating system, networking, and security configuration. You also get all the operational burden.

PaaS (Platform as a Service)

PaaS provides a managed platform for running applications. You deploy your code and the platform handles the runtime, scaling, patching, and infrastructure.

PaaS examples:
  AWS Elastic Beanstalk, Google App Engine, Azure App Service
  Heroku, Railway, Render, Fly.io
  AWS RDS (managed databases), Google Cloud SQL

You manage: Application code, data
Provider manages: Runtime, OS, scaling, patching, networking, hardware

PaaS is less flexible but dramatically less work. You cannot customize the operating system or install arbitrary software, but you do not have to patch servers or configure load balancers.

SaaS (Software as a Service)

SaaS is a complete application delivered over the internet. You do not manage anything except your data and configuration.

SaaS examples:
  GitHub, Slack, Datadog, PagerDuty, Stripe
  Google Workspace, Microsoft 365

You manage: Configuration, data, user accounts
Provider manages: Everything else
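
The three "You manage / Provider manages" splits above can be captured in a small lookup table. The sketch below is only an illustration: the layer names and the `who_manages` helper are assumptions made for this example, and real services blur these boundaries.

```python
# Layers of a typical stack, ordered from physical hardware up to your data.
# The layer names are illustrative, not an official taxonomy.
LAYERS = ["hardware", "hypervisor", "os", "runtime", "application", "data"]

# For each service model, the lowest layer that YOU are responsible for.
# Everything below that layer is handled by the provider.
FIRST_CUSTOMER_LAYER = {
    "iaas": "os",           # you manage the OS and everything above it
    "paas": "application",  # you manage your code and data
    "saas": "data",         # you manage only your data and configuration
}


def who_manages(model: str, layer: str) -> str:
    """Return 'you' or 'provider' for a given service model and layer."""
    boundary = LAYERS.index(FIRST_CUSTOMER_LAYER[model.lower()])
    return "you" if LAYERS.index(layer) >= boundary else "provider"


if __name__ == "__main__":
    for model in FIRST_CUSTOMER_LAYER:
        mine = [layer for layer in LAYERS if who_manages(model, layer) == "you"]
        print(f"{model.upper()}: you manage {', '.join(mine)}")
```

Reading the table from IaaS to SaaS, the boundary moves up the stack: each step hands one more group of layers to the provider.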

Core Infrastructure Primitives

Regardless of which cloud provider you use, the building blocks are the same.

Compute

Compute is processing power. The options range from full virtual machines to serverless functions.

Virtual Machines (IaaS):
  AWS EC2, Google Compute Engine, Azure VMs
  Full OS control. You choose CPU, memory, disk.
  Best for: legacy apps, custom runtimes, full control

Containers (managed):
  AWS ECS/EKS, Google Cloud Run/GKE, Azure AKS
  Run containers with little or no VM management.
  Best for: microservices, portable workloads

Serverless Functions:
  AWS Lambda, Google Cloud Functions, Azure Functions
  Run code in response to events. No servers to manage.
  Pay per invocation and execution time.
  Best for: event-driven workloads, APIs with variable traffic

Comparison:
  VM:         Most control, most work, always running
  Container:  Good balance, portable, some platforms scale to zero
  Serverless: Least control, least work, pay only for use
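
The "always running" versus "pay only for use" trade-off in the comparison above comes down to arithmetic. The sketch below uses made-up prices (every constant is an assumption, not a real price list) to show how the cheaper option can flip as traffic grows.

```python
def vm_monthly_cost(hourly_rate: float, hours: float = 730.0) -> float:
    """An always-on VM is billed for every hour, busy or idle."""
    return hourly_rate * hours


def serverless_monthly_cost(requests: int, price_per_million: float,
                            seconds_per_request: float,
                            price_per_second: float) -> float:
    """Serverless is billed per request plus per second of execution time."""
    request_cost = requests / 1_000_000 * price_per_million
    compute_cost = requests * seconds_per_request * price_per_second
    return request_cost + compute_cost


# Hypothetical prices, chosen only to make the arithmetic easy to follow.
VM_HOURLY = 0.04             # dollars per hour
PER_MILLION_REQUESTS = 0.20  # dollars per million invocations
PER_SECOND = 0.00002         # dollars per second of execution
SECONDS_PER_REQUEST = 0.1    # average run time of one invocation

if __name__ == "__main__":
    vm = vm_monthly_cost(VM_HOURLY)
    for monthly_requests in (100_000, 1_000_000, 10_000_000, 100_000_000):
        fn = serverless_monthly_cost(monthly_requests, PER_MILLION_REQUESTS,
                                     SECONDS_PER_REQUEST, PER_SECOND)
        cheaper = "serverless" if fn < vm else "VM"
        print(f"{monthly_requests:>11,} req/month: VM ${vm:.2f}, "
              f"serverless ${fn:.2f} -> {cheaper} is cheaper")
```

With these assumed numbers, serverless is cheaper at low and moderate traffic and the VM is cheaper at sustained high traffic. Real break-even points depend on the actual price list and workload.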

Storage

Object Storage:
  AWS S3, Google Cloud Storage (GCS), Azure Blob Storage
  Store files (images, videos, backups, logs).
  Effectively unlimited capacity. Pay per GB stored + requests.
  Not a filesystem. Keys are flat; "folders" are just key prefixes.
  
Block Storage:
  AWS EBS, Google Persistent Disk, Azure Managed Disks
  Virtual hard drives attached to VMs.
  Low latency. Provisioned capacity (you choose the size up front).
  
File Storage:
  AWS EFS, Google Filestore, Azure Files
  Shared filesystem accessible by multiple VMs.
  NFS or SMB protocols. Higher latency than block storage.
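
Object storage has no real folder hierarchy, and a short sketch makes that concrete. The code below models a bucket as a flat mapping from key to bytes and emulates a folder listing with a prefix and a delimiter, in the spirit of the prefix/delimiter listing that object stores such as S3 offer. The `list_prefix` helper and the sample keys are hypothetical.

```python
def list_prefix(keys, prefix="", delimiter="/"):
    """Emulate a folder listing over a flat set of object keys.

    Returns (objects, common_prefixes): keys directly under `prefix`,
    and the "sub-folders" inferred from keys that go deeper.
    """
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # The key goes deeper: report only the next "folder" level.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# A hypothetical bucket: a flat mapping of key -> bytes, with no directories.
bucket = {
    "logs/2024/01/app.log": b"...",
    "logs/2024/02/app.log": b"...",
    "logs/readme.txt": b"...",
    "images/cat.png": b"...",
}

if __name__ == "__main__":
    print(list_prefix(bucket))           # top-level view
    print(list_prefix(bucket, "logs/"))  # view "inside" logs/
```

Nothing named `logs/` exists in the bucket. The "folder" appears only because several keys happen to share that prefix.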

Databases

Relational (managed):
  AWS RDS, Google Cloud SQL, Azure SQL Database
  PostgreSQL, MySQL, SQL Server managed by the provider.
  Automated backups, patching, failover.

NoSQL (managed):
  AWS DynamoDB, Google Firestore, Azure Cosmos DB
  Key-value, document, or wide-column stores.
  Scales horizontally. Different consistency models.

In-memory:
  AWS ElastiCache, Google Memorystore
  Redis or Memcached managed by the provider.
  Caching, session storage, real-time data.

Networking

VPC (Virtual Private Cloud):
  Your own isolated network in the cloud.
  You define IP ranges, subnets, and routing.

Subnets:
  Subdivisions of your VPC.
  Public subnets: have a route to the internet (via an internet gateway).
  Private subnets: no direct inbound route from the internet.
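
Defining IP ranges and subnets is mostly CIDR arithmetic, which Python's standard `ipaddress` module can illustrate. The address range, the subnet names, and the `subnet_for` helper below are assumptions made for this sketch, not a recommended layout.

```python
import ipaddress

# Hypothetical VPC address range; any private (RFC 1918) block would do.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Split the VPC into /24 subnets (256 addresses each) and take the first four.
subnets = list(vpc.subnets(new_prefix=24))[:4]

# Illustrative layout: one public and one private subnet in each of two zones.
layout = {
    "public-a": subnets[0],
    "public-b": subnets[1],
    "private-a": subnets[2],
    "private-b": subnets[3],
}


def subnet_for(ip: str):
    """Return the name of the subnet containing `ip`, or None."""
    addr = ipaddress.ip_address(ip)
    for name, net in layout.items():
        if addr in net:
            return name
    return None


if __name__ == "__main__":
    for name, net in layout.items():
        print(f"{name}: {net} ({net.num_addresses} addresses)")
    print(subnet_for("10.0.2.17"))
```

Whether a subnet is public or private is decided by its routing, not by its address range. The names here only label the intent.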

Load Balancers:
  Distribute traffic across multiple instances.
  AWS ALB/NLB, Google Cloud Load Balancing, Azure Load Balancer.
  
  Application LB: HTTP/HTTPS, path-based routing
  Network LB: TCP/UDP, low latency, high throughput

DNS:
  AWS Route 53, Google Cloud DNS, Azure DNS
  Managed DNS with health checks and failover.
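
To make "distribute traffic across multiple instances" concrete, here is a toy round-robin balancer that skips backends marked unhealthy. It is a teaching sketch under simple assumptions (the `RoundRobinBalancer` class and backend names are invented); managed load balancers use more sophisticated algorithms and run their own health checks.

```python
import itertools


class RoundRobinBalancer:
    """Toy load balancer: rotates through backends, skipping unhealthy ones."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._cycle = itertools.cycle(self.backends)

    def mark_down(self, backend):
        """Record a failed health check for `backend`."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        """Record a passing health check for a known backend."""
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        """Return the next healthy backend, or raise if none are left."""
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
    print([lb.next_backend() for _ in range(4)])
    lb.mark_down("app-2")  # simulate a failed health check
    print([lb.next_backend() for _ in range(4)])
```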

Regions & Availability Zones

Cloud providers operate data centers across the globe, organized into regions and availability zones.

Region:
  A geographic area (on AWS: us-east-1, eu-west-1, ap-southeast-1).
  Each region is independent. Data does not automatically
  replicate between regions.

Availability Zone (AZ):
  One or more isolated data centers within a region.
  Most regions have three or more AZs; the count varies by provider.
  AZs have independent power, cooling, and networking.
  Connected by low-latency links.

Example: us-east-1 (N. Virginia) has six AZs
  us-east-1a, us-east-1b, us-east-1c,
  us-east-1d, us-east-1e, us-east-1f
  Each name refers to one AZ, not necessarily one building.

Why This Matters

Single AZ deployment:
  If the AZ has a power failure, your service is down.
  
Multi-AZ deployment:
  Your service runs in 2+ AZs. If one AZ fails,
  the others handle traffic. This is the minimum for
  production workloads.

Multi-region deployment:
  Your service runs in 2+ regions. If an entire region
  fails (rare but happens), the other region handles traffic.
  Complex and expensive. Only for critical services.
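
A rough calculation shows why running in a second zone helps so much. The sketch below assumes zones fail independently and that failover is instant, which real systems only approximate, and the 99.9% per-zone figure is a made-up input rather than any provider's published number.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def combined_availability(per_zone: float, zones: int) -> float:
    """Probability that at least one of `zones` independent zones is up."""
    return 1 - (1 - per_zone) ** zones


def downtime_minutes_per_year(availability: float) -> float:
    """Expected minutes per year during which the service is unavailable."""
    return (1 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    PER_ZONE = 0.999  # hypothetical availability of a single zone
    for zones in (1, 2, 3):
        a = combined_availability(PER_ZONE, zones)
        print(f"{zones} zone(s): availability {a:.9f}, "
              f"about {downtime_minutes_per_year(a):.4f} minutes down per year")
```

Under these assumptions each extra zone multiplies the expected downtime by the single-zone failure probability. Correlated failures, such as a region-wide outage, break the independence assumption, which is the case multi-region deployments exist to cover.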

The Shared Responsibility Model

The cloud provider and you share responsibility for security and operations. What each party owns depends on the service model.

                        You manage             Provider manages
IaaS (EC2):             App, data, OS,         Hardware, hypervisor,
                        network config,        physical security,
                        firewall rules         physical network

PaaS (RDS):             Data, access           Database engine, OS,
                        control, queries       patching, backups,
                                               hardware, networking

Managed storage (S3):   Data, access           Everything else
                        policies, encryption
                        settings

Common misunderstanding:
  "We're in the cloud, so security is the cloud provider's problem."

Reality:
  The cloud provider secures the infrastructure.
  You secure what you put on it.
  A misconfigured S3 bucket is your problem, not AWS's.
  An unpatched EC2 instance is your problem, not AWS's.
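
The misconfigured-bucket case can be checked mechanically. The sketch below scans a bucket policy document, written in the JSON policy format AWS uses, for Allow statements that apply to everyone. The bucket name, account ID, and `public_statements` helper are hypothetical, and the check is deliberately simplistic: it ignores ACLs and account-level public access settings, and it treats any statement with a Condition as restricted.

```python
import json


def _as_list(value):
    """Normalize a JSON value that may be missing, a single item, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def public_statements(policy_json: str):
    """Return Allow statements whose principal is everyone ("*").

    Simplification: a statement with a Condition is skipped, although a
    real audit would need to inspect the condition rather than trust it.
    """
    policy = json.loads(policy_json)
    flagged = []
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow" or "Condition" in stmt:
            continue
        principal = stmt.get("Principal")
        if principal == "*" or (
            isinstance(principal, dict) and "*" in _as_list(principal.get("AWS"))
        ):
            flagged.append(stmt)
    return flagged


# Hypothetical policy for a bucket named "example-bucket".
SAMPLE_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "PublicRead", "Effect": "Allow", "Principal": "*",
         "Action": "s3:GetObject", "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Sid": "TeamWrite", "Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::123456789012:role/uploader"},
         "Action": "s3:PutObject", "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
})

if __name__ == "__main__":
    for stmt in public_statements(SAMPLE_POLICY):
        print("Public access granted by statement:", stmt.get("Sid"))
```

The provider enforces whatever policy you write. Deciding that a bucket should not be world-readable is your side of the shared responsibility line.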

Why Cloud Wins for Most Companies

Building and operating your own data center is expensive, slow, and distracting. For most companies, the cloud is the right choice.

Your own data center:
  - Capital expense: $500K-$5M upfront for hardware
  - Lead time: 3-6 months to procure and set up
  - Staffing: need network engineers, sysadmins, physical security
  - Scaling: buy new hardware (weeks), hope you sized it right
  - Redundancy: you build it yourself (expensive)
  - Location: one or two data centers, limited geographic reach

Cloud:
  - Operating expense: pay monthly, scale up and down
  - Lead time: minutes to provision new resources
  - Staffing: no hardware engineers needed
  - Scaling: click a button or set an auto-scaling policy
  - Redundancy: multi-AZ and multi-region available
  - Location: deploy globally (each major provider operates dozens of regions)

When On-Premises Still Makes Sense

Situations where your own infrastructure may be better:
  - Extremely predictable, stable workloads (cheaper at scale)
  - Data sovereignty requirements (government, healthcare)
  - Ultra-low latency requirements (gaming, high-frequency trading)
  - Massive compute at scale (ML training at hyperscaler level)
  - Companies large enough to have dedicated infra teams

Real-World Example

A startup began on a single AWS EC2 instance running their monolithic application, a PostgreSQL database, and nginx as a reverse proxy. Monthly cost: $50.

As they grew to 10,000 users:

  - Moved the database to RDS (managed PostgreSQL) -- eliminated manual backups and patching
  - Added an Application Load Balancer and two EC2 instances -- survived instance failures
  - Put static assets on S3 and CloudFront (CDN) -- reduced latency globally
  - Added ElastiCache (Redis) for session storage -- improved response times

Monthly cost grew to $800, but they had zero-downtime deploys, automatic database backups, and global content delivery. An equivalent on-premises setup would have cost $20,000+ in hardware alone.

Common Pitfalls

  - Treating cloud like a data center -- Running everything on EC2 instances and managing them manually misses the point; use managed services where possible
  - Ignoring multi-AZ -- Running production in a single availability zone means a single facility failure takes your service down; always deploy across at least two AZs
  - No cost monitoring -- Cloud costs grow silently; a forgotten instance or an oversized database can cost thousands before anyone notices
  - Over-engineering from day one -- A startup does not need multi-region with Kubernetes on day one; start simple, add complexity as you grow
  - Not understanding the shared responsibility model -- Assuming the cloud provider handles security leads to misconfigured resources and data breaches
  - Vendor lock-in fear paralysis -- Some lock-in is inevitable and acceptable; the cost of avoiding all lock-in (abstraction layers, multi-cloud) is usually higher than the cost of switching later
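
For the cost-monitoring pitfall, even a very small report catches the forgotten instance. The sketch below works on a hand-written inventory. The `Resource` fields, the 5% idle threshold, and the budget figure are all assumptions for illustration; in practice the inputs would come from the provider's billing and monitoring data.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    monthly_cost: float     # dollars per month
    avg_cpu_percent: float  # average utilization over the billing period


def flag_waste(resources, idle_cpu=5.0, budget=1000.0):
    """Return (idle resources, total monthly spend, whether spend exceeds budget)."""
    idle = [r for r in resources if r.avg_cpu_percent < idle_cpu]
    total = sum(r.monthly_cost for r in resources)
    return idle, total, total > budget


# Hypothetical inventory.
INVENTORY = [
    Resource("web-1", 70.0, 41.0),
    Resource("web-2", 70.0, 38.5),
    Resource("old-demo-box", 140.0, 0.3),      # forgotten instance
    Resource("db-oversized", 900.0, 4.0),      # far larger than the load needs
]

if __name__ == "__main__":
    idle, total, over = flag_waste(INVENTORY)
    print(f"Total: ${total:.2f}/month, over budget: {over}")
    for r in idle:
        print(f"Possibly idle: {r.name} (${r.monthly_cost:.2f}/month, "
              f"{r.avg_cpu_percent}% CPU)")
```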

Key Takeaways

  - Cloud computing comes in three layers: IaaS (you manage the OS), PaaS (you manage the code), SaaS (you manage the configuration)
  - The core primitives are compute (VMs, containers, serverless), storage (object, block, file), databases (relational, NoSQL, in-memory), and networking (VPC, subnets, load balancers)
  - Regions are geographic areas; availability zones are isolated facilities within a region; always deploy across multiple AZs for production
  - The shared responsibility model means the cloud provider secures the infrastructure, but you secure everything you put on it
  - Cloud wins for most companies because of speed, flexibility, and reduced operational burden
  - Start simple with managed services and add complexity only when you need it