Cloud Fundamentals
Cloud computing is renting someone else's computers instead of buying your own. That is the one-sentence version. The longer version involves understanding the different service models, the infrastructure primitives, and the shared responsibility model that determines what you manage versus what the cloud provider manages.
Service Models
Cloud services are organized into three layers, each abstracting more infrastructure from you.
IaaS (Infrastructure as a Service)
IaaS gives you virtual machines, storage, and networking. You manage everything from the operating system up. The cloud provider manages the physical hardware, power, cooling, and network connectivity.
IaaS examples:
Compute: AWS EC2, Google Compute Engine, Azure VMs
Storage: AWS EBS, Google Persistent Disk, Azure Managed Disks
Networking: AWS VPC, Google VPC, Azure VNet
You manage: OS, runtime, application, data, middleware, scaling
Provider manages: Hardware, power, cooling, physical network, hypervisor
IaaS is the most flexible and the most work. You get full control over the operating system, networking, and security configuration. You also get all the operational burden.
PaaS (Platform as a Service)
PaaS provides a managed platform for running applications. You deploy your code and the platform handles the runtime, scaling, patching, and infrastructure.
PaaS examples:
AWS Elastic Beanstalk, Google App Engine, Azure App Service
Heroku, Railway, Render, Fly.io
AWS RDS (managed databases), Google Cloud SQL
You manage: Application code, data
Provider manages: Runtime, OS, scaling, patching, networking, hardware
PaaS is less flexible but dramatically less work. You cannot customize the operating system or install arbitrary software, but you do not have to patch servers or configure load balancers.
SaaS (Software as a Service)
SaaS is a complete application delivered over the internet. You do not manage anything except your data and configuration.
SaaS examples:
GitHub, Slack, Datadog, PagerDuty, Stripe
Google Workspace, Microsoft 365
You manage: Configuration, data, user accounts
Provider manages: Everything else
Core Infrastructure Primitives
Regardless of which cloud provider you use, the building blocks are the same.
Compute
Compute is processing power. The options range from full virtual machines to serverless functions.
Virtual Machines (IaaS):
AWS EC2, Google Compute Engine, Azure VMs
Full OS control. You choose CPU, memory, disk.
Best for: legacy apps, custom runtimes, full control
Containers (managed):
AWS ECS/EKS, Google Cloud Run/GKE, Azure AKS
Run containers with little or no VM management, depending on the service.
Best for: microservices, portable workloads
Serverless Functions:
AWS Lambda, Google Cloud Functions, Azure Functions
Run code in response to events. No servers to manage.
Pay per invocation and execution time.
Best for: event-driven workloads, APIs with variable traffic
Comparison:
VM: Most control, most work, always running
Container: Good balance, portable, can scale to zero on some platforms
Serverless: Least control, least work, pay only for use
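The "always running" versus "pay only for use" trade-off in the comparison above comes down to simple arithmetic. The sketch below is a rough illustration, not real pricing: the hourly rate, the per-million-invocations price, and the function names are all hypothetical, and it ignores duration charges, free tiers, and data transfer.

```python
HOURS_PER_MONTH = 730.0  # common approximation: 8,760 hours / 12 months


def monthly_vm_cost(hourly_rate: float, hours: float = HOURS_PER_MONTH) -> float:
    """An always-on VM is billed for every hour, whether busy or idle."""
    return hourly_rate * hours


def monthly_serverless_cost(invocations: int, price_per_million: float) -> float:
    """A function is billed per invocation (duration charges ignored here)."""
    return invocations / 1_000_000 * price_per_million


def break_even_invocations(hourly_rate: float, price_per_million: float) -> int:
    """Monthly invocation count at which both options cost the same."""
    return round(monthly_vm_cost(hourly_rate) / price_per_million * 1_000_000)


# Hypothetical prices, chosen only to make the arithmetic easy to follow.
VM_HOURLY = 0.05          # dollars per hour
PRICE_PER_MILLION = 2.00  # dollars per million invocations

for invocations in (100_000, 5_000_000, 50_000_000):
    vm = monthly_vm_cost(VM_HOURLY)
    fn = monthly_serverless_cost(invocations, PRICE_PER_MILLION)
    cheaper = "serverless" if fn < vm else "VM"
    print(f"{invocations:>11,} invocations: VM ${vm:.2f}, serverless ${fn:.2f} -> {cheaper}")

print("break-even:", break_even_invocations(VM_HOURLY, PRICE_PER_MILLION))
```

Below the break-even point the function is cheaper because idle time costs nothing; above it, the flat VM price wins. Real bills depend on memory size, execution time, and traffic shape, so treat this only as a way to frame the question.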
Storage
Object Storage:
AWS S3, Google Cloud Storage (GCS), Azure Blob Storage
Store files (images, videos, backups, logs).
Effectively unlimited capacity. Pay per GB stored plus requests.
Not a filesystem. Keys live in a flat namespace; folders are simulated with key prefixes.
Block Storage:
AWS EBS, Google Persistent Disk, Azure Managed Disks
Virtual hard drives attached to VMs.
Low latency. Fixed capacity (you choose the size).
File Storage:
AWS EFS, Google Filestore, Azure Files
Shared filesystem accessible by multiple VMs.
NFS or SMB protocols. Higher latency than block storage.
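The note above that object storage has no real folder hierarchy can be made concrete. The sketch below emulates, in plain Python, the prefix-and-delimiter style of listing that object stores typically offer. The function name, its return shape, and the sample keys are assumptions for illustration, not any provider's API.

```python
def list_objects(keys, prefix="", delimiter="/"):
    """Emulate a prefix/delimiter listing over a flat collection of keys.

    Returns (objects, common_prefixes): the keys that sit directly under
    `prefix`, and the simulated "folders" one level below it.
    """
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter is reported as a "folder".
            folder = rest.split(delimiter, 1)[0]
            common_prefixes.add(prefix + folder + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# A bucket is just a flat set of keys; the slashes are ordinary characters.
bucket = {
    "index.html",
    "logs/readme.txt",
    "logs/2024/01/app.log",
    "logs/2024/02/app.log",
}

print(list_objects(bucket))                  # (['index.html'], ['logs/'])
print(list_objects(bucket, prefix="logs/"))  # (['logs/readme.txt'], ['logs/2024/'])
```

Nothing named "logs/" exists in the bucket. The folder appears only because several keys share that prefix, which is why renaming a "folder" in object storage means rewriting every key under it.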
Databases
Relational (managed):
AWS RDS, Google Cloud SQL, Azure SQL Database
PostgreSQL, MySQL, SQL Server managed by the provider.
Automated backups, patching, failover.
NoSQL (managed):
AWS DynamoDB, Google Firestore, Azure Cosmos DB
Key-value, document, or wide-column stores.
Scales horizontally. Different consistency models.
In-memory:
AWS ElastiCache, Google Memorystore
Redis or Memcached managed by the provider.
Caching, session storage, real-time data.
Networking
VPC (Virtual Private Cloud):
Your own isolated network in the cloud.
You define IP ranges, subnets, and routing.
Subnets:
Subdivisions of your VPC.
Public subnets: accessible from the internet.
Private subnets: only accessible within the VPC.
Load Balancers:
Distribute traffic across multiple instances.
AWS ALB/NLB, Google Cloud Load Balancing, Azure Load Balancer.
Application LB: HTTP/HTTPS, path-based routing
Network LB: TCP/UDP, low latency, high throughput
DNS:
AWS Route 53, Google Cloud DNS, Azure DNS
Managed DNS with health checks and failover.
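Defining IP ranges and subnets for a VPC is mostly CIDR arithmetic, which Python's standard ipaddress module can illustrate. The 10.0.0.0/16 range, the /24 subnet size, and the subnet names below are hypothetical choices; a real layout depends on how many addresses each tier needs.

```python
import ipaddress

# Hypothetical VPC address range.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Split the VPC into /24 subnets and take the first four.
subnets = list(vpc.subnets(new_prefix=24))[:4]

# Hypothetical layout: one public and one private subnet in each of two AZs.
layout = {
    "public-a": subnets[0],
    "public-b": subnets[1],
    "private-a": subnets[2],
    "private-b": subnets[3],
}


def subnet_for(ip: str):
    """Return the name of the subnet containing `ip`, or None if unassigned."""
    addr = ipaddress.ip_address(ip)
    for name, net in layout.items():
        if addr in net:
            return name
    return None


for name, net in layout.items():
    print(f"{name}: {net} ({net.num_addresses} addresses)")

print(subnet_for("10.0.3.7"))   # private-b
print(subnet_for("10.0.9.1"))   # None: inside the VPC range, but no subnet yet
```

Whether a subnet is public or private is not a property of its address range. It is decided by routing, for example whether the subnet has a route to an internet gateway. Providers also reserve a few addresses in each subnet, so the usable count is slightly lower than the number printed here.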
Regions & Availability Zones
Cloud providers operate data centers across the globe, organized into regions and availability zones.
Region:
A geographic area (us-east-1, eu-west-1, ap-southeast-1).
Each region is independent. Data does not automatically
replicate between regions.
Availability Zone (AZ):
One or more isolated data centers within a region.
Each region typically has 2-6 AZs.
AZs have independent power, cooling, and networking.
Connected by low-latency links.
Example: us-east-1 (Virginia)
us-east-1a — Data center A
us-east-1b — Data center B
us-east-1c — Data center C
us-east-1d — Data center D
us-east-1e — Data center E
us-east-1f — Data center F
Why This Matters
Single AZ deployment:
If the AZ has a power failure, your service is down.
Multi-AZ deployment:
Your service runs in 2+ AZs. If one AZ fails,
the others handle traffic. This is the minimum for
production workloads.
Multi-region deployment:
Your service runs in 2+ regions. If an entire region
fails (rare, but it happens), the other region handles traffic.
Complex and expensive. Only for critical services.
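A back-of-the-envelope calculation shows why adding zones helps. The sketch below assumes that zone failures are independent and that traffic fails over instantly, both of which are simplifications. The 99% per-zone figure is hypothetical and not a provider SLA.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(per_zone: float, zones: int) -> float:
    """Probability that at least one zone is up, assuming independent failures."""
    return 1.0 - (1.0 - per_zone) ** zones


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours per year during which no zone is serving traffic."""
    return (1.0 - availability) * HOURS_PER_YEAR


PER_ZONE = 0.99  # hypothetical availability of a single zone

for zones in (1, 2, 3):
    a = combined_availability(PER_ZONE, zones)
    print(f"{zones} zone(s): {a:.6f} available, "
          f"about {downtime_hours_per_year(a):.2f} hours/year down")
```

Each extra zone multiplies the chance of total failure by the single-zone failure rate, so the second zone delivers most of the benefit. Correlated failures, such as a bad deploy or a region-wide outage, are not captured by this model, which is where multi-region designs come in.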
The Shared Responsibility Model
The cloud provider and you share responsibility for security and operations. What each party owns depends on the service model.
IaaS (EC2):
You manage: App, data, OS, network config, firewall rules
Provider manages: Hardware, hypervisor, physical security, physical network
PaaS (RDS):
You manage: Data, access control, queries
Provider manages: App runtime, OS, patching, backups, hardware, networking
SaaS-style managed service (S3):
You manage: Data, access policies, encryption settings
Provider manages: Everything else
Common misunderstanding:
"We're in the cloud, so security is the cloud provider's problem."
Reality:
The cloud provider secures the infrastructure.
You secure what you put on it.
A misconfigured S3 bucket is your problem, not AWS's.
An unpatched EC2 instance is your problem, not AWS's.
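One way to keep the split straight is to treat the stack as an ordered list of layers and record where your responsibility begins under each service model. The sketch below is a simplified model of the breakdown described above. The layer names and the who_manages helper are illustrative assumptions, and real services often divide responsibility within a layer.

```python
# Layers ordered from the bottom of the stack to the top.
LAYERS = ["hardware", "hypervisor", "os", "runtime", "application", "data"]

# The lowest layer that you, the customer, manage under each model.
FIRST_CUSTOMER_LAYER = {
    "iaas": "os",           # you manage everything from the OS up
    "paas": "application",  # you manage your code and data
    "saas": "data",         # you manage only data and configuration
}


def who_manages(model: str, layer: str) -> str:
    """Return 'you' or 'provider' for a layer under a service model."""
    start = LAYERS.index(FIRST_CUSTOMER_LAYER[model.lower()])
    return "you" if LAYERS.index(layer) >= start else "provider"


for model in ("IaaS", "PaaS", "SaaS"):
    yours = [layer for layer in LAYERS if who_manages(model, layer) == "you"]
    print(f"{model}: you manage {', '.join(yours)}")
```

Note that "data" is yours under every model, which is the point of the misconfigured-bucket example: no service model moves responsibility for your data and its access settings to the provider.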
Why Cloud Wins for Most Companies
Building and operating your own data center is expensive, slow, and distracting. For most companies, the cloud is the right choice.
Your own data center:
- Capital expense: $500K-$5M upfront for hardware
- Lead time: 3-6 months to procure and set up
- Staffing: need network engineers, sysadmins, physical security
- Scaling: buy new hardware (weeks), hope you sized it right
- Redundancy: you build it yourself (expensive)
- Location: one or two data centers, limited geographic reach
Cloud:
- Operating expense: pay monthly, scale up and down
- Lead time: minutes to provision new resources
- Staffing: no hardware engineers needed
- Scaling: click a button or set an auto-scaling policy
- Redundancy: multi-AZ and multi-region available
- Location: deploy globally (each major provider operates 20+ regions)
When On-Premises Still Makes Sense
Situations where your own infrastructure may be better:
- Extremely predictable, stable workloads (cheaper at scale)
- Data sovereignty requirements (government, healthcare)
- Ultra-low latency requirements (gaming, HFT)
- Massive compute at scale (ML training at hyperscaler level)
- Companies large enough to have dedicated infra teams
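The "cheaper at scale" point for predictable workloads is a break-even question: a large upfront purchase pays off only if the monthly savings eventually cover it. The sketch below uses the $500K upfront figure from the list above. The monthly running costs and the function names are hypothetical, and the model ignores hardware refresh cycles, discounts, and growth.

```python
import math


def cumulative_cost(upfront: float, monthly: float, months: int) -> float:
    """Total spend after `months`, given an upfront cost and a monthly cost."""
    return upfront + monthly * months


def break_even_month(onprem_upfront: float, onprem_monthly: float,
                     cloud_monthly: float):
    """First month at which on-premises total cost is no higher than cloud.

    Returns None if the cloud is never more expensive per month, in which
    case the upfront investment is never recovered.
    """
    if cloud_monthly <= onprem_monthly:
        return None
    return math.ceil(onprem_upfront / (cloud_monthly - onprem_monthly))


ONPREM_UPFRONT = 500_000  # low end of the hardware range quoted above
ONPREM_MONTHLY = 20_000   # hypothetical: staff, power, space
CLOUD_MONTHLY = 45_000    # hypothetical steady-state cloud bill

month = break_even_month(ONPREM_UPFRONT, ONPREM_MONTHLY, CLOUD_MONTHLY)
print("break-even month:", month)
print("on-prem total:", cumulative_cost(ONPREM_UPFRONT, ONPREM_MONTHLY, month))
print("cloud total:", cumulative_cost(0, CLOUD_MONTHLY, month))
```

The result is only as good as the assumption that the workload stays flat for the whole period. If demand drops or shifts before the break-even month, the hardware is a sunk cost, which is why this case is limited to stable, predictable workloads.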
Real-World Example
A startup began on a single AWS EC2 instance running its monolithic application, a PostgreSQL database, and nginx as a reverse proxy. Monthly cost: $50.
As they grew to 10,000 users:
- Moved the database to RDS (managed PostgreSQL) -- eliminated manual backups and patching
- Added an Application Load Balancer and two EC2 instances -- survived instance failures
- Put static assets on S3 and CloudFront (CDN) -- reduced latency globally
- Added ElastiCache (Redis) for session storage -- improved response times
Monthly cost grew as services were added, but the equivalent on-premises setup would have cost $20,000+ in hardware alone.
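The load balancer step in this example survives instance failures because traffic is sent only to instances that pass health checks. The sketch below shows that idea with a simple round-robin rotation. The RoundRobinBalancer class and the instance names are hypothetical, and managed load balancers use more sophisticated algorithms and run the health checks themselves.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Rotate through instances, skipping any that are marked unhealthy."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._ring = cycle(self.instances)

    def mark_unhealthy(self, instance):
        self.healthy.discard(instance)

    def mark_healthy(self, instance):
        if instance in self.instances:
            self.healthy.add(instance)

    def route(self):
        """Return the next healthy instance, or raise if none are left."""
        # One full pass over the ring is enough to find a healthy instance.
        for _ in range(len(self.instances)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances available")


balancer = RoundRobinBalancer(["web-1", "web-2"])
print([balancer.route() for _ in range(4)])  # alternates web-1, web-2

balancer.mark_unhealthy("web-1")             # simulate an instance failure
print([balancer.route() for _ in range(3)])  # only web-2 receives traffic
```

With a single instance behind the balancer, one failure leaves nothing to route to. This is why the example adds two instances along with the load balancer.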
Common Pitfalls
- Treating cloud like a data center -- Running everything on EC2 instances and managing them manually misses the point; use managed services where possible
- Ignoring multi-AZ -- Running production in a single availability zone means a data center failure takes your service down; always deploy across at least two AZs
- No cost monitoring -- Cloud costs grow silently; a forgotten instance or an oversized database can cost thousands before anyone notices
- Over-engineering from day one -- A startup does not need multi-region with Kubernetes on day one; start simple, add complexity as you grow
- Not understanding the shared responsibility model -- Assuming the cloud provider handles security leads to misconfigured resources and data breaches
- Vendor lock-in fear paralysis -- Some lock-in is inevitable and acceptable; the cost of avoiding all lock-in (abstraction layers, multi-cloud) is usually higher than the cost of switching later
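The cost monitoring pitfall above can be caught with even a crude periodic check. The sketch below flags idle and expensive resources from an inventory list. The Resource fields, the thresholds, and the sample data are all hypothetical; in practice the input would come from the provider's billing export and monitoring metrics, and the provider's own budget alerts would usually be the first line of defense.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    monthly_cost: float     # dollars per month
    avg_cpu_percent: float  # average utilization over the period


def flag_resources(resources, cost_threshold=100.0, idle_cpu_percent=5.0):
    """Return (name, reason) pairs for resources that deserve a closer look."""
    findings = []
    for r in resources:
        if r.avg_cpu_percent < idle_cpu_percent:
            findings.append((r.name, "idle"))       # possibly forgotten
        elif r.monthly_cost > cost_threshold:
            findings.append((r.name, "expensive"))  # possibly oversized
    return findings


# Hypothetical inventory for illustration.
inventory = [
    Resource("web-1", monthly_cost=35.0, avg_cpu_percent=40.0),
    Resource("old-test-box", monthly_cost=70.0, avg_cpu_percent=0.5),
    Resource("reporting-db", monthly_cost=900.0, avg_cpu_percent=12.0),
]

for name, reason in flag_resources(inventory):
    print(f"{name}: {reason}")
```

A flag is a prompt to investigate, not a verdict: a database can be expensive and correctly sized, and a standby instance can be idle on purpose.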
Key Takeaways
- Cloud computing comes in three layers: IaaS (you manage the OS), PaaS (you manage the code), SaaS (you manage the configuration)
- The core primitives are compute (VMs, containers, serverless), storage (object, block, file), databases (relational, NoSQL, in-memory), and networking (VPC, subnets, load balancers)
- Regions are geographic areas; availability zones are isolated groups of one or more data centers within a region; always deploy across multiple AZs for production
- The shared responsibility model means the cloud provider secures the infrastructure, but you secure everything you put on it
- Cloud wins for most companies because of speed, flexibility, and reduced operational burden
- Start simple with managed services and add complexity only when you need it