Infrastructure as Code
IaC Principles
Infrastructure as Code (IaC) manages infrastructure through machine-readable definition files rather than manual processes.
Core Properties
| Principle | Description | Benefit |
|-----------|-------------|---------|
| Declarative | Define desired state, not steps | Engine determines how to reach state |
| Idempotent | Applying same config repeatedly yields same result | Safe to re-run after failures |
| Versioned | Stored in version control | Audit trail, rollback, code review |
| Reproducible | Same config produces identical infrastructure | Consistent environments |
| Self-documenting | Code is the documentation | No config drift from undocumented changes |
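Idempotency is the property that makes a re-run safe. As a minimal sketch (the firewall rule list and both function names are hypothetical stand-ins, not a real cloud API), compare a step that appends on every run with one that checks the end state first:

```typescript
// Hypothetical in-memory firewall rule list; not a real cloud API.
type FirewallRules = string[];

// Non-idempotent: each run appends, so a retry duplicates the rule.
function addRule(rules: FirewallRules, rule: string): void {
  rules.push(rule);
}

// Idempotent: acts only when the desired end state is not yet met.
function ensureRule(rules: FirewallRules, rule: string): void {
  if (rules.indexOf(rule) === -1) {
    rules.push(rule);
  }
}

const scripted: FirewallRules = [];
const converged: FirewallRules = [];
for (let run = 0; run < 3; run++) {
  addRule(scripted, "allow tcp/443");
  ensureRule(converged, "allow tcp/443");
}

console.log(scripted.length);  // 3
console.log(converged.length); // 1
```

Running the idempotent version three times leaves a single rule, which is why a failed apply can simply be retried.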
Declarative vs Imperative
| Declarative (Terraform, CloudFormation) | Imperative (shell scripts, direct CLI/SDK calls) |
|------------------------------------------|--------------------------------------------------|
| "I want 3 servers with 4 GB RAM" | "Create server A, then B, then C" |
| Engine figures out the plan | You specify every step |
| Handles create/update/delete | You handle state transitions |
| Converges to desired state | You manage ordering |
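To make "the engine figures out the plan" concrete, here is a minimal sketch of a planner that diffs current against desired state and emits create/update/delete actions. The `ServerSpec` type, the `planChanges` function, and the server names are hypothetical; real engines also track dependencies and provider-specific behavior.

```typescript
// Hypothetical resource model: server name -> memory size in GB.
type ServerSpec = { memoryGb: number };
type PlanAction = { op: "create" | "update" | "delete"; name: string };

// Compare current and desired state and list the actions needed to converge.
function planChanges(
  current: Record<string, ServerSpec>,
  desired: Record<string, ServerSpec>,
): PlanAction[] {
  const actions: PlanAction[] = [];
  for (const name of Object.keys(desired).sort()) {
    const have = current[name];
    const want = desired[name];
    if (want === undefined) continue;
    if (have === undefined) {
      actions.push({ op: "create", name });
    } else if (have.memoryGb !== want.memoryGb) {
      actions.push({ op: "update", name });
    }
  }
  for (const name of Object.keys(current).sort()) {
    if (!(name in desired)) {
      actions.push({ op: "delete", name });
    }
  }
  return actions;
}

const currentServers: Record<string, ServerSpec> = {
  "server-a": { memoryGb: 2 },
  "server-old": { memoryGb: 4 },
};
const desiredServers: Record<string, ServerSpec> = {
  "server-a": { memoryGb: 4 },
  "server-b": { memoryGb: 4 },
  "server-c": { memoryGb: 4 },
};

const planSummary = planChanges(currentServers, desiredServers)
  .map((action) => `${action.op} ${action.name}`)
  .join(", ");
console.log(planSummary);
// update server-a, create server-b, create server-c, delete server-old
```

The author only states the three desired servers; the update and the delete fall out of the diff rather than being scripted by hand.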
Terraform
Terraform is the most widely adopted multi-cloud IaC tool. Configurations are written in HashiCorp Configuration Language (HCL).
Architecture
┌───────────┐     ┌───────────┐     ┌───────────────┐
│ HCL Code  │────►│ Terraform │────►│   Providers   │
│  (.tf)    │     │   Core    │     │ ┌───────────┐ │
└───────────┘     │           │     │ │ AWS       │ │
┌───────────┐     │ Plan ──►  │     │ │ GCP       │ │
│  State    │◄───►│ Apply     │     │ │ Azure     │ │
│ (.tfstate)│     │ Destroy   │     │ │ Kubernetes│ │
└───────────┘     └───────────┘     │ │ Datadog   │ │
                                    │ └───────────┘ │
                                    └───────────────┘
Providers
Providers are plugins that interface with APIs of cloud platforms and services.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"

  default_tags {
    tags = {
      Environment = var.environment
      ManagedBy   = "terraform"
    }
  }
}
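The `~>` operator in `version = "~> 5.0"` is a pessimistic constraint: only the rightmost listed component may increase. The sketch below illustrates that rule for plain numeric versions; the function names are hypothetical and pre-release suffixes are deliberately not handled, so it is an approximation rather than Terraform's own resolver.

```typescript
// Split "5.31.0" into [5, 31, 0]. Pre-release suffixes are out of scope here.
function parseVersion(text: string): number[] {
  return text.split(".").map((part) => parseInt(part, 10));
}

// Negative if a < b, zero if equal, positive if a > b (missing parts count as 0).
function compareVersions(a: number[], b: number[]): number {
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length; i++) {
    const diff = (a[i] || 0) - (b[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// "~> 5.0"   means >= 5.0   and < 6.0
// "~> 5.1.0" means >= 5.1.0 and < 5.2.0
function satisfiesPessimistic(constraint: string, candidate: string): boolean {
  const lower = parseVersion(constraint.replace("~>", "").trim());
  if (lower.length < 2) {
    throw new Error("this sketch expects at least MAJOR.MINOR");
  }
  // Upper bound: drop the last component, then bump the new last one.
  const upper = lower.slice(0, -1);
  upper[upper.length - 1] = (upper[upper.length - 1] || 0) + 1;
  const version = parseVersion(candidate);
  return compareVersions(version, lower) >= 0 && compareVersions(version, upper) < 0;
}

console.log(satisfiesPessimistic("~> 5.0", "5.31.0"));  // true
console.log(satisfiesPessimistic("~> 5.0", "6.0.0"));   // false
console.log(satisfiesPessimistic("~> 5.1.0", "5.2.0")); // false
```

With `~> 5.0`, any 5.x provider release is accepted while the next major version is not.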
Resource Configuration
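# Assumes var.project, var.azs, and data.aws_ami.ubuntu are declared elsewhere.
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  tags                 = { Name = "${var.project}-vpc" }
}

# One private subnet per availability zone, each a /24 carved from the VPC range.
resource "aws_subnet" "private" {
  count             = length(var.azs)
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
  availability_zone = var.azs[count.index]
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
  subnet_id     = aws_subnet.private[0].id

  lifecycle {
    create_before_destroy = true  # build the replacement before removing the old instance
    prevent_destroy       = false # set to true to reject any plan that would destroy it
  }
}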
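The `cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)` call above extends the /16 prefix by 8 bits and selects the subnet numbered `count.index`. The following is an illustrative IPv4-only re-implementation of that arithmetic (the function names are hypothetical and this is not Terraform's source), useful for checking which ranges each subnet receives:

```typescript
// Convert dotted-quad IPv4 text to a number (plain arithmetic avoids signed 32-bit issues).
function ipToNumber(ip: string): number {
  return ip.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

function numberToIp(value: number): string {
  return [24, 16, 8, 0]
    .map((shift) => Math.floor(value / Math.pow(2, shift)) % 256)
    .join(".");
}

// IPv4-only sketch of cidrsubnet(prefix, newbits, netnum).
function cidrSubnet(prefix: string, newBits: number, netNum: number): string {
  const [base, lengthText] = prefix.split("/");
  if (base === undefined || lengthText === undefined) {
    throw new Error("expected CIDR notation such as 10.0.0.0/16");
  }
  const parentLength = Number(lengthText);
  const newLength = parentLength + newBits;
  if (newLength > 32) throw new Error("not enough address bits left");
  if (netNum < 0 || netNum >= Math.pow(2, newBits)) throw new Error("netnum out of range");

  const parentSize = Math.pow(2, 32 - parentLength); // addresses in the parent range
  const subnetSize = Math.pow(2, 32 - newLength);    // addresses in each new subnet
  const network = Math.floor(ipToNumber(base) / parentSize) * parentSize;
  return `${numberToIp(network + netNum * subnetSize)}/${newLength}`;
}

for (let index = 0; index < 3; index++) {
  console.log(cidrSubnet("10.0.0.0/16", 8, index));
}
// 10.0.0.0/24
// 10.0.1.0/24
// 10.0.2.0/24
```

With three availability zones, the subnets therefore receive 10.0.0.0/24, 10.0.1.0/24, and 10.0.2.0/24.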
Modules
Modules are reusable, encapsulated infrastructure components.
# Using a module
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.1.0"

  name = "production"
  cidr = "10.0.0.0/16"

  azs             = ["us-east-1a", "us-east-1b", "us-east-1c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  enable_nat_gateway = true
  single_nat_gateway = var.environment != "production"
}
State Management
Terraform state maps real-world resources to configuration.
# Remote state backend (required for team collaboration)
terraform {
  backend "s3" {
    bucket         = "company-terraform-state"
    key            = "prod/networking/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks" # State locking
    encrypt        = true
  }
}
State best practices:
- Always use remote backends (S3, GCS, Azure Blob, Terraform Cloud)
- Enable state locking to prevent concurrent modifications
- Use separate state files per environment and component
- Never commit state files to version control (they can contain secrets in plain text)
- Use `terraform import` to bring existing resources under Terraform management
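State locking, as recommended in the list above, amounts to a conditional write: a run may proceed only if no other run holds the lock for the same state key. A minimal in-memory sketch follows; the `StateLockTable` class and the operator names are hypothetical, and a real backend performs this check atomically in the lock store rather than in process memory.

```typescript
type LockInfo = { owner: string; operation: string };

// Hypothetical stand-in for a lock table keyed by state file path.
class StateLockTable {
  private locks: Record<string, LockInfo> = {};

  // Succeeds only when nobody currently holds the lock for this state key.
  acquire(stateKey: string, info: LockInfo): boolean {
    if (stateKey in this.locks) return false;
    this.locks[stateKey] = info;
    return true;
  }

  // Only the current holder may release the lock.
  release(stateKey: string, owner: string): boolean {
    const held = this.locks[stateKey];
    if (held === undefined || held.owner !== owner) return false;
    delete this.locks[stateKey];
    return true;
  }
}

const lockTable = new StateLockTable();
const stateKey = "prod/networking/terraform.tfstate";

const aliceLocked = lockTable.acquire(stateKey, { owner: "alice", operation: "apply" });
const bobBlocked = lockTable.acquire(stateKey, { owner: "bob", operation: "apply" });
const aliceReleased = lockTable.release(stateKey, "alice");
const bobLocked = lockTable.acquire(stateKey, { owner: "bob", operation: "apply" });

console.log(aliceLocked, bobBlocked, aliceReleased, bobLocked); // true false true true
```

The second apply is rejected until the first finishes, which is what prevents two concurrent runs from overwriting each other's state.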
Terraform Workflow
terraform init → Download providers and modules
terraform plan → Preview changes (diff current vs desired)
terraform apply → Execute the plan
terraform destroy → Remove all managed resources
terraform fmt → Format code consistently
terraform validate → Check syntax and internal consistency
terraform state → Inspect and manipulate state
Pulumi
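Pulumi defines infrastructure in general-purpose programming languages instead of a DSL. The program still declares a desired state, which the Pulumi engine compares against its recorded state to decide what to create, update, or delete.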
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

// Current stack name (for example "dev" or "prod"), used to namespace resource names.
const stack = pulumi.getStack();

const bucket = new aws.s3.Bucket("my-bucket", {
  versioning: { enabled: true },
  serverSideEncryptionConfiguration: {
    rule: {
      applyServerSideEncryptionByDefault: {
        sseAlgorithm: "aws:kms",
      },
    },
  },
});

// Use real programming constructs
const topics = ["orders", "payments", "inventory"];
const snsTopics = topics.map(name =>
  new aws.sns.Topic(name, { name: `${stack}-${name}` })
);

export const bucketName = bucket.id;
Advantages over Terraform: Familiar languages (TypeScript, Python, Go, C#), real conditionals/loops, IDE support, unit testing with standard frameworks.
AWS CDK
AWS CDK generates CloudFormation templates from high-level constructs.
Construct Levels
L1 (Cfn): Direct CloudFormation resources → CfnBucket
L2 (Curated): Sensible defaults + helpers → Bucket (encryption, versioning)
L3 (Patterns): Multi-resource architectures → LambdaRestApi (API GW + Lambda + IAM)
import { Duration, RemovalPolicy, Stack, StackProps } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as s3n from 'aws-cdk-lib/aws-s3-notifications';

// Constructs need a scope, so the resources are defined inside a Stack subclass
// (the class name DataPipelineStack is illustrative).
export class DataPipelineStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const bucket = new s3.Bucket(this, 'DataBucket', {
      encryption: s3.BucketEncryption.S3_MANAGED,
      versioned: true,
      removalPolicy: RemovalPolicy.RETAIN,
      lifecycleRules: [{
        transitions: [{ storageClass: s3.StorageClass.GLACIER, transitionAfter: Duration.days(90) }],
      }],
    });

    const processor = new lambda.Function(this, 'Processor', {
      runtime: lambda.Runtime.PYTHON_3_12,
      handler: 'index.handler',
      code: lambda.Code.fromAsset('lambda/'),
    });

    // Invoke the function for every new object in the bucket.
    bucket.addEventNotification(s3.EventType.OBJECT_CREATED, new s3n.LambdaDestination(processor));
    bucket.grantRead(processor); // Automatically creates IAM policy
  }
}
CloudFormation
AWS-native IaC using JSON/YAML templates. CDK and SAM compile down to CloudFormation.
- Stacks: Unit of deployment, creates/updates/deletes as a group
- Change sets: Preview changes before applying
- Drift detection: Identify manual changes outside IaC
- StackSets: Deploy across multiple accounts and regions
Crossplane
Crossplane extends Kubernetes to manage cloud infrastructure using the Kubernetes API.
apiVersion: database.aws.crossplane.io/v1beta1
kind: RDSInstance
metadata:
  name: production-db
spec:
  forProvider:
    region: us-east-1
    dbInstanceClass: db.r6g.xlarge
    engine: postgres
    engineVersion: "15"
    masterUsername: dbadmin  # "admin" is a reserved word for RDS PostgreSQL
    allocatedStorage: 100
  writeConnectionSecretToRef:
    name: db-credentials
    namespace: production
- Manages infrastructure using `kubectl` and Kubernetes reconciliation loops
- Compositions group multiple resources into reusable platform APIs
- Native integration with Kubernetes RBAC and namespaces
GitOps
GitOps uses Git as the single source of truth for infrastructure and application state.
ArgoCD
Git Repository                      ArgoCD                   Kubernetes
┌─────────────┐       Sync       ┌──────────┐               ┌──────────┐
│ manifests/  │◄── monitors ────►│ ArgoCD   │── applies ───►│ Cluster  │
│ kustomize/  │   for changes    │ Server   │               │          │
│ helm/       │                  │          │◄── reports ───┤          │
└─────────────┘                  └──────────┘    status     └──────────┘
Flux
- Lightweight GitOps operator, CNCF graduated project
- Source controllers pull from Git, Helm, S3, OCI
- Kustomization controller applies manifests with dependency ordering
- Supports multi-tenancy with namespaced access control
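Dependency ordering, as mentioned for the Kustomization controller, can be pictured as a topological sort over each item's `dependsOn` list. The sketch below is an illustration of that idea only; the `KustomizationSpec` shape and the item names are simplified assumptions, not Flux's implementation.

```typescript
// Simplified assumption: each item names the items that must be applied before it.
type KustomizationSpec = { name: string; dependsOn: string[] };

// Depth-first topological sort: dependencies always appear before dependents.
function applyOrder(items: KustomizationSpec[]): string[] {
  const byName: Record<string, KustomizationSpec> = {};
  for (const item of items) byName[item.name] = item;

  const status: Record<string, "visiting" | "done"> = {};
  const order: string[] = [];

  const visit = (name: string): void => {
    if (status[name] === "done") return;
    if (status[name] === "visiting") throw new Error(`dependency cycle at ${name}`);
    const item = byName[name];
    if (item === undefined) throw new Error(`unknown dependency ${name}`);
    status[name] = "visiting";
    for (const dependency of item.dependsOn) visit(dependency);
    status[name] = "done";
    order.push(name);
  };

  for (const item of items) visit(item.name);
  return order;
}

const fluxOrder = applyOrder([
  { name: "apps", dependsOn: ["infrastructure", "monitoring"] },
  { name: "monitoring", dependsOn: ["infrastructure"] },
  { name: "infrastructure", dependsOn: [] },
]);
console.log(fluxOrder.join(" -> ")); // infrastructure -> monitoring -> apps
```

A cycle or a reference to an unknown item raises an error instead of producing a partial order.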
GitOps Principles
- Declarative: Entire system described declaratively in Git
- Versioned: Git history provides audit trail and rollback
- Automated: Approved changes are applied automatically
- Reconciled: Agents continuously correct drift from desired state
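The "Reconciled" principle can be sketched as a loop body that compares what Git declares with what the cluster reports and corrects any difference. The model below is hypothetical (deployment names mapped to replica counts); real agents such as ArgoCD or Flux work on full Kubernetes manifests.

```typescript
// Hypothetical model: deployment name -> replica count.
type ReplicaState = Record<string, number>;

// One reconciliation pass: make the cluster match Git and report each correction.
function reconcileOnce(gitState: ReplicaState, cluster: ReplicaState): string[] {
  const corrections: string[] = [];
  for (const name of Object.keys(gitState).sort()) {
    const want = gitState[name];
    if (want === undefined) continue;
    const have = cluster[name];
    if (have !== want) {
      const shown = have === undefined ? "absent" : String(have);
      corrections.push(`${name}: ${shown} -> ${want}`);
      cluster[name] = want;
    }
  }
  for (const name of Object.keys(cluster).sort()) {
    if (!(name in gitState)) {
      corrections.push(`${name}: pruned`);
      delete cluster[name];
    }
  }
  return corrections;
}

const declaredInGit: ReplicaState = { api: 3, worker: 2 };
const liveCluster: ReplicaState = { api: 3, worker: 2 };

liveCluster["api"] = 5;       // someone scaled the deployment by hand
liveCluster["debug-pod"] = 1; // created outside Git

const firstPass = reconcileOnce(declaredInGit, liveCluster);
const secondPass = reconcileOnce(declaredInGit, liveCluster);
console.log(firstPass);  // [ 'api: 5 -> 3', 'debug-pod: pruned' ]
console.log(secondPass); // []
```

The second pass finds nothing to do, which is the steady state a GitOps agent returns to after each drift.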
Policy as Code
Open Policy Agent (OPA)
OPA evaluates policies written in Rego against structured data.
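package terraform.analysis

import rego.v1

# Deny resources without required tags
deny contains msg if {
    resource := input.resource_changes[_]
    not resource.change.after.tags["Environment"]
    msg := sprintf("Resource %s missing required 'Environment' tag", [resource.address])
}

# Enforce encryption on S3 buckets
deny contains msg if {
    resource := input.resource_changes[_]
    resource.type == "aws_s3_bucket"
    not has_encryption(resource)
    msg := sprintf("S3 bucket %s must have encryption enabled", [resource.address])
}

# Helper (assumption: encryption is declared inline on the bucket; AWS provider v4+
# uses a separate aws_s3_bucket_server_side_encryption_configuration resource,
# which would need its own check)
has_encryption(resource) if {
    count(resource.change.after.server_side_encryption_configuration) > 0
}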
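For readers less familiar with Rego, the first rule above can be read as a loop over `resource_changes` in the plan JSON. The TypeScript below mirrors that logic as a sketch: the `PlanInput` type covers only the fields the rule touches, and the sample plan is made up for illustration.

```typescript
// Only the plan fields that the tag rule reads.
type ResourceChange = {
  address: string;
  type: string;
  change: { after: { tags?: Record<string, string> | null } | null };
};
type PlanInput = { resource_changes: ResourceChange[] };

// Mirrors the Rego tag rule: one message per resource lacking an Environment tag.
function denyMissingEnvironmentTag(input: PlanInput): string[] {
  const messages: string[] = [];
  for (const resource of input.resource_changes) {
    const tags = resource.change.after?.tags;
    if (!tags || tags["Environment"] === undefined) {
      messages.push(`Resource ${resource.address} missing required 'Environment' tag`);
    }
  }
  return messages;
}

// Made-up plan excerpt for illustration.
const samplePlan: PlanInput = {
  resource_changes: [
    { address: "aws_vpc.main", type: "aws_vpc", change: { after: { tags: { Name: "demo-vpc" } } } },
    { address: "aws_instance.web", type: "aws_instance", change: { after: { tags: { Environment: "prod" } } } },
    { address: "aws_subnet.private[0]", type: "aws_subnet", change: { after: { tags: null } } },
  ],
};

const violations = denyMissingEnvironmentTag(samplePlan);
console.log(violations.length); // 2
```

In CI, a non-empty result would fail the pipeline before `terraform apply` runs.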
Policy Tools Comparison
| Tool | Integration | Language | Use Case |
|------|-------------|----------|----------|
| OPA/Conftest | Terraform, K8s, CI/CD | Rego | General policy evaluation |
| Sentinel | Terraform Cloud/Enterprise | Sentinel | Terraform-specific governance |
| Checkov | Terraform, CloudFormation, K8s | Python/YAML | Security scanning |
| tfsec/trivy | Terraform | Go | Static security analysis |
| Kyverno | Kubernetes-native | YAML | K8s admission control |
IaC Tool Selection
Multi-cloud required?
└─ Yes → Team prefers DSL?
    └─ Yes → Terraform
    └─ No → Pulumi
└─ No → AWS only?
    └─ Yes → CDK or SAM (serverless)
    └─ No → Kubernetes-centric?
        └─ Yes → Crossplane
        └─ No → Provider-native (CloudFormation, Deployment Manager)
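The decision tree above translates directly into a small function, which can be handy as a checklist in onboarding docs or tooling. The `TeamNeeds` type and the `selectIacTool` name are hypothetical; the branches follow the tree as written.

```typescript
// Inputs correspond to the questions in the decision tree.
type TeamNeeds = {
  multiCloud: boolean;
  prefersDsl: boolean;
  awsOnly: boolean;
  kubernetesCentric: boolean;
};

function selectIacTool(needs: TeamNeeds): string {
  if (needs.multiCloud) {
    return needs.prefersDsl ? "Terraform" : "Pulumi";
  }
  if (needs.awsOnly) {
    return "CDK or SAM (serverless)";
  }
  if (needs.kubernetesCentric) {
    return "Crossplane";
  }
  return "Provider-native (CloudFormation, Deployment Manager)";
}

const recommendation = selectIacTool({
  multiCloud: true,
  prefersDsl: false,
  awsOnly: false,
  kubernetesCentric: false,
});
console.log(recommendation); // Pulumi
```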
Key Takeaways
- IaC enables reproducible, version-controlled, reviewable infrastructure changes
- Terraform is the de facto standard for multi-cloud IaC with its provider ecosystem
- Pulumi and CDK bring general-purpose languages and IDE support to IaC
- State management is critical; always use remote backends with locking
- GitOps (ArgoCD, Flux) applies Git-based workflows to Kubernetes operations
- Policy as code (OPA, Sentinel) enforces governance guardrails in CI/CD pipelines