Encryption & Data Protection

Encryption transforms readable data into an unreadable form that can only be reversed with the correct key. It is the fundamental mechanism for protecting data confidentiality, whether data is moving across a network or sitting in a database.

TLS (Transport Layer Security)

TLS encrypts data in transit between clients and servers. Every HTTPS connection uses TLS.

TLS Handshake

TLS 1.3 handshake (simplified):

  Client --> Server: ClientHello
    Supported cipher suites, key share, random value
    
  Server --> Client: ServerHello
    Chosen cipher suite, key share, random value
    
  Both sides compute shared secret from key shares (ECDHE)
  
  Server --> Client: Certificate, CertificateVerify, Finished (encrypted)
  Client --> Server: Finished (encrypted)
  
  All subsequent data is encrypted with keys derived from the shared secret
  
  TLS 1.3 completes in 1 round trip (1-RTT)
  TLS 1.2 required 2 round trips (2-RTT)

Certificate Verification

Certificate chain verification:
  Server presents: server certificate, signed by intermediate CA
  Client verifies:
    1. Certificate matches the domain (SAN; modern clients ignore the legacy CN field)
    2. Certificate is not expired
    3. Signature chain leads to a trusted root CA
    4. Certificate is not revoked (OCSP or CRL check)
  
  If any check fails: connection terminated, user sees security warning
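
A minimal sketch of how a client can request these checks, using Python's standard `ssl` module. The function names `make_client_context` and `fetch_peer_cert` are made up for this illustration. The default context enforces chain, expiry, and hostname checks; as far as the standard library goes, it does not perform OCSP or CRL revocation checks unless configured separately, so treat step 4 as out of scope here.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client context using the standard library's default verification settings."""
    ctx = ssl.create_default_context()  # loads the system's trusted root CAs
    # Both settings are already the defaults; asserted to make the checks explicit.
    assert ctx.verify_mode == ssl.CERT_REQUIRED  # chain must reach a trusted root
    assert ctx.check_hostname  # certificate must match the requested host
    return ctx


def fetch_peer_cert(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect, verify the server certificate, and return its parsed fields.

    Raises ssl.SSLCertVerificationError if verification fails
    (wrong host, expired certificate, untrusted chain).
    """
    ctx = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

Calling `fetch_peer_cert("example.com")` needs network access; a failed check surfaces as an exception during the handshake rather than as a return value.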

TLS Best Practices

Configuration guidelines:
  - Use TLS 1.3 (or minimum TLS 1.2)
  - Disable TLS 1.0 and 1.1 (known vulnerabilities)
  - Use strong cipher suites (ECDHE for key exchange, AES-256-GCM)
  - Enable HSTS (force browsers to always use HTTPS)
  - Automate certificate renewal (Let's Encrypt, AWS ACM)
  - Pin certificates only for mobile apps (not web, too fragile)
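
Two of these guidelines, the minimum protocol version and HSTS, can be sketched with Python's standard `ssl` module. The function name `make_server_context`, the optional certificate paths, and the one-year `max-age` value are assumptions chosen for illustration, not recommendations from a specific deployment.

```python
import ssl
from typing import Optional

# HSTS response header; the one-year max-age is an illustrative assumption.
HSTS_HEADER = ("Strict-Transport-Security", "max-age=31536000; includeSubDomains")


def make_server_context(certfile: Optional[str] = None,
                        keyfile: Optional[str] = None) -> ssl.SSLContext:
    """Server-side TLS context that refuses TLS 1.0 and 1.1."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)  # server-side context
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # TLS 1.3 is negotiated when both sides support it
    if certfile is not None:
        # Hypothetical paths supplied by the caller (e.g. files renewed by automation).
        ctx.load_cert_chain(certfile, keyfile)
    return ctx
```

A web server would pass this context to its listener and attach `HSTS_HEADER` to every HTTPS response.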

Cloudflare terminates TLS for millions of websites, handling the certificate management and handshake overhead at the edge so origin servers do not need to.

Encryption at Rest

Encryption at rest protects stored data. If someone gains access to the storage medium (disk, backup, database dump), encrypted data remains unreadable.

Levels of Encryption at Rest

Full-disk encryption:
  Entire storage volume is encrypted
  Transparent to applications
  Protects against physical theft of drives
  AWS EBS encryption, Azure Disk Encryption
  
  Limitation: data is decrypted when the OS is running.
  Anyone with OS-level access reads plaintext.

Database-level encryption (TDE):
  Database engine encrypts data files
  Transparent to applications (queries return plaintext)
  SQL Server TDE, Oracle TDE, MySQL InnoDB tablespace encryption
  (PostgreSQL's pgcrypto provides column-level functions, not transparent encryption)
  
  Limitation: database administrators see plaintext.
  Backups may or may not be encrypted depending on config.

Application-level encryption:
  Application encrypts specific fields before storing
  Database only sees ciphertext
  Even database admins cannot read the data
  
  Example: encrypt SSN before INSERT, decrypt after SELECT
  
  Limitation: the database cannot search, sort, or index encrypted fields by their plaintext values.
  More complex key management in application code.

Envelope Encryption

Envelope encryption uses two layers of keys to balance security with performance.

Envelope encryption:
  Data Encryption Key (DEK): encrypts the actual data
    Generated per record or per batch
    Symmetric key (AES-256), fast encryption
    
  Key Encryption Key (KEK): encrypts the DEK
    Stored in a key management service (KMS)
    Never leaves the KMS boundary
    
  Process:
    1. Generate random DEK
    2. Encrypt data with DEK
    3. Encrypt DEK with KEK (via KMS API call)
    4. Store encrypted data + encrypted DEK together
    5. Discard plaintext DEK from memory
    
  Decryption:
    1. Read encrypted DEK from storage
    2. Send encrypted DEK to KMS for decryption
    3. KMS returns plaintext DEK
    4. Decrypt data with plaintext DEK
    5. Discard plaintext DEK from memory

AWS S3 uses envelope encryption for server-side encryption. Each object gets a unique DEK, and the DEK is encrypted by a KMS-managed KEK. This means compromising one DEK exposes only one object, not the entire bucket.
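
The encrypt and decrypt steps listed above can be sketched as follows. Everything here is an illustrative assumption: `FakeKMS` is an in-process stand-in for a real KMS API, and because Python's standard library has no AES, `toy_seal` and `toy_open` substitute a SHAKE-256 keystream with an HMAC tag purely so the flow can run. They are not secure replacements for AES-256-GCM; real code would call a vetted library and a real KMS.

```python
import hashlib
import hmac
import secrets


# --- Stand-in cipher: illustration only. Use AES-256-GCM from a vetted library. ---
def toy_seal(key: bytes, plaintext: bytes) -> bytes:
    """Return nonce || ciphertext || tag."""
    nonce = secrets.token_bytes(16)
    stream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def toy_open(key: bytes, blob: bytes) -> bytes:
    """Check the tag, then decrypt. Raises ValueError on tampering or a wrong key."""
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    stream = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


class FakeKMS:
    """Hypothetical KMS: the KEK stays inside this object and is never returned."""

    def __init__(self) -> None:
        self._kek = secrets.token_bytes(32)

    def wrap(self, dek: bytes) -> bytes:
        return toy_seal(self._kek, dek)

    def unwrap(self, wrapped_dek: bytes) -> bytes:
        return toy_open(self._kek, wrapped_dek)


def encrypt_record(kms: FakeKMS, plaintext: bytes) -> dict:
    dek = secrets.token_bytes(32)            # 1. generate random DEK
    ciphertext = toy_seal(dek, plaintext)    # 2. encrypt data with DEK
    encrypted_dek = kms.wrap(dek)            # 3. encrypt DEK with KEK via the KMS
    del dek                                  # 5. drop our reference (Python cannot guarantee a memory wipe)
    return {"encryptedDEK": encrypted_dek, "ciphertext": ciphertext}  # 4. store together


def decrypt_record(kms: FakeKMS, record: dict) -> bytes:
    dek = kms.unwrap(record["encryptedDEK"])  # steps 1-3: KMS returns the plaintext DEK
    return toy_open(dek, record["ciphertext"])  # step 4: decrypt data with the DEK
```

Only the 32-byte DEK crosses the KMS boundary, so the cost of a KMS call does not grow with the size of the data being protected.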

Key Management

Key management is the hardest part of encryption. The data is only as secure as the keys that protect it.

Key Management Services

Cloud KMS options:
  AWS KMS: managed key storage, automatic rotation, audit logging
  Google Cloud KMS: similar capabilities, supports HSM-backed keys
  Azure Key Vault: keys, secrets, and certificates in one service
  HashiCorp Vault: cloud-agnostic, supports many secret types

KMS capabilities:
  - Generate cryptographic keys
  - Encrypt and decrypt data (key never leaves KMS)
  - Automatic key rotation (e.g., yearly)
  - Access policies (which services can use which keys)
  - Audit logs (who used which key and when)

Key Rotation

Key rotation without downtime:

  Phase 1: Create new key version
    KEK v1 (active for decrypt)
    KEK v2 (active for encrypt and decrypt)
    
  Phase 2: Re-wrap DEKs with new key
    Read DEKs encrypted with KEK v1
    Decrypt with KEK v1, re-encrypt with KEK v2
    (can be done gradually, background process;
     the bulk data stays encrypted under its DEK and is not rewritten)
    
  Phase 3: Retire old key
    All DEKs re-encrypted with KEK v2
    KEK v1 marked as disabled (keep for emergency)
    Eventually delete KEK v1 after retention period

  Key metadata stored with ciphertext:
    { keyVersion: "v2", encryptedDEK: "...", ciphertext: "..." }
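
A rough sketch of a background rotation job driven by the `keyVersion` field. In an envelope scheme the KEK directly protects the wrapped DEKs, so this job re-wraps those and leaves the ciphertext alone. `FakeKMS`, `rewrap_stale`, and the XOR-based wrapping are hypothetical placeholders so the example runs with the standard library only; XOR is not real key wrapping, and a real KMS would do this step behind its API.

```python
import secrets


class FakeKMS:
    """In-memory stand-in for a KMS holding versioned KEKs (illustration only)."""

    def __init__(self) -> None:
        self._keks = {}       # version -> KEK bytes
        self.current = None   # version used for new encryptions

    def new_version(self, version: str) -> None:
        # Phase 1: the new version becomes active for encrypt; old ones still decrypt.
        self._keks[version] = secrets.token_bytes(32)
        self.current = version

    def wrap(self, dek: bytes):
        """Return (key version, wrapped DEK). XOR is a placeholder for real wrapping."""
        return self.current, self._xor(dek, self._keks[self.current])

    def unwrap(self, version: str, wrapped: bytes) -> bytes:
        return self._xor(wrapped, self._keks[version])

    @staticmethod
    def _xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))


def rewrap_stale(records: list, kms: FakeKMS) -> int:
    """Phase 2: re-wrap every DEK not yet under the current KEK version.

    The 'ciphertext' field is left untouched. Returns the number of records updated.
    """
    updated = 0
    for rec in records:
        if rec["keyVersion"] != kms.current:
            dek = kms.unwrap(rec["keyVersion"], rec["encryptedDEK"])
            rec["keyVersion"], rec["encryptedDEK"] = kms.wrap(dek)
            updated += 1
    return updated
```

Once `rewrap_stale` returns 0 across all stored records, nothing depends on the old version any more and it can be disabled.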

Key Hierarchy

Key hierarchy for large organizations:

  Root key (HSM-protected, never exported)
    |
    --> Regional keys (one per data center region)
          |
          --> Service keys (one per application/service)
                |
                --> Data encryption keys (one per record or batch)

  Each level encrypts the keys at the level below.
  Compromising a service key affects only that service.
  Compromising a regional key affects all services in that region.
  Root key compromise would be catastrophic (hence HSM protection).

Hashing

Hashing is a one-way function that produces a fixed-size output from arbitrary input. Unlike encryption, hashing cannot be reversed.

Password Hashing

Password hashing (NOT encryption):
  Passwords should be hashed, not encrypted.
  If encrypted, whoever has the key can read all passwords.
  If hashed, the password cannot be recovered by reversing the hash;
  an attacker has to guess candidate passwords one at a time.

Recommended algorithms:
  bcrypt:  adaptive cost factor, built-in salt, widely supported
  scrypt:  memory-hard (resists GPU attacks), configurable parameters
  Argon2:  winner of the Password Hashing Competition (2015), best option for new systems (prefer the Argon2id variant)

NEVER use for passwords:
  MD5:     broken, collisions found, too fast
  SHA-256: too fast (billions of hashes per second on GPU)
  
  "Too fast" is bad for password hashing because attackers can
  brute-force faster.
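
As a small illustration, scrypt is available in Python's standard library through `hashlib.scrypt`. The helper names and the cost parameters below (`n=2**14`, `r=8`, `p=1`) are assumptions for the sketch; real deployments should tune the cost to their own hardware and follow current guidance.

```python
import hashlib
import hmac
import secrets

# Illustrative cost parameters (assumed, not a recommendation): about 16 MiB per hash.
SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, maxmem=64 * 1024 * 1024, dklen=32)


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is generated for every password."""
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)
```

The salt is stored next to the digest; it is not secret, but it makes identical passwords hash to different values and defeats precomputed tables.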

Hashing vs Encryption

| Aspect       | Hashing                      | Encryption                                           |
|--------------|------------------------------|------------------------------------------------------|
| Reversible   | No (one-way)                 | Yes (with key)                                       |
| Output size  | Fixed (256 bits for SHA-256) | Proportional to input                                |
| Key required | No                           | Yes                                                  |
| Use case     | Password storage, integrity  | Data confidentiality                                 |
| Same input   | Always same output           | Different output each time (with different IV/nonce) |

Data Integrity Hashing

Integrity verification:
  File download:
    Server provides: file + SHA-256 hash (published over a trusted channel)
    Client downloads file, computes SHA-256
    If hashes match: file was not corrupted or tampered with
    
  API request signing:
    signature = HMAC-SHA256(secret_key, request_body)
    Server recomputes HMAC and compares (constant-time comparison)
    Verifies request was not modified in transit and came from a key holder
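
Both checks map onto Python's standard `hashlib` and `hmac` modules. This is a minimal sketch; the function names are made up for illustration, and how the signature travels with the request (header name, encoding) is left open.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex digest to compare against the published checksum of a download."""
    return hashlib.sha256(data).hexdigest()


def sign_request(secret_key: bytes, request_body: bytes) -> str:
    """HMAC-SHA256 over the request body, keyed with the shared secret."""
    return hmac.new(secret_key, request_body, hashlib.sha256).hexdigest()


def verify_request(secret_key: bytes, request_body: bytes, signature: str) -> bool:
    """Recompute the HMAC and compare in constant time to avoid timing leaks."""
    expected = sign_request(secret_key, request_body)
    return hmac.compare_digest(expected, signature)
```

A plain hash only detects changes if the expected value comes from a source the attacker cannot alter; the HMAC adds a secret key, so someone who modifies the body cannot produce a matching signature.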

Secrets Management

Secrets include API keys, database passwords, encryption keys, certificates, and tokens. Managing them securely is critical because a leaked secret can compromise the entire system.

Anti-Patterns

Where secrets should NOT live:
  - Hardcoded in source code
  - In git repositories (even private ones)
  - In plain-text environment files on disk (e.g., .env files)
  - In configuration files committed to version control
  - In container images
  - In chat messages or emails
  - In shared documents or wikis

Secrets Management Solutions

HashiCorp Vault:
  - Dynamic secrets: generates temporary database credentials on demand
  - Lease-based: secrets expire automatically
  - Audit logging: every secret access is logged
  - Multiple auth methods: AppRole, Kubernetes, AWS IAM

AWS Secrets Manager:
  - Automatic rotation for RDS, Redshift, DocumentDB
  - Cross-account sharing with IAM policies
  - Versioned secrets with staged rotation

Kubernetes Secrets:
  - Base64 encoded (NOT encrypted by default)
  - Encrypt at rest with KMS provider
  - Mount as files or environment variables in pods
  - Limited access via RBAC
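
To see why base64 encoding is not protection, note that decoding needs no key at all. A short sketch with a made-up secret value:

```python
import base64

# Hypothetical secret value, encoded the way it would appear in a Secret manifest.
encoded = base64.b64encode(b"s3cr3t-db-password").decode("ascii")

# Anyone who can read the manifest can reverse the encoding without any key.
decoded = base64.b64decode(encoded)
```

This is why encryption at rest and tight RBAC matter for Kubernetes Secrets: the encoding only makes binary values safe to store as text.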

Secret Rotation

Automated secret rotation:
  1. Secret manager generates new credential
  2. Updates the dependent service (e.g., database user password)
  3. Applications fetch new credential on next access
  4. Old credential remains valid during transition window
  5. Old credential is revoked after all consumers have updated
  
  Zero-downtime rotation requires:
    - Applications that fetch secrets at runtime (not startup only)
    - A transition period where both old and new credentials work
    - Monitoring to detect applications still using old credentials
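
One way an application might fetch secrets at runtime rather than only at startup is a small cache with a time-to-live. This is a sketch under assumptions: `RotatingSecret` is a hypothetical helper, and `fetch` stands for whatever call retrieves the current credential from your secret manager.

```python
import time
from typing import Callable, Optional


class RotatingSecret:
    """Caches a secret briefly and re-fetches it once the TTL has passed."""

    def __init__(self, fetch: Callable[[], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # hypothetical call into the secret manager
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._value: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._value is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value

    def invalidate(self) -> None:
        """Force a re-fetch, e.g. after the database rejects the cached credential."""
        self._value = None
```

Keeping the TTL shorter than the transition window means every consumer picks up the new credential before the old one is revoked.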

Spotify rotates database credentials automatically through Vault. Applications request short-lived credentials that expire after hours, limiting the blast radius if credentials are leaked.

Common Pitfalls

  • Using encryption without key management. Encrypting data and storing the key next to it (in the same database, same server, same config file) is security theater. Use a separate KMS.
  • Rolling your own cryptography. Custom encryption algorithms and protocols almost always have vulnerabilities. Use well-tested libraries and standard algorithms (AES-256-GCM, RSA-2048+, ECDSA).
  • Encrypting passwords instead of hashing them. If you encrypt passwords, the encryption key becomes a single point of compromise for all user credentials. Hash with bcrypt or Argon2 instead.
  • Forgetting backup encryption. Encrypting the database but storing plaintext backups in S3 exposes all data if the bucket is misconfigured. Encrypt backups with separate keys.
  • Not rotating secrets. Secrets that never change accumulate risk over time. Former employees, compromised machines, and leaked logs all increase exposure. Rotate regularly and automate it.
  • TLS termination without re-encryption. Terminating TLS at the load balancer and sending plaintext to backend services means internal network traffic is unprotected. Use TLS or mTLS for internal communication.

Key Takeaways

  • TLS protects data in transit. Encryption at rest protects stored data. Both are necessary for defense in depth.
  • Envelope encryption (DEK encrypted by KEK) is the standard pattern for encrypting data at scale without performance penalties.
  • Key management is harder than encryption itself. Use a dedicated KMS, implement key rotation, and maintain a clear key hierarchy.
  • Hash passwords with bcrypt or Argon2. Never encrypt them, never use fast hashes like SHA-256, and never store them in plain text.
  • Secrets management requires dedicated tooling (Vault, Secrets Manager), automated rotation, and runtime fetching rather than static configuration files.