TLS & Certificates
TLS (Transport Layer Security) encrypts traffic between clients and servers. Without it, anyone on the network path can read passwords, API keys, and personal data in plain text. HTTPS everywhere is not a recommendation -- it is the baseline. Every production service, every API, every internal tool should use TLS.
How TLS Works
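The TLS handshake establishes an encrypted connection before any application data is exchanged. The simplified flow below follows the TLS 1.2 RSA key exchange; TLS 1.3 replaces the encrypted pre-master secret with an ephemeral Diffie-Hellman key share and needs fewer messages.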
Client                        Server
  |--- ClientHello ------------->|   (supported ciphers, TLS version)
  |<-- ServerHello --------------|   (chosen cipher, server certificate)
  |--- Key Exchange ------------>|   (encrypted pre-master secret)
  |<-- Finished -----------------|   (handshake complete)
  |<== Encrypted Traffic =======>|
The server proves its identity by presenting a certificate signed by a trusted Certificate Authority (CA). The client verifies the certificate chain, and both sides agree on encryption keys. After the handshake, all traffic is encrypted.
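As a rough illustration of the client side of this handshake, the sketch below uses Python's standard ssl module. The function names verifying_client_context and handshake_summary are made up for this example, and the host you pass in (for instance myapp.example.com, the placeholder used throughout this guide) must be reachable from where the code runs.

```python
import socket
import ssl


def verifying_client_context() -> ssl.SSLContext:
    """Client context that checks the certificate chain and the hostname."""
    # create_default_context() loads the system trust store and enables
    # both chain verification and hostname checking.
    return ssl.create_default_context()


def handshake_summary(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Perform a TLS handshake and report what was negotiated."""
    ctx = verifying_client_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        # wrap_socket() runs the handshake. It raises
        # ssl.SSLCertVerificationError if the chain or hostname check fails.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return {
                "protocol": tls.version(),   # negotiated TLS version string
                "cipher": tls.cipher()[0],   # negotiated cipher suite name
                "subject": tls.getpeercert().get("subject"),
            }
```

Calling handshake_summary("myapp.example.com") would return the negotiated protocol and cipher only after the certificate chain has been verified, which mirrors the order of events described above.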
TLS Versions
- TLS 1.2: Still widely used and secure. Supported everywhere.
- TLS 1.3: Faster handshake (one fewer round trip), drops insecure cipher suites, and makes forward secrecy mandatory. Use this as the default.
- TLS 1.0, 1.1: Deprecated. Do not use. Disable them on your servers.
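The same version policy can be enforced from client code. The following is a minimal sketch with Python's ssl module; modern_client_context is a name chosen for this example only.

```python
import ssl


def modern_client_context() -> ssl.SSLContext:
    """Client context that refuses TLS 1.0 and 1.1."""
    ctx = ssl.create_default_context()
    # Anything older than TLS 1.2 is rejected during the handshake.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Leave the maximum open so TLS 1.3 is used when the peer supports it.
    ctx.maximum_version = ssl.TLSVersion.MAXIMUM_SUPPORTED
    return ctx
```

A connection attempt against a server that only speaks TLS 1.0 or 1.1 fails with an ssl.SSLError instead of silently falling back.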
Certificate Chains
A certificate chain establishes trust from your server certificate to a root CA that browsers already trust.
Root CA (pre-installed in browsers/OS)
  -> Intermediate CA (signed by root)
    -> Your Server Certificate (signed by intermediate)
Your server must send both its own certificate and the intermediate certificate. The root certificate is already in the client's trust store.
# Check certificate chain
openssl s_client -connect myapp.example.com:443 -showcerts
If the intermediate certificate is missing, some clients (especially mobile devices and older systems) will reject the connection. This is one of the most common TLS misconfiguration issues.
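A cheap pre-deployment check is to count the certificates in the PEM bundle you are about to configure. The sketch below is an assumption-laden sanity check, not a chain validator: it only counts PEM blocks and does not verify signatures or ordering. The function names are hypothetical.

```python
import re

# Matches one PEM-encoded certificate block, header through footer.
_PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
)


def count_certificates(pem_text: str) -> int:
    """Number of certificate blocks found in a PEM bundle."""
    return len(_PEM_CERT.findall(pem_text))


def has_intermediate(pem_text: str) -> bool:
    """True when the bundle holds the leaf plus at least one more certificate."""
    return count_certificates(pem_text) >= 2
```

Reading a bundle with open(path).read() and passing it to has_intermediate gives a quick signal: a result of False means the file contains only the server certificate, which is the misconfiguration described above.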
Let's Encrypt & Automatic Renewal
Let's Encrypt is a free, automated CA. It issues domain-validated certificates and provides tooling for automatic renewal. There is no reason to pay for basic TLS certificates.
Certbot
The standard client for Let's Encrypt:
# Obtain a certificate
certbot certonly --webroot -w /var/www/html -d myapp.example.com
# Certificate files are stored at:
# /etc/letsencrypt/live/myapp.example.com/fullchain.pem
# /etc/letsencrypt/live/myapp.example.com/privkey.pem
Let's Encrypt certificates expire after 90 days. Certbot sets up automatic renewal via a cron job or systemd timer:
# Test renewal
certbot renew --dry-run
# Automatic renewal (added by certbot)
# /etc/cron.d/certbot
0 */12 * * * root certbot renew --quiet --post-hook "systemctl reload nginx"
DNS Challenge
For wildcard certificates or when your server is not publicly accessible, use DNS validation:
certbot certonly --dns-cloudflare \
  --dns-cloudflare-credentials /etc/letsencrypt/cloudflare.ini \
  -d "*.example.com" \
  -d "example.com"
This adds a TXT record to your DNS to prove domain ownership. Works with most DNS providers through plugins.
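To make the TXT record less of a black box, here is a small sketch of how the record name and value are derived under the ACME DNS-01 challenge (RFC 8555). The function names are invented for this example, and the key authorization string is normally produced by the ACME client from the challenge token and the account key thumbprint; certbot and its DNS plugins do all of this for you.

```python
import base64
import hashlib


def challenge_record_name(domain: str) -> str:
    """DNS name where the validation TXT record is published."""
    # A wildcard is validated at its base domain, so the "*." label is dropped.
    base = domain[2:] if domain.startswith("*.") else domain
    return f"_acme-challenge.{base}"


def challenge_record_value(key_authorization: str) -> str:
    """TXT value: unpadded base64url of SHA-256(key authorization)."""
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

Both names in the certbot command above ("*.example.com" and "example.com") map to the same record name, _acme-challenge.example.com, which is why the DNS provider ends up holding two TXT values at that name during issuance.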
TLS Termination at the Load Balancer
The standard pattern: terminate TLS at the load balancer or reverse proxy, then forward unencrypted traffic to backend servers on the internal network.
Internet --[TLS]--> Load Balancer --[plain HTTP]--> Backend Servers
                    (has the cert)                  (internal network)
Nginx TLS Termination
server {
    listen 443 ssl http2;
    server_name myapp.example.com;
    ssl_certificate /etc/letsencrypt/live/myapp.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/myapp.example.com/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    ssl_prefer_server_ciphers on;
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 10m;
    # HSTS - tell browsers to always use HTTPS
    add_header Strict-Transport-Security "max-age=63072000; includeSubDomains" always;
    location / {
        proxy_pass http://backend;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
# Redirect HTTP to HTTPS
server {
    listen 80;
    server_name myapp.example.com;
    return 301 https://$host$request_uri;
}
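Because the backend now receives plain HTTP, the X-Forwarded-Proto header set in the proxy configuration above is its only signal that the original request arrived over TLS. The sketch below shows one way a backend might read it; original_scheme and is_secure_request are hypothetical helper names, and the approach is only safe when the backend is reachable exclusively through the proxy, since any direct client could forge the header.

```python
from typing import Mapping


def original_scheme(headers: Mapping[str, str], default: str = "http") -> str:
    """Scheme the client used, as reported by the reverse proxy."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("x-forwarded-proto", default).strip().lower()
    # Fall back to the default for anything other than http or https.
    return value if value in ("http", "https") else default


def is_secure_request(headers: Mapping[str, str]) -> bool:
    """True when the proxy reports that the client connected over HTTPS."""
    return original_scheme(headers) == "https"
```

A backend can use is_secure_request to decide whether to mark cookies as Secure or to build absolute URLs with the https scheme.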
Advantages of Termination at the Load Balancer
- Centralized certificate management. Update certificates in one place.
- Offload TLS processing. Backends do not spend CPU on encryption/decryption.
- Simplified backend configuration. Backends serve plain HTTP.
- Easier debugging. You can inspect unencrypted traffic between the LB and backends.
mTLS (Mutual TLS)
Standard TLS is one-way: the client verifies the server's identity. Mutual TLS adds the other direction: the server also verifies the client's identity. Both sides present certificates.
Client                        Server
  |--- ClientHello ------------->|
  |<-- ServerHello + CertReq ----|   (server requests client cert)
  |--- Client Certificate ------>|   (client proves identity)
  |<-- Verified + Finished ------|
  |<== Encrypted Traffic =======>|
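On the server side, the difference from ordinary TLS comes down to requiring and verifying a client certificate. The following is a minimal sketch with Python's ssl module. The function name and all three file path parameters are assumptions for this example; they are optional here only so the policy can be inspected without certificate files on disk.

```python
import ssl
from typing import Optional


def mtls_server_context(
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
    client_ca_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Server context that rejects clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the server send a certificate request and abort
    # the handshake if the client certificate is missing or untrusted.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file is not None:
        # The server's own identity (certificate chain and private key).
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if client_ca_file is not None:
        # The CA that signs the client certificates this server accepts.
        ctx.load_verify_locations(cafile=client_ca_file)
    return ctx
```

A client talking to such a server has to call load_cert_chain on its own context with a certificate issued by that CA; otherwise the handshake fails before any application data is exchanged.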
When to Use mTLS
- Service-to-service communication. In a microservice architecture, mTLS ensures that only authorized services can communicate. Service meshes like Istio and Linkerd implement mTLS automatically.
- Zero-trust networks. When you cannot trust the network, mTLS provides identity verification at the transport layer.
- API clients. When you need stronger authentication than API keys for third-party integrations.
mTLS with a Service Mesh
Istio injects a sidecar proxy that handles mTLS transparently:
# Istio PeerAuthentication - require mTLS for all services in namespace
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: production
spec:
  mtls:
    mode: STRICT
With this configuration, every service in the production namespace communicates over mTLS. The application code does not need to change -- the sidecar proxy handles certificate management, rotation, and verification.
Certificate Management in Kubernetes
cert-manager
cert-manager is the standard for automated certificate management in Kubernetes. It obtains, renews, and stores certificates as Kubernetes secrets.
# ClusterIssuer for Let's Encrypt (assumes cert-manager is already installed)
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
      - http01:
          ingress:
            class: nginx
---
# Request a certificate
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: myapp-tls
  namespace: production
spec:
  secretName: myapp-tls-secret
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - myapp.example.com
    - api.example.com
cert-manager creates the certificate, stores it in the myapp-tls-secret secret, and automatically renews it before expiration. Reference that secret in the tls section of your Ingress resource. Alternatively, annotate the Ingress with cert-manager.io/cluster-issuer and cert-manager creates the Certificate resource for you.
Caddy Auto-TLS
Caddy obtains and renews certificates automatically with zero configuration. If you use Caddy as your ingress or reverse proxy, TLS is handled out of the box:
myapp.example.com {
reverse_proxy myapp-service:8080
}
That is the entire configuration. Caddy obtains a Let's Encrypt certificate for myapp.example.com and renews it automatically.
What Happens When Your Cert Expires at 2 AM
Certificate expiration is one of the most common causes of production outages. When a certificate expires:
- Browsers show a full-page security warning. Users cannot proceed (and should not).
- API clients reject the connection. Integrations break.
- Service-to-service mTLS connections fail. Internal communication stops.
Outages like this usually trace back to one of four causes: manual renewal processes, automation that fails silently (DNS permissions changed, firewall blocked validation), certificates pinned in clients that were never updated, or a renewal that was never followed by a load balancer reload.
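The last cause, a renewed certificate on disk that the proxy never picked up, can be detected by comparing fingerprints. The sketch below is one possible approach using only the Python standard library. All function names are hypothetical, and it assumes the first certificate in the bundle on disk is the leaf, which is how certbot writes fullchain.pem.

```python
import hashlib
import re
import socket
import ssl

_PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
)


def der_fingerprint(der: bytes) -> str:
    """SHA-256 fingerprint (hex) of a DER-encoded certificate."""
    return hashlib.sha256(der).hexdigest()


def leaf_fingerprint_from_pem(pem_text: str) -> str:
    """Fingerprint of the first certificate in a PEM bundle."""
    match = _PEM_CERT.search(pem_text)
    if match is None:
        raise ValueError("no certificate found in PEM text")
    return der_fingerprint(ssl.PEM_cert_to_DER_cert(match.group(0)))


def is_serving_current_cert(served_der: bytes, disk_pem: str) -> bool:
    """True when the served leaf matches the leaf stored on disk."""
    return der_fingerprint(served_der) == leaf_fingerprint_from_pem(disk_pem)


def fetch_served_der(host: str, port: int = 443, timeout: float = 5.0) -> bytes:
    """Leaf certificate the server currently presents, DER-encoded."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)
```

Run on the proxy host, is_serving_current_cert(fetch_served_der("myapp.example.com"), open(bundle_path).read()) returning False suggests the process is still holding the old certificate and needs a reload. If the served certificate has already expired, fetch_served_der raises a verification error, which is a signal in its own right.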
Prevention
Automate renewal. Use cert-manager, Caddy auto-TLS, or certbot with cron. Never rely on a human to renew certificates.
Monitor expiration. Alert when a certificate will expire in 14 days. Alert again at 7 days. Page at 3 days.
# Check certificate expiration
echo | openssl s_client -connect myapp.example.com:443 2>/dev/null | \
  openssl x509 -noout -dates
# Prometheus blackbox exporter - monitor TLS expiry
modules:
  tls_check:
    prober: tcp
    tcp:
      tls: true
# Alert rule
- alert: CertificateExpiringSoon
  expr: probe_ssl_earliest_cert_expiry - time() < 7 * 24 * 3600
  for: 1h
  labels:
    severity: warning
  annotations:
    summary: "Certificate for {{ $labels.instance }} expires in less than 7 days"
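The 14, 7, and 3 day thresholds can also be expressed as a small script for environments without Prometheus. This is a sketch under stated assumptions: the function names and the level labels ("alert-14d", "alert-7d", "page") are invented for this example, and the input is the notAfter timestamp in the format openssl prints, with the "notAfter=" prefix removed.

```python
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter timestamp.

    not_after looks like "Jan  1 00:00:00 2030 GMT".
    now is a Unix timestamp; it defaults to the current time.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def alert_level(days_left: float) -> str:
    """Map remaining days to the escalation tiers described above."""
    if days_left <= 3:
        return "page"
    if days_left <= 7:
        return "alert-7d"
    if days_left <= 14:
        return "alert-14d"
    return "ok"
```

Feeding the result of days_until_expiry into alert_level from a scheduled job gives the same escalation ladder as the text: a first alert at 14 days, a second at 7, and a page at 3.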
Test renewal. Run certbot renew --dry-run regularly to verify the renewal process works. Treat a renewal process that has never been tested as broken.
Keep a runbook. When the alert fires at 2 AM, the on-call engineer needs to know exactly how to force-renew and reload:
# Emergency certificate renewal
certbot renew --force-renewal --cert-name myapp.example.com
systemctl reload nginx
# For cert-manager in Kubernetes
kubectl delete secret myapp-tls-secret -n production
# cert-manager will re-issue automatically
Common Pitfalls
- Missing intermediate certificate. The chain is incomplete and some clients reject the connection. Always serve the full chain.
- Not redirecting HTTP to HTTPS. If HTTP still works, users and search engines will use it. Redirect all HTTP traffic to HTTPS.
- Weak TLS configuration. Disable TLS 1.0 and 1.1. Use only strong cipher suites. Test with ssllabs.com/ssltest.
- Not monitoring certificate expiration. Certificates expire silently. Monitor them and alert well before the expiry date.
- Manual certificate processes. If renewing a certificate requires a human to remember and act, it will eventually be forgotten. Automate it.
- Ignoring HSTS. HTTP Strict Transport Security tells browsers to always use HTTPS. Without it, an attacker on the network path can intercept a user's initial plain-HTTP request before the redirect and keep the session on HTTP.
- Certificate pinning without a rotation plan. Pinning certificates in mobile apps or clients means you must update those clients before the pinned certificate expires. This has caused major outages.
Key Takeaways
- HTTPS everywhere. Use TLS for all external and ideally all internal communication.
- Let's Encrypt provides free certificates. Automate renewal with certbot, cert-manager, or Caddy.
- Terminate TLS at the load balancer for simpler management. Use mTLS for service-to-service authentication.
- cert-manager is the standard for certificate management in Kubernetes. It handles issuance, storage, and renewal automatically.
- Monitor certificate expiration and alert at 14, 7, and 3 days. An expired certificate is a preventable outage.
- Test your renewal process. If you have never tested it, assume it is broken.