
DNS Fundamentals

DNS (Domain Name System) translates human-readable names into IP addresses. It is one of the oldest and most critical parts of internet infrastructure. Almost every web request, API call, and email delivery starts with a DNS lookup. Understanding DNS is not optional for anyone who operates production systems.

How DNS Resolution Works

When a browser requests myapp.example.com, the resolution follows a chain:

Browser Cache -> OS Cache -> Recursive Resolver -> Root Server
                                                -> TLD Server (.com)
                                                -> Authoritative Server (example.com)
                                                -> Answer: 93.184.216.34

  1. The browser checks its own cache.
  2. The OS resolver checks its local cache.
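  3. If neither cache has the answer, the query goes to a recursive resolver (typically your ISP's, or a public one such as 8.8.8.8), which checks its own cache and then queries the DNS hierarchy on your behalf.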
  4. The root server directs to the .com TLD server.
  5. The TLD server directs to the authoritative nameserver for example.com.
  6. The authoritative server returns the IP address.
  7. The result is cached at every level according to the TTL.
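
The steps above can be sketched as a small simulation. This is a toy model, not a real resolver: the three lookup tables and the server names in them (tld-server-com, ns-example-com) are hypothetical stand-ins for the root, TLD, and authoritative servers, and no network traffic is involved.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-ins for the DNS hierarchy (assumptions for illustration).
ROOT_REFERRALS = {"com.": "tld-server-com"}               # root -> TLD server
TLD_REFERRALS = {"example.com.": "ns-example-com"}        # TLD -> authoritative server
AUTHORITATIVE = {"myapp.example.com.": "93.184.216.34"}   # authoritative answers


def resolve(name: str, cache: Dict[str, str]) -> Tuple[str, List[str]]:
    """Return (address, steps taken) for a fully qualified name.

    Unknown names raise KeyError in this sketch.
    """
    steps: List[str] = []
    if name in cache:
        steps.append("cache hit")
        return cache[name], steps

    labels = name.rstrip(".").split(".")
    tld = labels[-1] + "."                 # e.g. "com."
    zone = ".".join(labels[-2:]) + "."     # e.g. "example.com."

    steps.append("root referred to " + ROOT_REFERRALS[tld])
    steps.append("TLD referred to " + TLD_REFERRALS[zone])
    address = AUTHORITATIVE[name]
    steps.append("authoritative answered " + address)

    cache[name] = address                  # later lookups skip the hierarchy
    return address, steps
```

Calling resolve twice with the same cache dictionary shows the difference: the first call walks all three levels, the second returns straight from the cache.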

Record Types

A Record

Maps a hostname to an IPv4 address. The most common record type.

myapp.example.com.    300    IN    A    93.184.216.34
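
To make the column layout concrete, here is a minimal sketch that splits a record line into its fields. It assumes the fully explicit form used in this article (name, TTL, class, type, data); real zone files allow fields to be omitted, which this sketch does not handle. The Record type and parse_record function are illustrative names, not part of any library.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str
    ttl: int
    rclass: str
    rtype: str
    rdata: str


def parse_record(line: str) -> Record:
    """Parse one record line written as: name TTL class type data."""
    # Split into at most five fields so the data part keeps any internal spaces.
    name, ttl, rclass, rtype, rdata = line.strip().split(None, 4)
    return Record(name, int(ttl), rclass, rtype, rdata)
```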

Multiple A records for the same name enable round-robin DNS load balancing:

myapp.example.com.    300    IN    A    93.184.216.34
myapp.example.com.    300    IN    A    93.184.216.35
myapp.example.com.    300    IN    A    93.184.216.36
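
A rough sketch of the rotation idea, assuming a server that shifts the answer list by one position per query and clients that try the first address. Actual ordering behavior varies between nameservers and resolvers, so treat this as an illustration of the concept only.

```python
from collections import deque
from typing import List


def rotated_answers(addresses: List[str], queries: int) -> List[List[str]]:
    """Return the answer list served for each of `queries` consecutive queries."""
    ring = deque(addresses)
    served = []
    for _ in range(queries):
        served.append(list(ring))
        ring.rotate(-1)   # move the first address to the end for the next query
    return served
```

Because each client may cache its answer for the TTL, the resulting distribution is approximate rather than an even split.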

AAAA Record

Maps a hostname to an IPv6 address. Identical purpose to A records but for the IPv6 address space.

myapp.example.com.    300    IN    AAAA    2606:2800:220:1:248:1893:25c8:1946

CNAME Record

An alias that points one hostname to another. The DNS resolver follows the chain to get the final IP.

www.example.com.      300    IN    CNAME    myapp.example.com.
myapp.example.com.    300    IN    A        93.184.216.34

A request for www.example.com resolves to myapp.example.com, which resolves to 93.184.216.34.
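
Chain-following can be sketched with an in-memory table. The RECORDS dictionary mirrors the two records above; the max_hops limit of 8 is an arbitrary assumption added to stop loops, not a value taken from any specification.

```python
from typing import Dict, Tuple

RECORDS: Dict[str, Tuple[str, str]] = {
    "www.example.com.": ("CNAME", "myapp.example.com."),
    "myapp.example.com.": ("A", "93.184.216.34"),
}


def resolve_a(name: str, records: Dict[str, Tuple[str, str]], max_hops: int = 8) -> str:
    """Follow CNAME records until an A record is found."""
    for _ in range(max_hops):
        rtype, value = records[name]   # KeyError if the name does not exist
        if rtype == "A":
            return value
        name = value                   # CNAME: continue with the target name
    raise RuntimeError("CNAME chain too long or looping")
```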

Restrictions:

  • A CNAME cannot coexist with other record types for the same name.
  • A CNAME cannot be set on the zone apex (example.com itself). Use an ALIAS or ANAME record if your provider supports it, or use A records directly.

MX Record

Mail exchange records. Specifies which mail servers accept email for a domain.

example.com.    300    IN    MX    10    mail1.example.com.
example.com.    300    IN    MX    20    mail2.example.com.

The number is priority -- lower values are tried first. If mail1 is down, email goes to mail2.
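
The priority rule can be sketched as a sort followed by a reachability check. The is_up callback is a hypothetical stand-in for an actual connection attempt, and real mail servers typically also randomize among hosts that share the same priority, which this sketch skips.

```python
from typing import Callable, List, Optional, Tuple


def delivery_order(mx_records: List[Tuple[int, str]]) -> List[str]:
    """Order mail hosts by priority value, lowest first."""
    return [host for _, host in sorted(mx_records)]


def pick_mail_server(
    mx_records: List[Tuple[int, str]], is_up: Callable[[str], bool]
) -> Optional[str]:
    """Return the first reachable host in priority order, or None."""
    for host in delivery_order(mx_records):
        if is_up(host):
            return host
    return None
```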

TXT Record

Holds arbitrary text. Used for domain verification, email authentication, and other metadata.

example.com.    300    IN    TXT    "v=spf1 include:_spf.google.com ~all"
example.com.    300    IN    TXT    "google-site-verification=abc123"

Common uses:

  • SPF: Specifies which servers can send email for your domain.
  • DKIM: Publishes the public key for email signing.
  • DMARC: Policy for handling failed SPF/DKIM checks.
  • Domain verification: Proves domain ownership to services like Google, AWS, and certificate authorities.

NS Record

Delegates a domain or subdomain to specific nameservers.

example.com.    86400    IN    NS    ns1.dnsprovider.com.
example.com.    86400    IN    NS    ns2.dnsprovider.com.

SRV Record

Specifies the location (hostname and port) of a service. Used by some protocols and service discovery systems. The four values after SRV are, in order, priority, weight, port, and target host.

_sip._tcp.example.com.    300    IN    SRV    10 60 5060 sipserver.example.com.
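
As an illustration, the data part of the record above can be unpacked like this. The Srv type and parse_srv function are hypothetical helper names used only for this sketch.

```python
from typing import NamedTuple


class Srv(NamedTuple):
    priority: int   # lower values are preferred
    weight: int     # relative share among records with equal priority
    port: int
    target: str


def parse_srv(rdata: str) -> Srv:
    """Parse SRV data written as: priority weight port target."""
    priority, weight, port, target = rdata.split()
    return Srv(int(priority), int(weight), int(port), target)
```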

TTL (Time to Live)

TTL is the number of seconds a DNS record should be cached. When a resolver fetches a record, it stores it for the TTL duration and does not query again until the TTL expires.

myapp.example.com.    300    IN    A    93.184.216.34
                      ^^^
                      TTL: 300 seconds (5 minutes)
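
A minimal sketch of TTL-based caching, assuming a single in-process cache. The TtlCache class is an illustrative name; the clock is injectable so the expiry behavior can be exercised without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Stores answers until their TTL runs out."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # name -> (value, expires_at)

    def put(self, name: str, value: str, ttl: int) -> None:
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name: str) -> Optional[str]:
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]   # expired: the caller must query again
            return None
        return value
```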

TTL Strategy

  • Normal operations: 300 to 3600 seconds (5 minutes to 1 hour). Reduces query load on your nameservers.
  • Before a migration: Lower the TTL to 60 seconds, wait for the old TTL to expire, then make the change. This ensures clients pick up the new record quickly.
  • During an incident: If you need to redirect traffic quickly, a low TTL means most clients pick up the change within about one TTL period (roughly a minute at 60 seconds).

# Before migration: lower TTL (do this 24-48 hours before the change)
myapp.example.com.    60    IN    A    93.184.216.34

# Make the change
myapp.example.com.    60    IN    A    10.0.1.50

# After migration stabilizes: raise TTL back
myapp.example.com.    3600    IN    A    10.0.1.50

Why Lowering TTL Takes Time

If your current TTL is 3600 seconds, resolvers cached the old record for up to an hour. Lowering the TTL to 60 does not take effect until those caches expire. That is why you lower the TTL in advance -- wait for the old TTL to elapse, then make the actual change.
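
The arithmetic behind this can be sketched as follows, assuming every resolver honors the TTL it cached. The function name and the waited parameter are illustrative choices for this sketch.

```python
def worst_case_stale_seconds(old_ttl: int, new_ttl: int, waited: int) -> int:
    """Longest time a resolver may keep serving the old address after the change.

    waited: seconds between lowering the TTL and changing the record.
    """
    # A resolver that cached the record just before the TTL was lowered
    # still holds it for whatever remains of the old TTL.
    remaining_old = max(0, old_ttl - waited)
    # A resolver that cached it just before the change holds it for the new TTL.
    return max(new_ttl, remaining_old)
```

With an old TTL of 3600 and a new TTL of 60, changing the record immediately leaves a worst case of 3600 seconds, waiting half an hour leaves 1800, and waiting the full hour brings it down to 60.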

DNS Propagation

"DNS propagation" is the time it takes for a record change to be visible everywhere. It is not a single event -- different resolvers update at different times based on their cache state.

Factors that affect propagation:

  • TTL of the old record: Resolvers honor the TTL they cached.
  • Resolver behavior: Some resolvers (ISPs, corporate proxies) ignore TTL and cache longer.
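  • Negative caching: If a name was queried before its record existed, resolvers cache the "no such record" answer as well, for a duration taken from the zone's SOA record (the lower of the SOA record's own TTL and its minimum field).

# Check what different resolvers see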
dig myapp.example.com @8.8.8.8       # Google DNS
dig myapp.example.com @1.1.1.1       # Cloudflare DNS
dig myapp.example.com @208.67.222.222 # OpenDNS

DNS Providers

Cloudflare

Free tier with global anycast network. Fast propagation, built-in DDoS protection, and a proxy mode that hides your origin IP. The default choice for many projects.

Route 53 (AWS)

Deeply integrated with AWS services. Supports health checks, weighted routing, latency-based routing, and failover. The natural choice if you run on AWS.

Google Cloud DNS

Managed DNS on Google's infrastructure, backed by a 100% availability SLA. Simple and reliable, but with fewer routing features than Route 53.

DNS as a Traffic Routing Tool

DNS is not just for name resolution. It is a traffic management layer.

GeoDNS

Return different IP addresses based on the client's geographic location. Route European users to European servers, Asian users to Asian servers.

# Route 53 geolocation routing
myapp.example.com -> US users -> 10.0.1.50 (us-east)
myapp.example.com -> EU users -> 10.1.1.50 (eu-west)
myapp.example.com -> Default  -> 10.0.1.50 (us-east)

Weighted Routing

Distribute traffic across multiple endpoints by percentage. Useful for canary deployments or gradual migrations.

# Route 53 weighted routing
myapp.example.com -> Weight 90 -> 10.0.1.50 (current)
myapp.example.com -> Weight 10 -> 10.0.2.50 (canary)
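
The selection logic can be sketched as a weighted random pick. This is an illustration of the idea, not Route 53's implementation; it assumes each endpoint is chosen with probability equal to its weight divided by the sum of all weights.

```python
import random
from typing import List, Tuple


def pick_weighted(endpoints: List[Tuple[str, int]], rng=random) -> str:
    """Pick one address; endpoints is a list of (address, weight) pairs."""
    addresses = [address for address, _ in endpoints]
    weights = [weight for _, weight in endpoints]
    return rng.choices(addresses, weights=weights, k=1)[0]
```

Because resolvers cache each answer for the TTL, the traffic split seen at the servers only approximates the configured weights.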

Failover DNS

Return a healthy endpoint. If the primary fails a health check, DNS automatically returns the secondary.

# Route 53 failover
myapp.example.com -> Primary (health check: healthy)  -> 10.0.1.50
myapp.example.com -> Secondary (standby)               -> 10.0.2.50

When the health check for the primary fails, Route 53 automatically returns the secondary IP.
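
A simplified sketch of the decision, assuming the primary is marked unhealthy after a number of consecutive failed checks and recovers on the first passing check. The FailoverRecord class and the threshold of 3 are hypothetical values chosen for illustration, not Route 53 defaults.

```python
class FailoverRecord:
    """Answers with the primary until it fails enough consecutive checks."""

    def __init__(self, primary: str, secondary: str, failure_threshold: int = 3) -> None:
        self.primary = primary
        self.secondary = secondary
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def report(self, check_passed: bool) -> None:
        """Record the result of one health check against the primary."""
        if check_passed:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1

    def answer(self) -> str:
        if self.consecutive_failures >= self.failure_threshold:
            return self.secondary
        return self.primary
```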

Latency-Based Routing

Return the endpoint with the lowest latency to the client, measured by AWS's network data.

# Route 53 latency routing
myapp.example.com -> us-east-1 -> 10.0.1.50
myapp.example.com -> eu-west-1 -> 10.1.1.50
myapp.example.com -> ap-southeast-1 -> 10.2.1.50

Common DNS Mistakes

Not Lowering TTL Before Changes

Changing an A record with a 1-hour TTL means some clients will use the old IP for up to an hour after the change. Lower the TTL in advance.

CNAME at the Zone Apex

You cannot set a CNAME on example.com (the bare domain). This is a limitation of the DNS specification: the apex must carry SOA and NS records, and a CNAME cannot share a name with other records. Use an ALIAS/ANAME record (provider-specific) or an A record.

Forgetting the Trailing Dot

In DNS zone files, example.com. (with a trailing dot) is a fully qualified domain name. Without the dot, the zone name is appended, which can produce example.com.example.com.
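
The expansion rule can be sketched in a few lines. The qualify function is an illustrative helper; origin stands for the zone name, written with its trailing dot.

```python
def qualify(name: str, origin: str) -> str:
    """Expand a zone-file name the way the origin rule does."""
    if name == "@":            # "@" is zone-file shorthand for the origin itself
        return origin
    if name.endswith("."):     # already fully qualified: used as-is
        return name
    return name + "." + origin  # relative name: the origin is appended
```

Passing "example.com" without the dot and an origin of "example.com." yields "example.com.example.com.", which is exactly the mistake described above.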

Misconfigured Email Records

Missing or incorrect SPF, DKIM, or DMARC records cause email to land in spam or be rejected entirely. Validate your email DNS records:

dig TXT example.com | grep spf
dig TXT _dmarc.example.com

Not Monitoring DNS

DNS failures are invisible until users cannot reach your service. Monitor DNS resolution from multiple locations.

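# Simple DNS monitoring check
# dig exits 0 even when the answer is empty (e.g. NXDOMAIN), so test the output instead
[ -n "$(dig +short myapp.example.com @8.8.8.8)" ] || echo "DNS resolution failed"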

Common Pitfalls

  • High TTL during migrations. Always lower TTL 24-48 hours before making DNS changes.
  • Relying on DNS for instant failover. DNS propagation is not instant. Even with a 60-second TTL, some clients take longer.
  • Too many CNAME chains. Each CNAME adds a lookup. Resolving a -> b -> c -> d can take up to four queries when nothing is cached. Keep chains short.
  • Not using multiple nameservers. If your single nameserver goes down, your entire domain is unreachable. Use at least two, preferably from different providers.
  • Ignoring DNSSEC. DNSSEC signs your records so that validating resolvers can detect spoofed or tampered responses. Enable it if your provider supports it, especially for sensitive domains.
  • Wildcard records catching mistakes. A *.example.com record means every nonexistent subdomain resolves instead of returning NXDOMAIN. This can mask typos and misconfigurations.
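
The wildcard pitfall in the last item can be illustrated with a deliberately simplified lookup. It only tries replacing the leftmost label with "*"; the real wildcard matching rules are more involved, so this is a sketch of the effect rather than of the algorithm. The record values are made up for the example.

```python
from typing import Dict, Optional


def lookup(name: str, records: Dict[str, str]) -> Optional[str]:
    """Return the address for name, or None to represent NXDOMAIN."""
    if name in records:
        return records[name]
    # Simplification: swap the leftmost label for "*" and try once more.
    _, _, parent = name.partition(".")
    return records.get("*." + parent)
```

With a "*.example.com." entry present, a typo such as "wwww.example.com." quietly resolves to the wildcard address; without it, the same lookup returns None and the mistake is visible.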

Key Takeaways

  • DNS maps names to IP addresses. Understand A, AAAA, CNAME, MX, and TXT records -- you will configure all of them.
  • TTL controls caching duration. Lower it before changes, raise it after.
  • DNS propagation is not instant. Plan for it during migrations and failovers.
  • Use DNS for traffic routing: GeoDNS for geographic distribution, weighted routing for canary deploys, failover for high availability.
  • Validate your DNS configuration regularly. Misconfigured records cause subtle, hard-to-debug problems.
  • Monitor DNS resolution from external locations. A DNS outage affects everything downstream.