DNS: The Internet's Phone Book

Why This Matters

It is Monday morning. Users report that your company website is unreachable. You try pinging it by IP address -- works fine. You try the domain name -- nothing. Your colleague in another office says it works for them. A third person reports it is intermittent.

Welcome to DNS hell.

DNS (Domain Name System) is the infrastructure that translates human-readable domain names like example.com into IP addresses like 93.184.216.34. Almost every web request, email delivery, API call, and service discovery lookup starts with a DNS query. When DNS breaks, the internet breaks -- or at least your view of it.

DNS problems are some of the most frustrating to diagnose because the cause is usually hidden. Errors rarely say "DNS failed." You just get "host not found" or "connection timed out," and you have to work out that the real issue is name resolution. Understanding how DNS works -- the hierarchy, the caching, the record types, the resolution process -- turns these mysterious failures into straightforward troubleshooting.


Try This Right Now

# Resolve a domain name to an IP address
dig example.com

# Simpler output -- just the answer
dig +short example.com

# Query a specific DNS server (Google's public DNS)
dig @8.8.8.8 example.com

# See the full resolution chain
dig +trace example.com

# Check your current DNS configuration
cat /etc/resolv.conf

# Check the local hosts file
cat /etc/hosts

How DNS Resolution Works

When you type www.example.com in your browser, here is what happens behind the scenes:

+------------------------------------------------------------------+
|                  DNS RESOLUTION PROCESS                           |
+------------------------------------------------------------------+
|                                                                   |
|  1. Browser/App                                                   |
|     "What is the IP for www.example.com?"                        |
|         |                                                         |
|         v                                                         |
|  2. Local Cache (OS resolver)                                    |
|     "Have I looked this up recently?"                            |
|     If YES --> return cached answer                              |
|     If NO  --> ask recursive resolver                            |
|         |                                                         |
|         v                                                         |
|  3. Recursive Resolver (ISP or 8.8.8.8)                          |
|     "I'll find the answer for you"                               |
|     Checks its own cache first.                                  |
|     If not cached, starts the iterative process:                 |
|         |                                                         |
|         v                                                         |
|  4. Root Name Server (.)                                         |
|     "I don't know www.example.com, but .com is handled           |
|      by these TLD servers: a.gtld-servers.net, ..."              |
|         |                                                         |
|         v                                                         |
|  5. TLD Name Server (.com)                                       |
|     "I don't know www.example.com, but example.com is            |
|      handled by these nameservers: ns1.example.com, ..."         |
|         |                                                         |
|         v                                                         |
|  6. Authoritative Name Server (example.com)                      |
|     "www.example.com is 93.184.216.34"                           |
|         |                                                         |
|         v                                                         |
|  Answer flows back through the chain, getting cached at          |
|  each level along the way.                                       |
|                                                                   |
+------------------------------------------------------------------+

Recursive vs Iterative Queries

Recursive query: The client asks a resolver, and the resolver does ALL the work to find the answer. Your computer makes recursive queries to your DNS resolver. "Find me the answer and come back when you have it."

Iterative query: The server responds with the best answer it has, which might be a referral to another server. The recursive resolver makes iterative queries to root, TLD, and authoritative servers. "I don't know, but ask that server over there."

The DNS Hierarchy

DNS is organized as an inverted tree:

                          . (root)
              /       /      |      \       \
           .com    .org    .net    .uk     .io   ...
          /    \
  example.com   google.com
     /      \
www.example.com  mail.example.com

Root servers -- 13 groups of root name servers (named a.root-servers.net through m.root-servers.net), replicated globally via anycast. They know where to find each TLD.

TLD (Top-Level Domain) servers -- Manage domains like .com, .org, .net, .uk. They know which authoritative servers handle each domain within their TLD.

Authoritative servers -- The definitive source for a specific domain. They hold the actual DNS records (A, AAAA, MX, etc.) and give the final answer.
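To make the root -> TLD -> authoritative walk concrete, here is a minimal sketch that derives the chain of names a resolver may pass through, working only from the text of the domain name. The function name zone_chain is made up for this illustration, and it assumes every label boundary is a zone boundary, which is not always true in practice (one zone can cover several labels).

```shell
#!/bin/sh
# Print the names from the root down to the full name, one per line.
# Assumption: each label is treated as a possible zone boundary.
zone_chain() {
    name=${1%.}          # drop a trailing dot if present
    chain="."
    suffix=""
    # Walk the labels right to left: com, then example.com, then www.example.com
    while [ -n "$name" ]; do
        label=${name##*.}            # rightmost remaining label
        if [ "$label" = "$name" ]; then
            name=""                  # that was the last label
        else
            name=${name%.*}          # strip the label we just used
        fi
        suffix="$label.$suffix"
        chain="$chain $suffix"
    done
    printf '%s\n' $chain
}

zone_chain www.example.com
# .
# com.
# example.com.
# www.example.com.
```

Each line corresponds to one level of the tree above: the root, the TLD, the domain, and finally the host name itself.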

Think About It: Why is DNS hierarchical instead of having one giant database with all domain names? Consider scalability (billions of lookups per day), administration (each organization manages its own records), and resilience (no single point of failure).


DNS Record Types

DNS stores different types of records for different purposes. Here are the ones you will encounter most often:

A Record (Address)

Maps a domain name to an IPv4 address. The most fundamental record type.

dig A example.com +short
# 93.184.216.34

AAAA Record (IPv6 Address)

Maps a domain name to an IPv6 address. The name "AAAA" reflects that an IPv6 address (128 bits) is four times the length of an IPv4 address (32 bits).

dig AAAA example.com +short
# 2606:2800:220:1:248:1893:25c8:1946

CNAME Record (Canonical Name)

An alias that points one domain name to another. When you look up a name that has a CNAME, the resolver follows the alias to the target name and returns the target's records (for example, its IP address).

dig CNAME www.example.com +short
# example.com
# (www.example.com is an alias for example.com)

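Important: A CNAME cannot coexist with other record types at the same name. You cannot have both a CNAME and an MX record for example.com. Because the top of a zone always carries SOA and NS records, the bare domain itself can never be a CNAME. This is a common source of misconfigurations.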
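The coexistence rule is easy to check mechanically. The sketch below scans a simplified record list, one "name type value" entry per line, and reports any name that has a CNAME next to another record type. The input format and the function name find_cname_conflicts are assumptions made for this example; real zone files have more syntax (TTLs, classes, relative names) that this does not handle.

```shell
#!/bin/sh
# Report names that have a CNAME alongside any other record type.
# Input (stdin): "name type value" per line -- a simplified, hypothetical format.
find_cname_conflicts() {
    awk '
        { types[$1] = types[$1] " " $2 }       # remember every type seen per name
        $2 == "CNAME" { has_cname[$1] = 1 }
        $2 != "CNAME" { has_other[$1] = 1 }
        END {
            for (n in has_cname)
                if (n in has_other)
                    print n ":" types[n]
        }
    ' | sort
}

find_cname_conflicts <<'EOF'
example.com.      A      93.184.216.34
example.com.      MX     10 mail.example.com.
www.example.com.  CNAME  example.com.
shop.example.com. CNAME  example.com.
shop.example.com. TXT    "v=spf1 -all"
EOF
# shop.example.com.: CNAME TXT
```

www.example.com passes because the CNAME is its only record; shop.example.com is flagged because a TXT record sits at the same name.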

MX Record (Mail Exchanger)

Specifies which mail servers handle email for a domain. Includes a priority number (lower number = higher priority).

dig MX example.com +short
# 10 mail.example.com.
# 20 mail2.example.com.

Sending servers try the lowest number first. If the server with priority 10 is unreachable, they fall back to the server with priority 20.
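As a rough illustration of that ordering, the following sketch sorts MX lines in the "priority host" form that dig +short prints and numbers them in the order a sender would try them. The helper name mx_order is invented here, and the sample records are the illustrative ones from above.

```shell
#!/bin/sh
# Order MX records the way a sending server tries them: lowest number first.
# Input (stdin): lines of "priority hostname", as printed by dig MX ... +short.
mx_order() {
    sort -n -k1,1 | awk '{ print NR ". " $2 " (priority " $1 ")" }'
}

printf '%s\n' '20 mail2.example.com.' '10 mail.example.com.' | mx_order
# 1. mail.example.com. (priority 10)
# 2. mail2.example.com. (priority 20)
```

On a machine with network access, the same helper can be fed live data with: dig MX example.com +short | mx_order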

NS Record (Name Server)

Identifies the authoritative DNS servers for a domain.

dig NS example.com +short
# a.iana-servers.net.
# b.iana-servers.net.

TXT Record (Text)

Holds arbitrary text data. Commonly used for SPF (email sender verification), DKIM (email signing), domain verification, and ACME challenges (Let's Encrypt).

dig TXT example.com +short
# "v=spf1 -all"

PTR Record (Pointer / Reverse DNS)

Maps an IP address back to a domain name. The reverse of an A record. Used for reverse DNS lookups.

dig -x 93.184.216.34 +short
# (returns the hostname associated with that IP)
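Behind the scenes, a reverse lookup for an IPv4 address is an ordinary PTR query for a specially built name: the four octets in reverse order, followed by in-addr.arpa. The sketch below builds that name by hand so you can see what dig -x is doing for you. The function name reverse_name is an assumption for this example, and it handles IPv4 only.

```shell
#!/bin/sh
# Build the in-addr.arpa name that a reverse lookup queries (IPv4 only).
reverse_name() {
    # Split the address on dots, then emit the octets in reverse order.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    printf '%s.%s.%s.%s.in-addr.arpa.\n' "$4" "$3" "$2" "$1"
}

reverse_name 93.184.216.34
# 34.216.184.93.in-addr.arpa.
```

Querying that name directly with dig PTR 34.216.184.93.in-addr.arpa. asks the same question as dig -x 93.184.216.34.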

SOA Record (Start of Authority)

Contains administrative information about a DNS zone: the primary nameserver, the administrator's email, serial number, and timing parameters for caching and zone transfers.

dig SOA example.com +short
# sns.dns.icann.org. noc.dns.icann.org. 2022091303 7200 3600 1209600 3600

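Fields: primary NS, admin email (@ replaced by .), serial, refresh, retry, expire, minimum TTL. On modern resolvers the last field is used as the TTL for negative answers, that is, how long a "no such record" response may be cached.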
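Since the SOA answer is just seven whitespace-separated fields, a small helper can label them. This is a sketch using the sample line above; parse_soa is a made-up name, and the mailbox conversion assumes the local part of the address contains no escaped dots.

```shell
#!/bin/sh
# Label the seven fields of an SOA record as printed by dig SOA ... +short.
parse_soa() {
    # read splits the input line on whitespace into the seven fields.
    read -r mname rname serial refresh retry expire minimum
    # The first dot in the admin field stands in for the @ of the mailbox.
    # Assumption: no escaped dots in the local part.
    mailbox=$(printf '%s\n' "${rname%.}" | sed 's/\./@/')
    printf 'primary NS : %s\n' "$mname"
    printf 'admin email: %s\n' "$mailbox"
    printf 'serial     : %s\n' "$serial"
    printf 'refresh    : %ss\n' "$refresh"
    printf 'retry      : %ss\n' "$retry"
    printf 'expire     : %ss\n' "$expire"
    printf 'minimum    : %ss\n' "$minimum"
}

echo 'sns.dns.icann.org. noc.dns.icann.org. 2022091303 7200 3600 1209600 3600' | parse_soa
# primary NS : sns.dns.icann.org.
# admin email: noc@dns.icann.org
# serial     : 2022091303
# refresh    : 7200s
# retry      : 3600s
# expire     : 1209600s
# minimum    : 3600s
```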

Complete Record Type Reference

+--------+-------------------+-----------------------------------+
| Type   | Name              | Purpose                           |
+--------+-------------------+-----------------------------------+
| A      | Address           | Domain -> IPv4 address            |
| AAAA   | IPv6 Address      | Domain -> IPv6 address            |
| CNAME  | Canonical Name    | Alias -> another domain name      |
| MX     | Mail Exchanger    | Mail server for the domain        |
| NS     | Name Server       | Authoritative DNS servers         |
| TXT    | Text              | SPF, DKIM, verification, etc.     |
| PTR    | Pointer           | IP address -> domain (reverse)    |
| SOA    | Start of Authority| Zone admin info and parameters    |
| SRV    | Service           | Service discovery (host:port)     |
| CAA    | Cert Authority    | Which CAs can issue certs         |
+--------+-------------------+-----------------------------------+

The dig Command: Deep Dive

dig (Domain Information Groper) is the most important DNS troubleshooting tool. It is flexible, detailed, and reveals exactly what is happening.

Basic Usage

# Simple query
dig example.com

# Query a specific record type
dig MX example.com

# Query a specific DNS server
dig @8.8.8.8 example.com

# Short output (just the answer)
dig +short example.com

# No extra sections (cleaner output)
dig +noall +answer example.com

Reading Full dig Output

dig example.com
; <<>> DiG 9.18.1 <<>> example.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12345
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;example.com.                   IN      A

;; ANSWER SECTION:
example.com.            86400   IN      A       93.184.216.34

;; Query time: 23 msec
;; SERVER: 127.0.0.53#53(127.0.0.53) (UDP)
;; WHEN: Sat Feb 21 10:00:00 UTC 2026
;; MSG SIZE  rcvd: 56

Breaking this down:

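+------------------+------------------------------------------------------------+
| Section          | What It Tells You                                          |
+------------------+------------------------------------------------------------+
| status: NOERROR  | Query was successful. NXDOMAIN means the domain does       |
|                  | not exist. SERVFAIL means the server had an error.         |
| flags: qr rd ra  | qr = query response, rd = recursion desired,               |
|                  | ra = recursion available                                   |
| QUESTION SECTION | What was asked (A record for example.com)                  |
| ANSWER SECTION   | The answer: IP 93.184.216.34, TTL 86400 seconds (24 hours) |
| Query time       | How long the lookup took                                   |
| SERVER           | Which DNS server answered                                  |
+------------------+------------------------------------------------------------+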
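When you are checking many names, reading the header by eye gets tedious. The sketch below pulls the status out of dig's header line and maps the three codes described above to a short explanation. The function names dig_status and explain_status are invented for this example, and it runs on a saved header line rather than a live query.

```shell
#!/bin/sh
# Extract the status code from dig output read on stdin.
# The header line looks like:
#   ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12345
dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Map the status codes covered in this chapter to a short explanation.
explain_status() {
    case $1 in
        NOERROR)  echo "query succeeded" ;;
        NXDOMAIN) echo "domain does not exist" ;;
        SERVFAIL) echo "server failed to answer" ;;
        *)        echo "other status: $1" ;;
    esac
}

status=$(printf '%s\n' ';; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12345' | dig_status)
echo "$status: $(explain_status "$status")"
# NOERROR: query succeeded
```

With network access, the same pipeline works on real output: dig example.com | dig_status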

Advanced dig Usage

# Trace the full resolution path (shows each step)
dig +trace example.com

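# Ask for all record types (many servers now refuse ANY or return a
# minimal answer, per RFC 8482, so query specific types when it matters)
dig ANY example.com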

# Reverse DNS lookup
dig -x 93.184.216.34

# Check a specific nameserver's records
dig @ns1.example.com example.com

# Check if a response is authoritative: query the zone's nameserver directly
# and look for the "aa" flag in the header's flags line
dig +norecurse @ns1.example.com example.com

# Check the TTL (Time To Live) -- how long the record is cached
dig +noall +answer example.com
# The number after the domain name is the TTL in seconds

# Batch queries from a file
printf '%s\n' example.com google.com github.com > domains.txt
dig -f domains.txt +short

Hands-On: Tracing DNS Resolution

# Watch the entire resolution chain
dig +trace www.google.com

This shows you exactly which servers were queried at each level:

.                       518400  IN  NS  a.root-servers.net.
.                       518400  IN  NS  b.root-servers.net.
(... root servers ...)
;; Received 525 bytes from 127.0.0.53#53

com.                    172800  IN  NS  a.gtld-servers.net.
com.                    172800  IN  NS  b.gtld-servers.net.
(... .com TLD servers ...)
;; Received 734 bytes from 198.41.0.4#53 (a.root-servers.net)

google.com.             172800  IN  NS  ns1.google.com.
google.com.             172800  IN  NS  ns2.google.com.
(... Google's nameservers ...)
;; Received 836 bytes from 192.5.6.30#53 (a.gtld-servers.net)

www.google.com.         300     IN  A   142.250.80.100
;; Received 48 bytes from 216.239.32.10#53 (ns1.google.com)

You can see the query going from root -> .com TLD -> google.com authoritative -> answer.
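If you save a trace to a file, the "Received ... from" lines are enough to reconstruct which server answered at each step. Here is a small sketch that extracts them from the sample output above. The function name trace_hops is an assumption, and the pattern relies on the address being followed by #53 as in that sample.

```shell
#!/bin/sh
# List the server that answered at each step of a dig +trace run (stdin).
trace_hops() {
    sed -n 's/^;; Received [0-9]* bytes from \([^#]*\)#.*/\1/p' |
        awk '{ print "step " NR ": answered by " $0 }'
}

trace_hops <<'EOF'
.                       518400  IN  NS  a.root-servers.net.
;; Received 525 bytes from 127.0.0.53#53
com.                    172800  IN  NS  a.gtld-servers.net.
;; Received 734 bytes from 198.41.0.4#53 (a.root-servers.net)
google.com.             172800  IN  NS  ns1.google.com.
;; Received 836 bytes from 192.5.6.30#53 (a.gtld-servers.net)
www.google.com.         300     IN  A   142.250.80.100
;; Received 48 bytes from 216.239.32.10#53 (ns1.google.com)
EOF
# step 1: answered by 127.0.0.53
# step 2: answered by 198.41.0.4
# step 3: answered by 192.5.6.30
# step 4: answered by 216.239.32.10
```

Step 1 is the local resolver handing over the list of root servers; steps 2 through 4 are the root, TLD, and authoritative servers.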


nslookup: The Quick Alternative

nslookup is simpler than dig but less detailed. It is available on almost every system, including Windows.

# Basic lookup
nslookup example.com

# Specify a DNS server
nslookup example.com 8.8.8.8

# Lookup a specific record type
nslookup -type=mx example.com
nslookup -type=ns example.com
nslookup -type=txt example.com

# Reverse lookup
nslookup 93.184.216.34

DNS Configuration on Linux

/etc/resolv.conf

This file tells your system which DNS servers to use:

cat /etc/resolv.conf
# Generated by NetworkManager
nameserver 8.8.8.8
nameserver 8.8.4.4
search example.com internal.example.com
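
+------------+----------------------------------------------+
| Directive  | Meaning                                      |
+------------+----------------------------------------------+
| nameserver | IP address of a DNS resolver (up to 3)       |
| search     | Domains to append when searching short names |
| domain     | Default domain (older alternative to search) |
+------------+----------------------------------------------+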

The search directive means that if you query webserver, the resolver will try webserver.example.com and then webserver.internal.example.com before giving up.
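The expansion is mechanical, so it can be sketched in a few lines. The helper below reads a resolv.conf-style file and prints the fully qualified names that would be tried for a short name, in order. The function name search_candidates is made up, the sample file is a temporary copy of the listing above, and the sketch ignores the ndots option that also influences when the search list is applied.

```shell
#!/bin/sh
# Print the names tried for a short host name, based on the search line
# of a resolv.conf-style file. If several search lines exist, the last wins.
search_candidates() {
    short=$1
    conf=$2
    domains=$(awk '$1 == "search" { $1 = ""; d = $0 } END { print d }' "$conf")
    for d in $domains; do
        printf '%s.%s\n' "$short" "$d"
    done
}

# Work on a temporary sample file rather than the real /etc/resolv.conf.
conf=$(mktemp)
cat > "$conf" <<'EOF'
nameserver 8.8.8.8
nameserver 8.8.4.4
search example.com internal.example.com
EOF

search_candidates webserver "$conf"
# webserver.example.com
# webserver.internal.example.com
rm -f "$conf"
```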

WARNING: On many modern systems, /etc/resolv.conf is managed automatically by systemd-resolved, NetworkManager, or dhclient. Editing it directly may work temporarily but get overwritten. Check if the file is a symlink: ls -la /etc/resolv.conf

/etc/hosts

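The hosts file provides static name-to-IP mappings that bypass DNS entirely. With the default /etc/nsswitch.conf ordering it is checked BEFORE DNS servers. Tools that speak DNS directly, such as dig and nslookup, ignore it; use getent hosts to see what applications will get.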

cat /etc/hosts
127.0.0.1       localhost
127.0.1.1       myhostname
::1             localhost ip6-localhost

# Custom entries
192.168.1.10    dbserver.internal  dbserver
192.168.1.20    webserver.internal webserver

Use cases for /etc/hosts:

  • Testing a website before DNS is configured
  • Blocking domains (point them to 127.0.0.1)
  • Local development (mapping myapp.local to 127.0.0.1)
  • Servers that need to resolve each other without DNS
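To show how a hosts-file match works, here is a minimal sketch of the lookup: strip comments, then return the address from the first line whose names include the one you asked for. The function name hosts_lookup and the temporary sample file are assumptions for the example; the real lookup is done by the system's NSS "files" module, which this only approximates.

```shell
#!/bin/sh
# Look a name up in a hosts-format file: first matching line wins.
hosts_lookup() {
    name=$1
    file=$2
    awk -v name="$name" '
        { sub(/#.*/, "") }                      # strip comments
        NF >= 2 {
            for (i = 2; i <= NF; i++)           # field 1 is the address
                if (tolower($i) == tolower(name)) { print $1; exit }
        }
    ' "$file"
}

# Work on a temporary sample file rather than the real /etc/hosts.
hosts=$(mktemp)
cat > "$hosts" <<'EOF'
127.0.0.1       localhost
# Custom entries
192.168.1.10    dbserver.internal  dbserver
192.168.1.20    webserver.internal webserver
EOF

hosts_lookup dbserver "$hosts"
# 192.168.1.10
rm -f "$hosts"
```

On a live system, getent hosts dbserver performs the real lookup, including any sources listed after the hosts file.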

/etc/nsswitch.conf

This file controls the order in which name resolution sources are consulted:

grep hosts /etc/nsswitch.conf
hosts:          files dns mymachines

This means: check /etc/hosts first (files), then DNS (dns), then the names of local containers registered with systemd-machined (mymachines). Changing the order changes the behavior.
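The hosts line can be turned into a numbered list of sources, which is handy when comparing machines. This is a sketch only: lookup_order is an invented name, it reads nsswitch.conf-style text from stdin, and it skips bracketed action items such as [NOTFOUND=return] rather than interpreting them.

```shell
#!/bin/sh
# Print the name-resolution sources from a "hosts:" line in lookup order.
lookup_order() {
    awk '$1 == "hosts:" {
        n = 0
        for (i = 2; i <= NF; i++)
            if ($i !~ /^\[/) { n++; print n ". " $i }   # skip [action] items
    }'
}

printf '%s\n' 'hosts:          files dns mymachines' | lookup_order
# 1. files
# 2. dns
# 3. mymachines
```

On a real system the input would come from the file itself: lookup_order < /etc/nsswitch.conf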

Think About It: A developer adds 192.168.1.99 api.production.com to their /etc/hosts for testing. They forget about it. Months later, the production API server changes its IP. Why does their application break while everyone else's works fine?


systemd-resolved: Modern DNS Management

Many modern Linux distributions use systemd-resolved as the local DNS resolver. It provides caching, DNSSEC validation, and per-interface DNS configuration.

# Check if systemd-resolved is running
systemctl status systemd-resolved

# See current DNS configuration
resolvectl status

# Query using systemd-resolved directly
resolvectl query example.com

# See DNS statistics (cache hits, misses)
resolvectl statistics

# Flush the DNS cache
resolvectl flush-caches
# Or, on older systemd releases that predate resolvectl:
sudo systemd-resolve --flush-caches

How systemd-resolved Works

+----------------------------------------------------------+
|                                                          |
|  Application                                             |
|      |                                                   |
|      v                                                   |
|  NSS (checks /etc/nsswitch.conf)                        |
|      |                                                   |
|      v                                                   |
|  systemd-resolved  (127.0.0.53:53)                      |
|  +----------------------------------------------------+ |
|  | Local cache                                         | |
|  | DNSSEC validation                                   | |
|  | Per-link DNS configuration                          | |
|  +----------------------------------------------------+ |
|      |                                                   |
|      v                                                   |
|  Upstream DNS servers                                    |
|  (configured per-interface via DHCP or manually)        |
|                                                          |
+----------------------------------------------------------+

When systemd-resolved is active, /etc/resolv.conf typically contains:

nameserver 127.0.0.53

This points to the local stub resolver. The actual upstream DNS servers are managed by systemd-resolved and visible via resolvectl status.
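A quick way to tell which situation a machine is in is to look at the first nameserver line. The sketch below classifies a resolv.conf-style file on that basis. The function name resolver_kind and its output labels are assumptions for this example, and treating any other loopback address as "other local resolver" is a simplification.

```shell
#!/bin/sh
# Classify a resolv.conf-style file by its first nameserver entry.
resolver_kind() {
    first=$(awk '$1 == "nameserver" { print $2; exit }' "$1")
    case $first in
        127.0.0.53) echo "systemd-resolved stub" ;;
        127.*|::1)  echo "other local resolver" ;;
        "")         echo "no nameserver configured" ;;
        *)          echo "direct upstream: $first" ;;
    esac
}

# Work on a temporary sample file rather than the real /etc/resolv.conf.
conf=$(mktemp)
printf '%s\n' 'nameserver 127.0.0.53' > "$conf"
resolver_kind "$conf"
# systemd-resolved stub
rm -f "$conf"
```

When the result is the stub, resolvectl status is the place to look for the real upstream servers.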

Distro Note: systemd-resolved is the default on Ubuntu 18.04+ and Fedora 33+. It is optional on Arch and not typically used on Debian (though available). On systems without systemd-resolved, DNS is configured directly in /etc/resolv.conf, often managed by NetworkManager or dhclient.


DNS Caching and TTL

Every DNS record has a TTL (Time To Live) value, specified in seconds. This tells resolvers how long they can cache the record before they must query again.

# See the TTL in dig output
dig +noall +answer example.com
# example.com.    86400   IN  A  93.184.216.34
#                 ^^^^^
#                 TTL: 86400 seconds = 24 hours

# Query again through the same caching resolver -- the TTL will count down
dig +noall +answer example.com
# example.com.    85200   IN  A  93.184.216.34
#                 ^^^^^
#                 TTL has decreased by 1200 seconds (20 minutes since it was cached)

Why TTL Matters

  • High TTL (e.g., 86400 = 24h): Less DNS traffic, faster resolution from cache. But changes take up to 24 hours to propagate.
  • Low TTL (e.g., 60 = 1 min): Changes propagate quickly. But more DNS queries, slightly slower resolution.

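When you are planning a DNS change (like migrating to a new server), lower the TTL in advance, at least one full period of the old TTL before the change. Resolvers that cached the record under the old TTL will then have picked up the short one, so when you make the change, the old cached records expire quickly.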
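The arithmetic behind that advice is simple enough to sketch. The helper below takes the current TTL and the temporary low TTL and prints a rough timeline. The function names ttl_human and plan_ttl_change are invented for this example, and the third step (raising the TTL again afterwards) is common practice rather than something the text above prescribes.

```shell
#!/bin/sh
# Format a number of seconds as hours, minutes, and seconds.
ttl_human() {
    printf '%dh %dm %ds' $(( $1 / 3600 )) $(( $1 % 3600 / 60 )) $(( $1 % 60 ))
}

# Print a rough timeline for a planned DNS change.
# $1 = current (old) TTL in seconds, $2 = temporary low TTL in seconds
plan_ttl_change() {
    old_ttl=$1
    new_ttl=$2
    echo "1. Lower the TTL to ${new_ttl}s at least $(ttl_human "$old_ttl") before the change."
    echo "2. Make the change; caches should pick it up within $(ttl_human "$new_ttl")."
    echo "3. Raise the TTL back to ${old_ttl}s once the new address is confirmed."
}

plan_ttl_change 86400 300
# 1. Lower the TTL to 300s at least 24h 0m 0s before the change.
# 2. Make the change; caches should pick it up within 0h 5m 0s.
# 3. Raise the TTL back to 86400s once the new address is confirmed.
```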

Flushing DNS Caches

When you make a DNS change and need it to take effect immediately on your machine:

# systemd-resolved
resolvectl flush-caches

# nscd (Name Service Cache Daemon)
sudo systemctl restart nscd

# dnsmasq (if used as local resolver)
sudo systemctl restart dnsmasq

# There is no universal "flush DNS" on Linux -- it depends on
# what resolver you are running
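Because the right command depends on which caching service is running, one approach is to check for each service and print the matching command from the list above. The sketch below does that without running anything as root. The function names flush_command and suggest_flush are made up, and it assumes the services are managed by systemd under exactly these unit names.

```shell
#!/bin/sh
# Map a caching service to the flush command listed in this chapter.
flush_command() {
    case $1 in
        systemd-resolved) echo "resolvectl flush-caches" ;;
        nscd)             echo "sudo systemctl restart nscd" ;;
        dnsmasq)          echo "sudo systemctl restart dnsmasq" ;;
        *)                return 1 ;;
    esac
}

# Print (not run) the flush command for each caching service that is active.
# Prints nothing on systems without systemctl.
suggest_flush() {
    command -v systemctl >/dev/null 2>&1 || return 0
    for svc in systemd-resolved nscd dnsmasq; do
        if systemctl is-active --quiet "$svc" 2>/dev/null; then
            flush_command "$svc"
        fi
    done
}

suggest_flush
```

Printing the commands instead of executing them keeps the helper safe to run anywhere; you decide which cache to flush.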

Debug This

Users report that app.internal.example.com resolves to the wrong IP address on some servers but works correctly on others.

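Step 1: Check what each server resolves. Use getent hosts, which follows the same lookup path as applications (/etc/nsswitch.conf, so /etc/hosts and then DNS). dig talks to DNS servers directly and never reads /etc/hosts, so it can look healthy while applications still get the wrong address.

# On the broken server
getent hosts app.internal.example.com
# 10.0.0.50       app.internal.example.com  (wrong -- old IP)

# On a working server
getent hosts app.internal.example.com
# 10.0.1.100      app.internal.example.com  (correct -- new IP)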

Step 2: Check what the authoritative server says, bypassing every cache:

# Query the authoritative server directly
dig +short @ns1.example.com app.internal.example.com
# 10.0.1.100  (authoritative answer is correct)

Step 3: The authoritative answer is correct, so the broken server may have a stale cached answer. Check where it gets DNS:

resolvectl status
# Or:
cat /etc/resolv.conf

Step 4: Flush the cache:

resolvectl flush-caches

Step 5: If the wrong answer persists after the flush, check /etc/hosts:

grep app.internal /etc/hosts
# 10.0.0.50  app.internal.example.com    <-- Found it!

Someone added a static entry in /etc/hosts months ago and forgot about it. Since files comes before dns in /etc/nsswitch.conf, the hosts file entry takes precedence over DNS.

Fix: Remove the stale entry from /etc/hosts.


What Just Happened?

+------------------------------------------------------------------+
|                        CHAPTER RECAP                              |
+------------------------------------------------------------------+
|                                                                   |
|  DNS translates domain names to IP addresses.                    |
|                                                                   |
|  Resolution path: Local cache -> Recursive resolver ->           |
|    Root server -> TLD server -> Authoritative server             |
|                                                                   |
|  Key record types:                                               |
|    A (IPv4), AAAA (IPv6), CNAME (alias), MX (mail),             |
|    NS (nameserver), TXT (text), PTR (reverse), SOA (admin)      |
|                                                                   |
|  Essential commands:                                              |
|    dig domain          Full DNS query                            |
|    dig +short domain   Just the answer                           |
|    dig +trace domain   Full resolution chain                     |
|    dig @server domain  Query specific server                     |
|                                                                   |
|  Configuration:                                                   |
|    /etc/resolv.conf    DNS server settings                       |
|    /etc/hosts          Static name mappings (checked first)      |
|    /etc/nsswitch.conf  Resolution order                          |
|                                                                   |
|  TTL controls caching. Lower TTL before DNS changes.             |
|                                                                   |
+------------------------------------------------------------------+

Try This

  1. Record Type Exploration: Use dig to query every record type (A, AAAA, MX, NS, TXT, SOA) for google.com. What do you learn about Google's infrastructure?

  2. Trace Resolution: Run dig +trace example.com and dig +trace github.com. Compare the resolution paths. How many levels of nameservers does each pass through?

  3. Local Overrides: Add a temporary entry in /etc/hosts that maps testsite.local to 127.0.0.1. Verify it works with ping testsite.local. Then remove it.

  4. DNS Server Comparison: Query the same domain using different DNS servers and compare the results and response times:

    dig @8.8.8.8 example.com       # Google
    dig @1.1.1.1 example.com       # Cloudflare
    dig @9.9.9.9 example.com       # Quad9
    
  5. Reverse DNS: Find the PTR record for several well-known IP addresses:

    dig -x 8.8.8.8
    dig -x 1.1.1.1
    
  6. Bonus Challenge: Set up a local DNS cache using dnsmasq or unbound. Configure your system to use it as the primary resolver. Measure the performance improvement by timing repeated queries with dig before and after caching.