Domain Name System

Everything you do online begins with a question you never see.

Before the browser downloads a single image, before the mail leaves, before the app syncs anything at all, someone has to answer a silent request: "this name, where does it live?". You type centamori.com; the machine needs 172.66.147.243. In between sits a system that answers that question billions of times a second, in a handful of milliseconds, with a reliability we take for granted right up until the day it breaks — and when it breaks, half the Internet seems to vanish.

That system is DNS (Domain Name System). It is the closest thing we have to a telephone directory for the planet, except there is no directory, no central office, nobody who owns it. It is a database distributed across millions of machines, designed in 1983, that no one controls in full and everyone trusts.

In this article we take it apart. We will see where it comes from, how the tree of names is organised, who the actors are that pass the question along, and then we will descend to the bytes: we will build a DNS query from scratch, send it onto the wire, and read the answer one byte at a time.


From a single text file

In the beginning there was no DNS. There was a file.

It was called HOSTS.TXT, and it was exactly what it sounds like: a text list mapping every host name to its numeric address. A single entity maintained it by hand — the Stanford Research Institute, SRI-NIC — and anyone, across the whole ARPANET, periodically downloaded that file and kept it locally. You wanted to reach a machine by name? Your computer looked up the matching line in HOSTS.TXT.

For a few hundred hosts, it worked. Then the network began to grow, and the model crumbled under its own weight. A single file updated by hand by a single team; every new machine that had to be reported to the centre; every host in the world re-downloading the entire list to keep up. It was not a software problem, it was a problem of shape: a centralised, flat structure does not scale. Ever.

Jon Postel, one of the Internet's founding figures, asked a researcher at ISI to imagine something different. That researcher was Paul Mockapetris, and in 1983 he published two documents, RFC 882 and RFC 883, describing a distributed, hierarchical name system built on two ideas that still hold today: delegation and authority. He also wrote the first implementation, a server called Jeeves. On 23 June 1983, together with Postel, he ran the first test of DNS.

In 1984 some students at Berkeley wrote BIND, the implementation that would dominate the decades to follow. And in November 1987, after four years of experimentation on a real network, Mockapetris rewrote everything in RFC 1034 ("Concepts and Facilities") and RFC 1035 ("Implementation and Specification"). Those two documents remain the foundation on which every DNS implementation is built.

The insight that changed everything is this: nobody needs to know everything. It is enough that everyone knows who to ask.


The upside-down tree

DNS organises names as an inverted tree, with the root at the top.

                    . (root)
          ┌─────────┼─────────┐
         com       org        it        ← Top-Level Domains
          │                    │
      centamori              comune      ← domains
          │                    │
        www                  perugia     ← subdomains

When you write www.centamori.com you read it left to right, but DNS reads it the other way around: from the root (the rightmost part, implied), down to com, then to centamori, then to www. In fact the full, correct name would be www.centamori.com. — with a trailing dot representing the root. That dot is always there, even when you do not write it.
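
To make the reading order concrete, here is a minimal sketch that lists the labels of a name the way DNS walks them, root first. The function name labels_from_root is invented for this example, and the root is represented by an empty string.

```php
<?php

// Hypothetical helper: list a name's labels in the order DNS walks them,
// starting from the root (represented here by an empty string).
function labels_from_root(string $name): array {
    $trimmed = rtrim($name, '.');          // the trailing dot is optional
    if ($trimmed === '') {
        return [''];                       // the name was just the root
    }
    // Reverse so the TLD comes first, then put the root in front.
    return array_merge([''], array_reverse(explode('.', $trimmed)));
}

$path = array_map(
    fn (string $label): string => $label === '' ? '(root)' : $label,
    labels_from_root('www.centamori.com')
);
echo implode(' -> ', $path), "\n"; // prints: (root) -> com -> centamori -> www
```

With or without the trailing dot, the result is the same: the dot only makes explicit a root that was already there.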

Each level of the tree is governed by someone different, and this is where delegation becomes concrete:

  • The root is run by a handful of organisations under the coordination of IANA/ICANN. It knows one thing only: who is responsible for each TLD.
  • The TLDs (com, org, it, ...) are run by registries. The .com registry knows nothing about the contents of centamori.com; it only knows which servers to ask.
  • The domain centamori.com is managed by you — or by whoever hosts your DNS zone. This is where the real records live.

The portion of the tree entrusted to a single authority is called a zone. The boundary between one zone and the one below it is a delegation: the parent does not contain the child's data, only a pointer that says "for this branch, ask those servers". No node knows the entire tree. Each one knows only its own piece and the pointers down to the level below.


The actors

As with every protocol, before looking at the bytes it helps to know who is on stage. A DNS resolution involves four kinds of actor.

1) The stub resolver

It is the piece of software inside your operating system. It cannot resolve anything on its own: its only skill is to pass the question to someone else and wait for the answer. When the browser needs to reach a site, it is the stub resolver that sets off.

2) The recursive resolver

It is the real workhorse. It is the server the stub turns to — your provider's, or a public resolver such as Google's 8.8.8.8 or Cloudflare's 1.1.1.1. The recursive resolver accepts the question and takes on the entire burden of finding the answer, walking the tree on your behalf. Above all, it caches what it finds: this is the reason DNS does not implode under its own traffic.

3) The root servers

They are thirteen named servers, a.root-servers.net through m.root-servers.net (in reality thousands of physical machines, distributed via anycast across the planet), sitting at the top of the tree. They do not know your domain. They only know how to point to the TLD servers.

4) The authoritative servers

They are the ones that hold the real data of a zone. The authoritative server for centamori.com is the only one that can say with certainty what that domain's address is. Everyone else, at best, holds a copy in cache with an expiry date.


Resolution: a treasure hunt

Suppose the recursive resolver has nothing in cache and has to resolve www.centamori.com from scratch. The path is a descent down the tree, one question per level:

  1. It asks a root server: "who runs com?". The root does not know centamori, but it answers with a referral: "go ask the com servers".
  2. It asks a .com server: "who runs centamori.com?". This one does not know the final address either, but it knows the delegation: "here are the authoritative servers for centamori.com".
  3. It asks the authoritative server: "what is the A address of www.centamori.com?". This one knows, and answers with the address.

Three questions, three steps down the tree. Note the difference between the two words that are forever confused: recursion is what the resolver does for you — "resolve all of this for me and come back with the final answer"; iteration is what the resolver does towards the servers of the hierarchy, which do not resolve on its behalf but only answer with a referral to the next level.
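
The descent can be sketched without touching the network. What follows is a toy model, not a real resolver: the server names (root, com-tld, centamori-ns), the table of what each one knows, and the address 192.0.2.10 (taken from a range reserved for documentation) are all invented for illustration. The loop is the iteration described above: ask, receive a referral, ask the next server.

```php
<?php

// Toy data, invented for this sketch: each "server" knows a few name suffixes
// and either refers you onward or gives a final answer.
function toy_servers(): array {
    return [
        'root'         => ['com' => ['referral' => 'com-tld']],
        'com-tld'      => ['centamori.com' => ['referral' => 'centamori-ns']],
        'centamori-ns' => ['www.centamori.com' => ['answer' => '192.0.2.10']],
    ];
}

// Ask one toy server: it replies with the entry for the longest suffix it knows.
function ask(string $server, string $name): ?array {
    $known  = toy_servers()[$server] ?? [];
    $labels = explode('.', $name);
    while ($labels !== []) {
        $suffix = implode('.', $labels);
        if (isset($known[$suffix])) {
            return $known[$suffix];
        }
        array_shift($labels);            // drop the leftmost label and retry
    }
    return null;
}

// Iteration: follow referrals from the root until some server gives an answer.
function resolve_iteratively(string $name): array {
    $server = 'root';
    $path   = [];
    for ($hop = 0; $hop < 10; $hop++) {  // arbitrary guard against referral loops
        $path[] = $server;
        $reply  = ask($server, $name);
        if ($reply === null) {
            throw new RuntimeException("$server knows nothing about $name");
        }
        if (isset($reply['answer'])) {
            return ['address' => $reply['answer'], 'path' => $path];
        }
        $server = $reply['referral'];
    }
    throw new RuntimeException('too many referrals');
}

$result = resolve_iteratively('www.centamori.com');
echo implode(' -> ', $result['path']), ' : ', $result['address'], "\n";
```

From the caller's point of view, resolve_iteratively plays the role of recursion: one call in, one final answer out, with all the intermediate questions hidden inside.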

The second rung of this ladder we can see directly. Let us ask a resolver which servers are authoritative for the com TLD:

com    NS    a.gtld-servers.net    TTL 21600
com    NS    b.gtld-servers.net    TTL 21600
com    NS    c.gtld-servers.net    TTL 21600
...
com    NS    m.gtld-servers.net    TTL 21600

Thirteen servers, from a to m, holding up the entire .com name space. Every resolution of a .com domain must pass through one of them, unless (and here is the trick) the answer is already in cache somewhere along the chain.

TTL, or why all this works

If every visit to every site really made the full round trip up to the root servers, DNS would have collapsed decades ago. It does not, thanks to a number attached to every answer: the TTL (Time To Live), the seconds for which that answer may be kept in cache before being considered stale.

That TTL 21600 means the .com servers can stay cached for six hours. A large site's address might have a TTL of a few minutes — so it can switch servers quickly — while rarely touched records can last a day. The TTL is DNS's constant trade-off between freshness and load: the higher it is, the less traffic, but the slower changes propagate. This is why, when you move a site, they tell you to "wait for DNS propagation": you are simply waiting for the caches of half the world to let the old value expire.
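
The mechanism is small enough to sketch in a few lines. This is only an illustration of the idea, assuming a cache keyed by name where the caller passes the current time explicitly; the class TtlCache and its methods are invented here and do not come from any real resolver.

```php
<?php

// A minimal sketch of a TTL cache; the class and method names are invented.
final class TtlCache {
    /** @var array<string, array{value: string, expires: int}> */
    private array $entries = [];

    // Store a value together with the moment it stops being valid.
    public function put(string $key, string $value, int $ttl, int $now): void {
        $this->entries[$key] = ['value' => $value, 'expires' => $now + $ttl];
    }

    // Return the cached value, or null if it is missing or has expired.
    public function get(string $key, int $now): ?string {
        $entry = $this->entries[$key] ?? null;
        if ($entry === null || $now >= $entry['expires']) {
            unset($this->entries[$key]); // stale or absent: forget it
            return null;
        }
        return $entry['value'];
    }
}

$cache = new TtlCache();
$cache->put('example.com', '192.0.2.1', 300, time());
var_dump($cache->get('example.com', time()));        // still fresh
var_dump($cache->get('example.com', time() + 301));  // expired: null
```

Passing the clock in as a parameter is a choice made for the sketch: it makes the expiry logic easy to check without waiting for real seconds to pass.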


The protocol on the wire

Here DNS parts ways with a protocol like SMTP. SMTP is text: you can read it with your eyes. DNS is binary, compact, designed to fit in a single UDP packet and come back in the fewest possible bytes. You do not read it: you decode it.

A DNS message has a fixed structure, identical for questions and answers, divided into five parts:

Section      Contents
Header       12 bytes: ID, flags, and four counters
Question     The question: name, record type, class
Answer       The answer records
Authority    The authoritative servers (used in referrals)
Additional   Extra data, such as the addresses of the servers cited above

The header is twelve bytes and not one more. The first two are a random identifier that ties the answer to the question. The next two are the flags — one bit says whether it is a question or an answer, another asks for recursion, others carry the error code. The last eight bytes are four 16-bit counters: how many questions, how many answers, how many authority records, how many additional.
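
The two flag bytes are the densest part of the header, so it helps to see them unpacked. The sketch below follows the bit layout of RFC 1035 (QR, opcode, AA, TC, RD, RA, then the response code in the low four bits); the function name decode_flags and the array keys are choices made for this example.

```php
<?php

// Split the 16-bit flags field of a DNS header into its parts.
// Bit positions follow RFC 1035; the three reserved bits are ignored.
function decode_flags(int $flags): array {
    return [
        'qr'     => ($flags >> 15) & 0x1, // 0 = question, 1 = answer
        'opcode' => ($flags >> 11) & 0xF, // 0 = standard query
        'aa'     => ($flags >> 10) & 0x1, // answer comes from an authoritative server
        'tc'     => ($flags >> 9)  & 0x1, // message was truncated
        'rd'     => ($flags >> 8)  & 0x1, // recursion desired
        'ra'     => ($flags >> 7)  & 0x1, // recursion available
        'rcode'  => $flags & 0xF,         // 0 = no error, 3 = name does not exist
    ];
}

print_r(decode_flags(0x8180)); // a typical successful answer from a recursive resolver
```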

Building a query from scratch

The best way to understand a binary format is to generate it by hand. Let us build, in PHP, a query for the address of example.com, with no libraries, writing the bytes one by one.

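<?php

function build_query(string $domain, int $id): string {
    // Header: ID, flags, then one question and three zero counters (12 bytes).
    $flags  = 0x0100; // only the "recursion desired" bit is set
    $header = pack('nnnnnn', $id, $flags, 1, 0, 0, 0);

    // Question name: each label is preceded by its length in one byte.
    // A trailing dot, if present, is dropped so it does not add an empty label.
    $question = '';
    foreach (explode('.', rtrim($domain, '.')) as $label) {
        $question .= chr(strlen($label)) . $label;
    }
    $question .= "\x00";           // zero-length label: the root
    $question .= pack('nn', 1, 1); // type A (1), class IN (1)

    return $header . $question;
}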

Two details deserve attention. The flag 0x0100 has a single bit set, the recursion desired one: we are asking a recursive resolver to do the work for us. And the name is not written as example.com: each label is preceded by its length. example becomes the byte 7 followed by the seven characters, com becomes 3 followed by three characters, and a final zero marks the root. The closing pack('nn', 1, 1) appends two 16-bit values: the record type we want, A (value 1), and the class, IN for Internet (value 1).

Running it and printing the packet in hexadecimal gives exactly 29 bytes:

fc12 0100 0001 0000 0000 0000
07 6578616d706c65   03 636f6d   00
0001 0001

And every byte is readable, if you know what to look for:

  • fc12 — the request's random ID.
  • 0100 — the flags: recursion desired.
  • 0001 — one question. The three following counters are zero: no answers, it is a query.
  • 07 example 03 com 00 — the name, label by label, closed by the root.
  • 0001 0001 — type A, class IN.
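
For anyone who wants to reproduce the dump, here is one possible way to print it. build_query is repeated from the article so that the snippet runs on its own; dump_query is a helper invented for this example, and it assumes a packet with exactly one question and nothing after it.

```php
<?php

// Same function as in the article, repeated so this snippet runs on its own.
function build_query(string $domain, int $id): string {
    $header   = pack('nnnnnn', $id, 0x0100, 1, 0, 0, 0);
    $question = '';
    foreach (explode('.', $domain) as $label) {
        $question .= chr(strlen($label)) . $label;
    }
    return $header . $question . "\x00" . pack('nn', 1, 1);
}

// Hypothetical helper: cut a single-question query into its three parts, in hex.
function dump_query(string $packet): array {
    return [
        'header'     => bin2hex(substr($packet, 0, 12)),  // fixed 12 bytes
        'name'       => bin2hex(substr($packet, 12, -4)), // everything in between
        'type/class' => bin2hex(substr($packet, -4)),     // last 4 bytes
    ];
}

$packet = build_query('example.com', 0xfc12); // fixed ID, to match the dump above
echo strlen($packet), " bytes\n";
foreach (dump_query($packet) as $part => $hex) {
    echo str_pad($part, 12), $hex, "\n";
}
```

Fixing the ID to 0xfc12 is only there to make the bytes comparable with the dump; a real query should keep using a random one.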

Sending it and reading the answer

We open a UDP socket towards a public resolver on port 53, write the packet, read the response. Then the interesting part: interpreting the bytes that come back.

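// Open a UDP socket to a public resolver; fsockopen() returns false on failure.
$sock = fsockopen('udp://8.8.8.8', 53, $errno, $errstr, 3);
if ($sock === false) {
    exit("Socket error: $errstr ($errno)\n");
}
stream_set_timeout($sock, 3); // give up if no reply arrives within 3 seconds
fwrite($sock, build_query('example.com', random_int(0, 0xFFFF)));
$response = fread($sock, 512); // 512 bytes: the classic UDP response limit
fclose($sock);

// The first 12 bytes are the header: ID, flags and the four counters.
$header = unpack('nid/nflags/nqd/nan/nns/nar', substr($response, 0, 12));
echo "Answers: {$header['an']}\n";

Once the records in the answer section are decoded too, the output looks like this:

Answers: 2
A    172.66.147.243    TTL 300
A    104.20.23.154     TTL 300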

Two addresses for example.com, each valid for 300 seconds. The resolver walked the tree for us — root, .com, authoritative — and handed back the result, most likely already waiting in cache.
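
Getting from the raw bytes to those A lines takes a little more decoding than the header alone. The following is a sketch of one way to do it, not a complete parser: it assumes a well-formed response, looks only at A records, and skips names without expanding them. The helpers skip_name and parse_a_records are invented for this example.

```php
<?php

// Skip over an encoded name and return the offset of the byte after it.
function skip_name(string $msg, int $offset): int {
    while (true) {
        $len = ord($msg[$offset]);
        if ($len === 0) {
            return $offset + 1;          // root label: the name ends here
        }
        if (($len & 0xC0) === 0xC0) {
            return $offset + 2;          // compression pointer: always 2 bytes
        }
        $offset += 1 + $len;             // ordinary label: length byte + text
    }
}

// Return the A records of a response as [['ip' => ..., 'ttl' => ...], ...].
function parse_a_records(string $msg): array {
    $header = unpack('nid/nflags/nqd/nan/nns/nar', substr($msg, 0, 12));
    $offset = 12;
    for ($i = 0; $i < $header['qd']; $i++) {
        $offset = skip_name($msg, $offset) + 4;  // name, then type and class
    }
    $records = [];
    for ($i = 0; $i < $header['an']; $i++) {
        $offset = skip_name($msg, $offset);
        // Fixed part of every record: type, class, 32-bit TTL, data length.
        $rr = unpack('ntype/nclass/Nttl/nlen', substr($msg, $offset, 10));
        $offset += 10;
        if ($rr['type'] === 1 && $rr['len'] === 4) {
            $records[] = [
                'ip'  => inet_ntop(substr($msg, $offset, 4)),
                'ttl' => $rr['ttl'],
            ];
        }
        $offset += $rr['len'];           // move past the record data
    }
    return $records;
}
```

Given the $response read from the socket, a loop over parse_a_records($response) that prints each ip and ttl would produce lines in the shape shown above. Records of other types (a CNAME, for instance) are stepped over thanks to the length field.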

The hidden pointer

There is an elegance to the format worth revealing. In the response, the domain name would appear over and over: once in the question, then again in every record. Repeating it would be a waste of precious bytes in a packet that wants to stay under 512. So DNS uses name compression: instead of rewriting example.com, a record can say "the name is over there, at offset X of the packet", with a two-byte pointer recognisable because its first two bits are set to one.

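// Inside read_name(): $len is the length byte just read at $offset.
if (($len & 0xC0) === 0xC0) {
    // Top two bits set: the remaining 14 bits are an offset into the message.
    $pointer = (($len & 0x3F) << 8) | ord($msg[$offset + 1]);
    return read_name($msg, $pointer);
}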

It is a tiny detail, but it is the difference between a protocol that was designed and one that was improvised. Every byte counts, and the format knows it.
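
For completeness, here is what a whole read_name might look like around that fragment. It is a sketch under a few assumptions: the message is well formed, names are returned without the trailing dot, and the limit of 16 nested pointers is an arbitrary value chosen here to stop a hostile packet from sending the function round in circles.

```php
<?php

// Decode the name that starts at $offset, following compression pointers.
// $depth counts the pointers followed so far; the limit of 16 is arbitrary.
function read_name(string $msg, int $offset, int $depth = 0): string {
    if ($depth > 16) {
        throw new RuntimeException('too many compression pointers');
    }
    $labels = [];
    while (true) {
        $len = ord($msg[$offset]);
        if ($len === 0) {
            break;                                  // root: end of the name
        }
        if (($len & 0xC0) === 0xC0) {
            // Pointer: the rest of the name lives elsewhere in the message.
            $pointer = (($len & 0x3F) << 8) | ord($msg[$offset + 1]);
            $rest = read_name($msg, $pointer, $depth + 1);
            if ($rest !== '') {
                $labels[] = $rest;
            }
            break;                                  // a pointer always ends the name
        }
        $labels[] = substr($msg, $offset + 1, $len); // ordinary label
        $offset += 1 + $len;
    }
    return implode('.', $labels);
}
```

Note that a pointer may follow some ordinary labels: a record for www.example.com can spell out www and then point at an earlier example.com, which is why the function collects labels before it jumps.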

UDP, and when it is not enough

DNS lives on UDP because a resolution has to be fast and lightweight: one packet out, one in, no handshake. The historical price was a 512-byte limit per response. When a response does not fit — many records, or a DNSSEC signature — two mechanisms come into play: EDNS0, which negotiates larger UDP packets, and the fallback to TCP, where size is no longer a problem. The practical rule remains: UDP for the vast majority of queries, TCP when the response is too big for a single packet.
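
Both mechanisms leave visible traces in the bytes. The sketch below shows them under stated assumptions: add_edns0 appends the OPT pseudo-record that EDNS0 uses to announce the UDP size the client accepts (the default of 1232 is a commonly suggested value, not something this article prescribes, and the function expects a query with no other additional records); is_truncated tests the TC bit that tells the client to retry over TCP; frame_for_tcp adds the two-byte length that precedes every DNS message on a TCP stream. All three function names are invented for this example.

```php
<?php

// Append an EDNS0 OPT pseudo-record to a query and bump the "additional" counter.
function add_edns0(string $query, int $udpSize = 1232): string {
    $h = unpack('nid/nflags/nqd/nan/nns/nar', substr($query, 0, 12));
    $header = pack('nnnnnn', $h['id'], $h['flags'], $h['qd'],
                   $h['an'], $h['ns'], $h['ar'] + 1);
    // OPT record: root name (0), type 41, the class field reused for the UDP
    // size, the 32-bit TTL field reused for extended flags (zero here), no data.
    $opt = pack('CnnNn', 0, 41, $udpSize, 0, 0);
    return $header . substr($query, 12) . $opt;
}

// TC bit set in the flags: the answer was cut short, retry over TCP.
function is_truncated(string $response): bool {
    $flags = unpack('n', substr($response, 2, 2))[1];
    return ($flags & 0x0200) !== 0;
}

// Over TCP every DNS message is preceded by its length as a 16-bit integer.
function frame_for_tcp(string $message): string {
    return pack('n', strlen($message)) . $message;
}
```

A client built on these pieces would send the EDNS0 query over UDP first and, only if is_truncated returns true, open a TCP connection to the same server on port 53 and send the framed query again.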


The zoo of records

DNS does not only translate names into addresses. Every name can have records of different types associated with it, each with its own purpose. The most important:

A      IPv4 address                         example.com -> 172.66.147.243
AAAA   IPv6 address                         google.com  -> 2607:f8b0:4001:c05::8b
CNAME  alias to another name                www.github.com -> github.com
MX     mail server, with priority           gmail.com   -> 5 gmail-smtp-in.l.google.com
NS     authoritative servers of the zone    com         -> a.gtld-servers.net
TXT    free text (SPF, DKIM, verification)  v=spf1 include:...
SOA    zone parameters (TTL, serial)        ...

Three of these tell a story. The CNAME is a pure alias: you ask for www.github.com and DNS answers "it is actually github.com, start again from there" — and that is exactly what happens on the wire, a response that contains first the CNAME and then the address of the name it points to.

The MX is the bridge to another protocol. When a mail server has to deliver an email, it is the MX record it looks for, not the A. The answer for gmail.com arrives with priorities:

gmail.com   MX   5   gmail-smtp-in.l.google.com
gmail.com   MX   10  alt1.gmail-smtp-in.l.google.com
gmail.com   MX   20  alt2.gmail-smtp-in.l.google.com

The lowest number takes precedence; the others are the fallbacks. It is the same redundancy that, in another article, I described as one of the reasons email is so resilient — and DNS is what makes it possible.
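
In code, the rule is a sort. This is a minimal sketch assuming the MX records have already been decoded into arrays with pref and host keys (names chosen for this example); it ignores the case of several servers sharing the same preference.

```php
<?php

// Order MX records the way a sending mail server would try them:
// lowest preference value first.
function order_mx(array $records): array {
    usort($records, fn (array $a, array $b): int => $a['pref'] <=> $b['pref']);
    return array_column($records, 'host');
}

$mx = [
    ['pref' => 20, 'host' => 'alt2.gmail-smtp-in.l.google.com'],
    ['pref' => 5,  'host' => 'gmail-smtp-in.l.google.com'],
    ['pref' => 10, 'host' => 'alt1.gmail-smtp-in.l.google.com'],
];
print_r(order_mx($mx));
```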

The TXT, finally, is the bag where the ecosystem stuffed everything it needed without inventing new formats: mail's anti-spoofing policies (SPF, DKIM, DMARC) live here, as text strings inside DNS. A protocol born to translate names became, over time, also the place where you declare who is allowed to speak on whose behalf.


The gigantic flaw

If you have followed this far, a question should already have surfaced. All of this travels over UDP, in the clear, with no signature whatsoever. When the resolver receives a response, how does it know it really comes from who it claims to come from?

The answer, in the original DNS, is uncomfortable: it does not know.

The only defence of the 1983 protocol was that random 16-bit ID in the header, plus the matching of the port. If an attacker managed to guess them and send a fake response before the real one, the resolver would accept it — and, worse, would cache it, serving every one of its users a bogus address. This is cache poisoning, and in 2008 Dan Kaminsky showed it was far easier to pull off than anyone thought. For a moment, the entire Internet was afraid.
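
To see how thin that defence is, here is roughly everything a client can check on a plain UDP reply. It is a sketch, assuming a query with a single question and nothing after it, like the one built earlier; the function name reply_matches_query is invented. The comparison of the question is byte for byte, which is stricter than a real implementation might want.

```php
<?php

// The checks a client can make before trusting a plain UDP reply.
function reply_matches_query(string $query, string $reply): bool {
    if (strlen($query) < 12 || strlen($reply) < strlen($query)) {
        return false;                    // too short to echo the question
    }
    $q = unpack('nid/nflags', $query);
    $r = unpack('nid/nflags', $reply);

    $isResponse   = ($r['flags'] & 0x8000) !== 0;   // QR bit set
    $sameId       = $r['id'] === $q['id'];          // the 16-bit identifier
    // The reply must repeat the question section right after its header.
    $sameQuestion = substr($reply, 12, strlen($query) - 12) === substr($query, 12);

    return $isResponse && $sameId && $sameQuestion;
}
```

Nothing in these checks proves who sent the packet. An attacker who knows the question being asked only has to hit the right one among 65,536 identifiers, and the right port, before the genuine answer arrives.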

And there is a second problem, quieter: privacy. Every name you resolve travels in the clear. Your provider, anyone sharing the café's wifi, anyone on the path, sees the complete list of the sites you visit — even when the real connection is then encrypted with HTTPS. The envelope is armoured, but the address on the envelope is written in marker pen.

As with SMTP, the solution was not to rewrite the protocol, but to build superstructures around it:

  1. DNSSEC tackles authenticity. Every set of records is cryptographically signed, and the signatures form a chain of trust that climbs all the way to the root: the authoritative zone signs its records, the TLD signs a fingerprint of that zone's key, the root signs a fingerprint of the TLD's key. If even a single link does not add up, the response is considered tampered with. It encrypts nothing (anyone can still read your queries), but it guarantees the answer is not forged.
  2. DNS over TLS (DoT) and DNS over HTTPS (DoH) tackle privacy, encrypting the transport between you and the resolver. Whoever sits in the middle no longer sees the names you ask for. The trade-off, much debated, is centralisation: funnelling DNS queries inside HTTPS towards a few large resolvers shifts an enormous amount of knowledge about who visits what into the hands of a few.

None of these is universal, and adoption is still partial. But the direction is clear: to turn a protocol born in an age of implicit trust into something that can survive in an age that has no trust left.
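
What makes DoH easy to adopt is that the message inside the HTTPS request is the same binary packet seen earlier. As a sketch, this is how a GET request could be prepared following RFC 8484: the query is encoded in base64url without padding and passed in a dns parameter. The function name doh_url is invented, the resolver URL is passed in by the caller, and nothing here sends the request or reads the reply.

```php
<?php

// Build an RFC 8484 GET URL asking for the A record of a name.
function doh_url(string $resolver, string $domain): string {
    // Same wire format as over UDP. RFC 8484 suggests an ID of 0 so that
    // identical questions produce identical, cacheable URLs.
    $query = pack('nnnnnn', 0, 0x0100, 1, 0, 0, 0);
    foreach (explode('.', rtrim($domain, '.')) as $label) {
        $query .= chr(strlen($label)) . $label;
    }
    $query .= "\x00" . pack('nn', 1, 1);

    // base64url: '+' and '/' replaced, trailing '=' padding removed.
    $encoded = rtrim(strtr(base64_encode($query), '+/', '-_'), '=');
    return $resolver . '?dns=' . $encoded;
}

echo doh_url('https://cloudflare-dns.com/dns-query', 'www.example.com'), "\n";
```

The request would then be sent with an Accept: application/dns-message header, and the body of the reply is an ordinary DNS response, to be decoded exactly like the one that came back over UDP.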


The silent substrate

DNS is the part of the Internet you think about least and depend on most.

It was born from a text file that did not scale, and it became one of the largest distributed systems ever built — with no centre, no owner, no single machine that has to know everything. Its whole elegance lies in that idea Mockapetris had in 1983: to break an impossible problem (knowing every name in the world) into a chain of trivial ones (knowing who to ask). Every node is ignorant of almost everything, and precisely for that reason the whole works.

Next time a site opens before you have even finished pressing enter, remember that behind that apparent nothing there was a little treasure hunt down an upside-down tree, concluded in a few milliseconds by a system that learned names when the computers connected to the world were counted in the hundreds.

The systems that last are not the cleverest ones. They are the ones that guessed the right shape — and then had the good sense not to touch it.