Design a URL shortener (the interview classic, done properly)
#system-design #architecture
The URL shortener is the "hello world" of system design — small enough to finish in 45 minutes, deep enough to show real trade-offs.
1. Requirements first, always
- Shorten a long URL → sho.rt/Ab3xK9
- Redirect fast (this is 99% of traffic)
- Scale: say 100M new URLs/month, 10B redirects/month
- Links live for years; custom aliases optional
2. Napkin math
- Writes: 100M / month ≈ 40 writes/sec
- Reads: 10B / month ≈ 4,000 reads/sec — a 100:1 read/write ratio
- Storage: 100M/month × ~500 bytes × 5 years ≈ 3 TB. Small!
The math tells you the shape: this is a read-heavy, cache-everything problem, not a big-data problem.
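The estimates above can be reproduced in a few lines. This is only a sketch of the arithmetic: the 30-day month is an assumption, and the variable names are illustrative.

```python
# Back-of-the-envelope numbers from the requirements.
# Assumption: a month is 30 days.
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000

new_urls_per_month = 100_000_000
redirects_per_month = 10_000_000_000
bytes_per_record = 500        # rough per-link size used in the estimate
retention_months = 5 * 12     # 5 years

writes_per_sec = new_urls_per_month / SECONDS_PER_MONTH
reads_per_sec = redirects_per_month / SECONDS_PER_MONTH
read_write_ratio = redirects_per_month / new_urls_per_month
storage_tb = new_urls_per_month * bytes_per_record * retention_months / 1e12

print(f"writes/sec ~ {writes_per_sec:.0f}")    # 39, rounded to 40 in the text
print(f"reads/sec  ~ {reads_per_sec:.0f}")     # 3858, rounded to 4,000
print(f"read:write = {read_write_ratio:.0f}:1")
print(f"storage    ~ {storage_tb:.1f} TB")
```

These are averages; real traffic has peaks, so it is worth leaving headroom above them.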
3. The short code
Base62-encode a unique ID ([a-zA-Z0-9]). 7 characters gives 62⁷ ≈ 3.5 trillion combinations, which at 100M new URLs/month lasts nearly 3,000 years.
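A minimal sketch of the encoding step, assuming the ID is a non-negative integer. The alphabet order (digits, then lowercase, then uppercase) is an arbitrary choice made here, and the function names are illustrative.

```python
import string

# 62 symbols; the ordering is an assumption, any fixed order works.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62


def encode_base62(n: int) -> str:
    """Turn a non-negative integer ID into a base62 short code."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))  # most significant digit first


def decode_base62(code: str) -> int:
    """Inverse of encode_base62; raises ValueError on unknown characters."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


print(encode_base62(62**7 - 1))  # largest 7-character code: ZZZZZZZ
```

Small IDs produce short codes with this scheme; if a fixed 7-character length matters, one option is to start the ID space at 62⁶.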
Where does the ID come from? A dedicated ID range allocator: each app server leases a block of IDs (say 100k) from a coordinator, then hands them out locally with zero contention. There are no hash collisions to handle and no hot counter.
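The leasing idea can be sketched as follows. `Coordinator` and `LocalIdAllocator` are hypothetical names, and the in-process coordinator is a stand-in: in a real deployment the block counter would live in some shared, durable store.

```python
import threading


class Coordinator:
    """Stand-in for the central coordinator (hypothetical, in-memory only)."""

    def __init__(self, block_size: int = 100_000):
        self._next_start = 0
        self._block_size = block_size
        self._lock = threading.Lock()

    def lease_block(self) -> range:
        # The only contended operation: one call per block, not per ID.
        with self._lock:
            start = self._next_start
            self._next_start += self._block_size
        return range(start, start + self._block_size)


class LocalIdAllocator:
    """Runs on each app server and hands out IDs from its leased block."""

    def __init__(self, coordinator: Coordinator):
        self._coordinator = coordinator
        self._lock = threading.Lock()
        self._block = iter(())  # empty until the first lease

    def next_id(self) -> int:
        with self._lock:
            try:
                return next(self._block)
            except StopIteration:
                # Block exhausted (or never leased): fetch a fresh one.
                self._block = iter(self._coordinator.lease_block())
                return next(self._block)
```

If a server crashes, the unused remainder of its block is simply lost. With a 3.5 trillion code space, that waste is affordable.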
4. Architecture
- Redirects: HTTP 301 (permanent, cacheable by browsers) if analytics don't matter; HTTP 302 (temporary) if they do, because you want every hit to reach your servers.
- DB: anything key-value shaped works at 3 TB — DynamoDB fits perfectly (key = code, value = URL, TTL for expiry).
- Cache: traffic is heavily skewed; as a rule of thumb, the top 20% of links serve ~90% of traffic. A small Redis in front of the DB absorbs the bulk of reads.
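The read path described in this list might look like the sketch below. Plain dicts stand in for Redis and the database so the example runs anywhere; `RedirectService` and its methods are hypothetical names, not any particular framework's API.

```python
from typing import Dict, Optional, Tuple


class RedirectService:
    """Cache-aside lookup plus the 301/302 choice (illustrative only)."""

    def __init__(self, db: Dict[str, str], cache: Dict[str, str],
                 track_clicks: bool):
        self.db = db              # stand-in for the key-value store
        self.cache = cache        # stand-in for Redis
        self.track_clicks = track_clicks
        self.db_reads = 0         # counts cache misses that reached the DB

    def lookup(self, code: str) -> Optional[str]:
        url = self.cache.get(code)
        if url is not None:
            return url            # cache hit: the DB is never touched
        self.db_reads += 1
        url = self.db.get(code)
        if url is not None:
            self.cache[code] = url  # populate the cache for the next reader
        return url

    def handle(self, code: str) -> Tuple[int, Dict[str, str]]:
        url = self.lookup(code)
        if url is None:
            return 404, {}
        # 302 keeps every click visible to us; 301 lets browsers cache it.
        status = 302 if self.track_clicks else 301
        return status, {"Location": url}


svc = RedirectService(db={"Ab3xK9": "https://example.com/a/long/path"},
                      cache={}, track_clicks=True)
print(svc.handle("Ab3xK9"))  # first call misses the cache and reads the DB
print(svc.handle("Ab3xK9"))  # second call is served from the cache
```

A real cache would also need an eviction policy and a TTL, which this sketch leaves out.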
5. What breaks at scale (the discussion that gets you hired)
- Hot links (a viral tweet) → per-node in-memory cache in front of Redis
- Analytics at 4k rps → don't write per-click rows synchronously; push events to a queue (Kinesis/SQS) and aggregate
- Abuse → rate-limit creation, blocklist scanning on write
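To illustrate the analytics point, here is a minimal sketch of taking click recording off the request path. An in-process `queue.Queue` and a worker thread stand in for Kinesis/SQS and a downstream consumer; `ClickAggregator` is a hypothetical name.

```python
import queue
import threading
from collections import Counter


class ClickAggregator:
    """Buffers click events and aggregates them off the request path."""

    def __init__(self) -> None:
        self._events: queue.Queue = queue.Queue()  # stand-in for Kinesis/SQS
        self.counts: Counter = Counter()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def record(self, code: str) -> None:
        # Called from the redirect handler: enqueue and return immediately.
        self._events.put(code)

    def _drain(self) -> None:
        # Consumer side: aggregate per-code counts instead of per-click rows.
        while True:
            code = self._events.get()
            if code is None:  # sentinel: stop the worker
                break
            self.counts[code] += 1

    def close(self) -> None:
        # Flush everything queued so far, then stop the worker.
        self._events.put(None)
        self._worker.join()


agg = ClickAggregator()
for _ in range(3):
    agg.record("Ab3xK9")
agg.close()
print(agg.counts["Ab3xK9"])  # 3
```

The redirect handler only pays for an enqueue. The trade-off is that counts lag slightly behind, and events buffered in memory are lost if the process dies, which is one reason to use a durable queue in production.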
The lesson that generalizes: do the napkin math before drawing boxes — it tells you which problems are real and which are imaginary.