Design a URL Shortener (TinyURL)

This is one of the most common system design interview questions. It seems simple but covers a surprisingly wide set of topics: hashing, databases, caching, and scalability.

Step 1: Clarify Requirements

Functional requirements:

  • Given a long URL, generate a short URL (e.g., tiny.url/abc1234)
  • Redirect users from short URL to original long URL
  • Optional: Custom aliases, link expiration, analytics

Non-functional requirements:

  • 100M short URLs created per day
  • 10B redirects per day (100:1 read/write ratio, extremely read-heavy)
  • Short URLs should be unique and hard to guess
  • Redirect should be fast (< 10ms P99)
  • System should be highly available (99.99% uptime)

Step 2: Estimate Scale

Writes: 100M URLs/day = ~1,200 writes/sec
Reads: 10B redirects/day = ~115,000 reads/sec

URL size:
- Long URL: ~2 KB average
- Short URL key: 7 characters = 7 bytes
- Metadata (user, timestamp, expiry): ~500 bytes
- Total per URL: ~2.5 KB

Storage for 10 years:
- 100M URLs/day × 365 × 10 × 2.5 KB ≈ 900 TB
(With 3× replication: ~3 PB, so distributed storage is needed)

Bandwidth:
- Write: 1,200 × 2.5 KB = 3 MB/sec
- Read: 115,000 × 500 bytes (just the redirect) = 57 MB/sec
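
The arithmetic above can be checked with a few lines of JavaScript. This is only a sketch of the estimate: the variable names are illustrative, and it assumes decimal units (1 KB = 1,000 bytes) and 365-day years, which the figures above appear to use.

```javascript
// Back-of-the-envelope check of the scale estimates.
const SECONDS_PER_DAY = 24 * 60 * 60; // 86,400

const writesPerDay = 100e6;  // 100M new URLs per day
const readsPerDay = 10e9;    // 10B redirects per day
const bytesPerRecord = 2500; // ~2.5 KB per stored URL (assumed decimal KB)

const writesPerSec = writesPerDay / SECONDS_PER_DAY;
const readsPerSec = readsPerDay / SECONDS_PER_DAY;

// Ten years of records, before replication.
const storageBytes = writesPerDay * 365 * 10 * bytesPerRecord;
const storageTB = storageBytes / 1e12;

console.log(Math.round(writesPerSec), 'writes/sec');
console.log(Math.round(readsPerSec), 'reads/sec');
console.log(storageTB, 'TB over 10 years');
```

The exact results (about 1,157 writes/sec, 115,741 reads/sec, and 912.5 TB) round to the figures quoted above.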

Step 3: High-Level Design

               ┌─────────────────┐
               │  Load Balancer  │
               └────────┬────────┘
                        │
      ┌─────────────────┼─────────────────┐
      │                 │                 │
┌─────▼─────┐     ┌─────▼─────┐     ┌─────▼─────┐
│    URL    │     │    URL    │     │    URL    │
│  Service  │     │  Service  │     │  Service  │
└─────┬─────┘     └─────┬─────┘     └─────┬─────┘
      │                 │                 │
      └─────────────────┼─────────────────┘
                        │
               ┌────────▼────────┐
               │   Redis Cache   │
               │   (hot URLs)    │
               └────────┬────────┘
                        │ cache miss
               ┌────────▼────────┐
               │    Database     │
               │   (URL store)   │
               └─────────────────┘

API Design

POST /api/v1/shorten
Body: { longUrl: "https://...", customAlias?: "mylink", expiresAt?: "2025-12-31" }
Response: { shortUrl: "https://tiny.url/abc1234" }

GET /:shortCode
Response: 301/302 redirect to longUrl

DELETE /api/v1/:shortCode
Response: 200 OK

GET /api/v1/:shortCode/stats
Response: { clicks: 1234, createdAt: "...", ... }

301 vs 302 redirect:

  • 301 Permanent: Browser caches the redirect; reduces server load, but clients that cached it won't see a changed destination and their repeat visits aren't counted
  • 302 Temporary: Browser asks the server on every visit; allows analytics tracking and URL updates
  • For a URL shortener, 302 is usually preferred (lets you track clicks)
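
As a rough illustration of the 302 choice, here is a minimal redirect handler in the shape Node's built-in `http` module expects. The `urls` map and the `handleRedirect` name are assumptions made for this sketch; in the real design the lookup would go through the cache and database.

```javascript
// Hypothetical in-memory mapping standing in for the cache/database lookup.
const urls = new Map([['abc1234', 'https://example.com/some/long/path']]);

function handleRedirect(req, res) {
  // Strip the leading "/" and any query string to get the short code.
  const shortCode = req.url.slice(1).split('?')[0];
  const longUrl = urls.get(shortCode);

  if (!longUrl) {
    res.writeHead(404, { 'Content-Type': 'text/plain' });
    res.end('Not found');
    return;
  }

  // 302 keeps every click flowing through the service so it can be counted.
  res.writeHead(302, { Location: longUrl });
  res.end();
}

// To serve it locally:
// require('http').createServer(handleRedirect).listen(8080);
```

Switching to a permanent redirect would only mean changing the status code to 301, at the cost of losing sight of repeat visits from browsers that cached it.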

Step 4: Key Design Decisions

Generating Short Codes

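You need a 7-character code drawn from [a-zA-Z0-9], an alphabet of 62 characters.

62^7 ≈ 3.5 trillion possible codes, which is more than enough: at 100M new URLs per day, 10 years of writes uses about 365 billion of them.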

Option A: Hash + Encode (Recommended)

const crypto = require('crypto');

function generateShortCode(longUrl) {
  // SHA-256 hash of the URL, as a hex string
  const hash = crypto.createHash('sha256').update(longUrl).digest('hex');

  // Take the first 7 characters of the base62-encoded hash,
  // left-padding with '0' in the unlikely case the encoding is shorter
  return base62Encode(hash).padStart(7, '0').substring(0, 7);
}

function base62Encode(hex) {
  const chars = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
  let num = BigInt('0x' + hex);
  let result = '';
  while (num > 0n) {
    result = chars[Number(num % 62n)] + result;
    num = num / 62n; // BigInt division truncates toward zero
  }
  return result || '0'; // an all-zero input encodes to '0'
}

Problem: Hash collisions. Two different long URLs could produce the same 7-character code.

Solution: Check if the code exists in the DB. If it does, append a counter and rehash.
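
One way to picture the check-and-rehash loop is the sketch below. It is an illustration under assumptions rather than a reference implementation: `store` is a plain `Map` standing in for the database, and `hashToCode`, `shortenWithRetry`, and the `#<attempt>` suffix format are names and conventions invented for this example.

```javascript
const crypto = require('crypto');

const BASE62 = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Base62-encode the SHA-256 digest of `input` and keep the first `length` characters.
function hashToCode(input, length = 7) {
  const hex = crypto.createHash('sha256').update(input).digest('hex');
  let num = BigInt('0x' + hex);
  let out = '';
  while (num > 0n) {
    out = BASE62[Number(num % 62n)] + out;
    num /= 62n;
  }
  return out.padStart(length, '0').slice(0, length);
}

// `store` is a hypothetical stand-in for the database: Map of short code -> long URL.
function shortenWithRetry(longUrl, store, maxAttempts = 5) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // Attempt 0 hashes the bare URL; later attempts append the counter and rehash.
    const input = attempt === 0 ? longUrl : `${longUrl}#${attempt}`;
    const code = hashToCode(input);
    const existing = store.get(code);

    if (existing === undefined) {
      store.set(code, longUrl);
      return code;
    }
    if (existing === longUrl) {
      return code; // the same URL was shortened before; reuse its code
    }
    // Otherwise a different URL owns this code: try the next counter value.
  }
  throw new Error('Could not find a free short code');
}
```

Against a real database, a separate "check, then insert" leaves a window in which two writers can both pass the check. Relying on a unique constraint on the short code and retrying when the insert is rejected avoids that race.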

Option B: Counter + Encode

Maintain a global counter. Each URL gets the next counter value, encoded in base62.

Counter: 1000000
Base62: 4c92
Short code: 4c92

Problem: Single point of failure for the counter. Predictable codes (security risk).

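Solution: Use distributed unique ID generation (e.g., Snowflake IDs). Note that a 64-bit ID encodes to as many as 11 base62 characters rather than 7, and time-ordered IDs remain fairly predictable, so this addresses the single point of failure more than the guessability.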
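
The counter-to-code step can be sketched as follows. The function names are illustrative, and the alphabet order (digits, then lowercase, then uppercase) is assumed to match the one used in Option A, which is what produces `4c92` for one million.

```javascript
const BASE62 = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Encode a non-negative integer counter as a base62 string.
function encodeCounter(n) {
  if (!Number.isSafeInteger(n) || n < 0) {
    throw new RangeError('counter must be a non-negative safe integer');
  }
  if (n === 0) return '0';
  let out = '';
  while (n > 0) {
    out = BASE62[n % 62] + out;
    n = Math.floor(n / 62);
  }
  return out;
}

// Decode a base62 string back to the counter value.
function decodeCode(code) {
  let n = 0;
  for (const ch of code) {
    const digit = BASE62.indexOf(ch);
    if (digit === -1) throw new Error(`invalid base62 character: ${ch}`);
    n = n * 62 + digit;
  }
  return n;
}

console.log(encodeCounter(1000000)); // "4c92"
console.log(decodeCode('4c92'));     // 1000000
```

Because consecutive counters give consecutive codes, anyone can enumerate neighbouring links, which is the predictability problem noted above.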

Option C: Pre-generated Key Pool

Pre-generate millions of random 7-character codes offline, store them in a "Key DB." Service picks one when needed.

┌──────────────┐          ┌──────────────┐
│ Key          │  fetch   │              │
│ Generator    │ ──────►  │  Key DB      │
│ (offline)    │          │  (unused)    │
└──────────────┘          └──────┬───────┘
                                 │ assign
                          ┌──────▼───────┐
                          │  URL DB      │
                          │  (used)      │
                          └──────────────┘

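This is the cleanest approach for high-scale systems: every code is unique by construction, so the write path needs no collision check, as long as the Key DB hands out each key exactly once (for example, by giving each service instance its own batch).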
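
A small in-memory sketch of the key pool idea follows. It assumes a `Set` can stand in for the Key DB and that keys are handed to service instances in batches; `randomCode`, `generateKeyPool`, and `takeBatch` are hypothetical names, not part of any library.

```javascript
const crypto = require('crypto');

const BASE62 = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Generate one random code; crypto.randomInt picks each index without modulo bias.
function randomCode(length = 7) {
  let code = '';
  for (let i = 0; i < length; i++) {
    code += BASE62[crypto.randomInt(62)];
  }
  return code;
}

// Offline step: fill the pool with `count` distinct unused codes.
// The Set silently drops duplicates, so the loop runs until enough unique codes exist.
function generateKeyPool(count) {
  const pool = new Set();
  while (pool.size < count) {
    pool.add(randomCode());
  }
  return pool;
}

// Online step: move up to `batchSize` keys out of the unused pool for one server.
function takeBatch(pool, batchSize) {
  const batch = [];
  for (const key of pool) {
    if (batch.length === batchSize) break;
    batch.push(key);
    pool.delete(key); // a key leaves the pool the moment it is handed out
  }
  return batch;
}

const pool = generateKeyPool(1000);
const batch = takeBatch(pool, 100);
console.log(batch.length, pool.size); // 100 900
```

In a real deployment the removal from the pool would need to be atomic across servers. One trade-off of batching is that keys held by an instance that crashes are simply lost, which is affordable with 3.5 trillion codes available.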


Database Design

-- URL table
CREATE TABLE urls (
    id          BIGINT PRIMARY KEY AUTO_INCREMENT,
    short_code  VARCHAR(10) UNIQUE NOT NULL,
    long_url    TEXT NOT NULL,
    user_id     BIGINT,
    expires_at  TIMESTAMP NULL,
    created_at  TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    click_count BIGINT DEFAULT 0
);

-- No separate index on short_code is needed: the UNIQUE constraint already creates one.

Database choice:

  • Use a NoSQL key-value store (like Cassandra or DynamoDB) for the main URL table, since the access pattern is a simple key lookup
  • The key is short_code, value is long_url + metadata
  • High write volume (1,200/sec) is handled well by Cassandra's write-optimized design

Caching Strategy

Traffic to short links typically follows a power-law distribution (roughly 80% of requests go to 20% of URLs), so caching is extremely effective.

Cache design:

  • Use Redis with LRU eviction policy
  • Cache key: short_code, value: long_url
  • Cache hit rate target: 95%+
  • TTL: 24 hours, or the time remaining until the URL expires if that is shorter

Lookup flow:

1. Check Redis cache
   → Hit: Return long_url, 302 redirect (< 1ms)
   → Miss: Query database, cache result, redirect (< 10ms)

With 95% cache hit rate:

  • 115,000 reads/sec × 0.95 = 109,250 served from cache
  • 115,000 × 0.05 = 5,750 DB queries/sec (easily handled)
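
To make the cache-aside flow and LRU eviction concrete, here is a toy in-process version. It is a sketch only: Redis handles eviction itself when configured with an LRU policy, and `LruCache`, `lookup`, the `db` map, and the `stats` object are all illustrative stand-ins.

```javascript
// Minimal LRU cache. A Map iterates keys in insertion order,
// so the first key is always the least recently used one.
class LruCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key); // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      const oldest = this.map.keys().next().value;
      this.map.delete(oldest); // evict the least recently used entry
    }
  }
}

// Cache-aside lookup: try the cache, fall back to the database, then populate the cache.
// `db` is a hypothetical Map standing in for the URL store.
function lookup(shortCode, cache, db, stats) {
  const cached = cache.get(shortCode);
  if (cached !== undefined) {
    stats.hits++;
    return cached;
  }
  stats.misses++;
  const longUrl = db.get(shortCode);
  if (longUrl !== undefined) cache.set(shortCode, longUrl);
  return longUrl;
}
```

With a heavily skewed workload, even a cache much smaller than the full data set serves most lookups, which is what the 95% target relies on.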

Handling Expiration

// Assumes `db`, `trackClick`, and a node-redis v4-style `redis` client are defined elsewhere.

// When creating a URL
async function createShortUrl(longUrl, expiresAt) {
  const shortCode = generateShortCode(longUrl);

  await db.insert({
    short_code: shortCode,
    long_url: longUrl,
    expires_at: expiresAt
  });

  if (expiresAt) {
    // Set Redis TTL to match expiration; EX must be a positive integer
    const ttlSeconds = Math.floor((new Date(expiresAt).getTime() - Date.now()) / 1000);
    if (ttlSeconds > 0) {
      await redis.set(shortCode, longUrl, { EX: ttlSeconds });
    }
  } else {
    await redis.set(shortCode, longUrl);
  }

  return shortCode;
}

// When resolving a URL
async function resolveUrl(shortCode) {
  // Check cache first
  let longUrl = await redis.get(shortCode);

  if (!longUrl) {
    const record = await db.findOne({ short_code: shortCode });
    if (!record || (record.expires_at && record.expires_at < new Date())) {
      throw new Error('URL not found or expired');
    }
    longUrl = record.long_url;

    // Cache for up to an hour, but never past the URL's own expiration
    let ttlSeconds = 3600;
    if (record.expires_at) {
      const remaining = Math.floor((new Date(record.expires_at).getTime() - Date.now()) / 1000);
      ttlSeconds = Math.min(ttlSeconds, remaining);
    }
    if (ttlSeconds > 0) {
      await redis.set(shortCode, longUrl, { EX: ttlSeconds });
    }
  }

  // Track click asynchronously (don't block the redirect)
  trackClick(shortCode).catch(console.error);

  return longUrl;
}

Analytics (Bonus Deep Dive)

Track click analytics without impacting redirect latency:

[Redirect Service]
        │ async
        ▼
[Kafka Topic: url_clicks]
        │
        ▼
[Analytics Consumer]
        │
        ▼
[ClickHouse / Data Warehouse]

Each click event:

{
  "short_code": "abc1234",
  "timestamp": "2024-01-15T10:30:00Z",
  "ip_address": "1.2.3.4",
  "user_agent": "Mozilla/5.0...",
  "referer": "https://google.com"
}
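
One way to keep click tracking off the redirect path is to buffer events in memory and hand them off in batches. The sketch below assumes a caller-supplied `sendBatch` function (for example, a wrapper around a Kafka producer); `ClickBuffer` and the shape of the `req` argument are hypothetical and exist only for this illustration.

```javascript
// Collects click events in memory and passes them to `sendBatch` in batches.
class ClickBuffer {
  constructor(sendBatch, maxBatchSize = 100) {
    this.sendBatch = sendBatch;
    this.maxBatchSize = maxBatchSize;
    this.events = [];
  }

  // Called on the redirect path: only builds an object and pushes it.
  record(shortCode, req) {
    this.events.push({
      short_code: shortCode,
      timestamp: new Date().toISOString(),
      ip_address: req.ip,
      user_agent: req.userAgent,
      referer: req.referer,
    });
    if (this.events.length >= this.maxBatchSize) this.flush();
  }

  // Hands the current batch to the sink. Failures are logged, never rethrown,
  // so a broken analytics pipeline cannot break redirects.
  flush() {
    if (this.events.length === 0) return;
    const batch = this.events;
    this.events = [];
    const logDrop = (err) => console.error('click batch dropped:', err);
    try {
      Promise.resolve(this.sendBatch(batch)).catch(logDrop);
    } catch (err) {
      logDrop(err);
    }
  }
}
```

A production version would also flush on a timer so that low-traffic periods do not leave events sitting in memory, and would accept that events buffered at the moment of a crash are lost.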

Final Architecture

           ┌───────────────────────────────────────┐
           │               DNS / CDN               │
           └───────────────────┬───────────────────┘
                               │
           ┌───────────────────▼───────────────────┐
           │             Load Balancer             │
           │           (HAProxy / NGINX)           │
           └───────────────────┬───────────────────┘
                               │
         ┌─────────────────────┼─────────────────────┐
         │                     │                     │
┌────────▼────────┐   ┌────────▼────────┐   ┌────────▼────────┐
│    URL Write    │   │    URL Read     │   │    URL Read     │
│     Service     │   │     Service     │   │     Service     │
└────────┬────────┘   └────────┬────────┘   └────────┬────────┘
         │                     │                     │
         │            ┌────────▼─────────────────────▼────────┐
         │            │         Redis Cluster (Cache)         │
         │            └───────────────────┬───────────────────┘
         │                                │ miss
┌────────▼────────────────────────────────▼───────────────────┐
│                      Cassandra Cluster                      │
│                    (URL Key-Value Store)                    │
└──────────────────────────────┬──────────────────────────────┘
                               │
                      ┌────────▼────────┐
                      │      Kafka      │
                      │ (click events)  │
                      └────────┬────────┘
                               │
                      ┌────────▼────────┐
                      │   ClickHouse    │
                      │   (analytics)   │
                      └─────────────────┘

Interview Follow-up Questions

Q: How would you handle 10× traffic growth?

Add more URL service instances behind the load balancer, expand the Redis cluster, and add Cassandra nodes. The system is horizontally scalable at every layer.

Q: How do you prevent abuse (spam/phishing)?

Validate submitted URLs against a blocklist, rate limit by IP/user, and integrate with a safe browsing API (e.g., Google Safe Browsing).
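
The first two measures can be sketched in a few lines. This assumes a single process with in-memory state, a fixed-window limit, and a simple host-based blocklist; `RateLimiter`, `isBlocked`, and `blockedHosts` are hypothetical names, and a multi-instance deployment would need shared state (for example, counters in Redis).

```javascript
// Fixed-window rate limiter keyed by client (IP address or user ID).
class RateLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.windows = new Map(); // key -> { start, count }
  }

  // Returns true if the request is allowed. `now` is injectable for testing.
  allow(key, now = Date.now()) {
    const w = this.windows.get(key);
    if (!w || now - w.start >= this.windowMs) {
      this.windows.set(key, { start: now, count: 1 }); // start a new window
      return true;
    }
    if (w.count < this.limit) {
      w.count++;
      return true;
    }
    return false;
  }
}

// Blocklist check on the destination host. `blockedHosts` is a hypothetical Set.
function isBlocked(longUrl, blockedHosts) {
  let host;
  try {
    host = new URL(longUrl).hostname;
  } catch {
    return true; // treat unparseable URLs as rejected
  }
  return blockedHosts.has(host);
}

const limiter = new RateLimiter(2, 1000); // 2 requests per second per client
console.log(limiter.allow('1.2.3.4', 0));   // true
console.log(limiter.allow('1.2.3.4', 1));   // true
console.log(limiter.allow('1.2.3.4', 2));   // false
console.log(limiter.allow('1.2.3.4', 1000)); // true (new window)
```

A fixed window allows short bursts at window boundaries; a sliding window or token bucket smooths that out at the cost of a little more bookkeeping.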

Q: How would you add custom domains?

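Allow users to CNAME their domain to your service and store the domain in the URL record. DNS only directs traffic to the service; the service then uses the request's Host header to look up the short code within that domain, and it needs a TLS certificate for each custom domain.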
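
A minimal sketch of a domain-aware lookup, assuming links are keyed by domain plus short code. The `links` map, the `go.acme.example` domain, and `resolveForHost` are invented for this example.

```javascript
// Hypothetical store keyed by "<domain>/<shortCode>".
const links = new Map([
  ['tiny.url/abc1234', 'https://example.com/a'],
  ['go.acme.example/abc1234', 'https://acme.example/pricing'],
]);

// Resolve a short code in the namespace of the requested host.
function resolveForHost(hostHeader, path) {
  // Drop any ":port" suffix and normalize case (IPv6 literals are not handled here).
  const domain = hostHeader.split(':')[0].toLowerCase();
  const shortCode = path.replace(/^\//, '');
  return links.get(`${domain}/${shortCode}`);
}

console.log(resolveForHost('tiny.url', '/abc1234'));            // https://example.com/a
console.log(resolveForHost('GO.ACME.example:443', '/abc1234')); // https://acme.example/pricing
```

Keying by domain lets the same short code mean different things on different custom domains, so customers can pick aliases without colliding with each other.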

Q: How do you handle global distribution?

Deploy the read service in multiple regions (US, EU, Asia), replicate Cassandra across regions, and use anycast DNS to route users to the nearest datacenter.