Design Netflix / Video Streaming
Netflix streams to 240M subscribers in 190+ countries. Video streaming is uniquely challenging because content is large (GB to TB per title), bandwidth is expensive, and users expect instant playback with no buffering.
Requirements
Functional:
- Upload and transcode videos
- Stream videos to users on different devices and network conditions
- Search and browse content catalog
- Recommendation system
- Resume playback across devices
Non-functional:
- 240M subscribers, 100M+ hours streamed daily
- < 2 second startup time for videos
- Support adaptive bitrate streaming (buffer-free on slow connections)
- 99.99% availability
Scale Estimation
Content library: ~15,000 titles
Each title stored at 5 resolutions × 3 codecs = 15 versions
Average movie file: ~5 GB → 15 versions = ~75 GB per title
Total storage: 15,000 × 75 GB = ~1.1 PB (encoded versions alone, before CDN replication)
Streaming:
- 100M hours/day ÷ 24 hours = ~4.2M concurrent streams on average (peaks run higher)
- Average bitrate: 8 Mbps
- Bandwidth: 4.2M × 8 Mbps = ~33 Tbps on average
This is why Netflix accounts for roughly 15% of global downstream internet traffic.
Architecture Overview
Netflix architecture has three major components:
1. Content Ingestion Pipeline
Content Creator uploads master file
    │
    ▼
[Upload Service] → S3 (raw storage)
    │
    ▼
[Transcoding Service]
    ├── Encode to H.264 (broad device support)
    ├── Encode to H.265/HEVC (50% better compression)
    ├── Encode to AV1 (best compression, newer devices)
    │
    ├── Resolution variants: 4K, 1080p, 720p, 480p, 360p
    │
    └── Split into 4-second chunks (for ABR streaming)
    │
    ▼
[CDN Upload] → Distribute chunks to 1,000+ CDN edge nodes
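The transcoding fan-out (codecs × resolutions, then chunking) can be sketched as a job enumerator. `buildRenditions` and its output shape are illustrative assumptions, not Netflix's actual pipeline API:

```javascript
// Enumerate every encode job for one title: 3 codecs x 5 resolutions.
const CODECS = ['h264', 'h265', 'av1'];
const RESOLUTIONS = ['4k', '1080p', '720p', '480p', '360p'];

function buildRenditions(titleId) {
  const jobs = [];
  for (const codec of CODECS) {
    for (const resolution of RESOLUTIONS) {
      // Each rendition is later split into 4-second chunks for ABR
      jobs.push({ titleId, codec, resolution, chunkSeconds: 4 });
    }
  }
  return jobs; // 15 encode jobs per title
}
```

Each job can then be fanned out to a worker pool, which is what makes the 15-versions-per-title math above tractable.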
2. Adaptive Bitrate Streaming (ABR)
The secret to buffer-free streaming is serving different quality levels based on network conditions.
How ABR works:
1. Client downloads a manifest file (an HLS M3U8 playlist or a DASH MPD):
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
/movie/360p/chunk_000.ts
#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1280x720
/movie/720p/chunk_000.ts
#EXT-X-STREAM-INF:BANDWIDTH=8000000,RESOLUTION=1920x1080
/movie/1080p/chunk_000.ts
2. Client measures the download speed of each chunk as it arrives
3. If bandwidth > 8 Mbps → switch to 1080p for the next chunk
   If bandwidth is 3-8 Mbps → switch to 720p
   If bandwidth < 3 Mbps → drop to 360p
4. Switches happen at chunk boundaries, so the user doesn't notice quality changes
class ABRController {
  constructor() {
    this.tracks = [];           // quality variants, sorted by bandwidth
    this.currentTrackIndex = 0;
    this.bufferLength = 0;      // seconds of video buffered ahead
  }

  async loadManifest(url) {
    const manifest = await fetch(url).then(r => r.text());
    this.tracks = parseM3U8(manifest)
      .sort((a, b) => a.bandwidth - b.bandwidth);
    return this.tracks;
  }

  selectQuality(downloadSpeed, bufferLength) {
    // If the buffer is low, be conservative: drop quality to refill faster
    if (bufferLength < 10) {
      return Math.max(0, this.currentTrackIndex - 1);
    }
    // Find the highest quality that fits the available bandwidth,
    // using 80% of the measured speed to account for variance
    const targetBandwidth = downloadSpeed * 0.8;
    let selectedIndex = 0;
    for (let i = 0; i < this.tracks.length; i++) {
      if (this.tracks[i].bandwidth <= targetBandwidth) {
        selectedIndex = i;
      }
    }
    return selectedIndex;
  }

  async downloadNextChunk() {
    const track = this.tracks[this.currentTrackIndex];
    const chunkUrl = track.nextChunkUrl();
    const start = performance.now();
    const response = await fetch(chunkUrl);
    const chunk = await response.arrayBuffer();
    const elapsed = (performance.now() - start) / 1000;       // seconds
    const downloadSpeed = (chunk.byteLength * 8) / elapsed;   // bits/sec, from actual bytes
    this.currentTrackIndex = this.selectQuality(downloadSpeed, this.bufferLength);
    return chunk;
  }
}
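The controller above assumes a `parseM3U8` helper. A minimal sketch that extracts only the `#EXT-X-STREAM-INF` variants from a master playlist (far from a full HLS parser):

```javascript
// Parse a master playlist into { bandwidth, url } variant entries.
// Handles only #EXT-X-STREAM-INF lines, not the full HLS spec.
function parseM3U8(text) {
  const lines = text.split('\n').map(l => l.trim());
  const tracks = [];
  for (let i = 0; i < lines.length; i++) {
    const m = lines[i].match(/^#EXT-X-STREAM-INF:.*BANDWIDTH=(\d+)/);
    if (m) {
      // The URI for a variant is the line immediately following its tag
      tracks.push({ bandwidth: parseInt(m[1], 10), url: lines[i + 1] });
    }
  }
  return tracks;
}
```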
3. Content Delivery Network (CDN)
Netflix's CDN (Open Connect) is the key to its performance:
User in London requests "Stranger Things"
    │
    ▼
DNS resolves to nearest Open Connect Appliance (OCA)
    │
    ▼
OCA (inside ISP data centers, e.g., BT, Virgin Media)
    │
    ├── Cache hit? → Serve directly (a few ms away)
    └── Cache miss? → Pull from Netflix origin → cache → serve
Netflix actually ships OCA hardware to ISPs for free. By co-locating content inside ISPs, they avoid expensive backbone bandwidth and reduce latency dramatically.
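The steering decision can be sketched as a preference order: an OCA inside the user's ISP, then a regional node, then origin. The data shapes (`isp`, `region`, `hasTitle`) are illustrative assumptions, not Open Connect's real control plane:

```javascript
// Pick the best server for a client: in-ISP OCA > regional OCA > origin.
function selectServer(client, ocas) {
  const inIsp = ocas.find(o => o.isp === client.isp && o.hasTitle);
  if (inIsp) return inIsp;                                   // best case: co-located in the ISP
  const regional = ocas.find(o => o.region === client.region && o.hasTitle);
  return regional || { id: 'origin' };                       // last resort: Netflix origin
}
```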
Cache warming: Popular titles are pre-pushed to edge nodes overnight (off-peak hours):
async function warmCacheForNewRelease(titleId) {
  const popularRegions = await getRegionsByPopularity();
  const allEdgeNodes = await getEdgeNodesForRegions(popularRegions);

  // Pre-push the top 3 quality levels for the first 30 minutes of content
  const files = await generateWarmingList(titleId, {
    qualities: ['1080p', '720p', '480p'],
    durationSeconds: 1800, // 30 minutes
  });

  await Promise.all(allEdgeNodes.map(node => node.prefetch(files)));
}
Video Encoding Pipeline
Netflix processes thousands of hours of content per day.
Upload (raw 4K ProRes master)
    │
    ▼
[Scene Analysis] → Detect dark scenes, action sequences, credits
    │
    ▼
[Per-Scene Encoding] → Allocate more bits to complex scenes
    │   ("per-title encoding" + "per-shot encoding")
    ▼
[Quality Validation] → VMAF score check (perceptual quality metric)
    │
    ▼
[Manifest Generation] → Create HLS/DASH playlists
    │
    ▼
[CDN Distribution] → Push to edge nodes
Netflix's innovation: Per-Title/Per-Shot Encoding
Instead of encoding every title at fixed bitrates, Netflix analyzes each title and determines the optimal bitrate for each quality level. A simple animated show needs fewer bits than an action film with complex scenes.
Simple animation (BoJack Horseman):
  1080p: 2 Mbps (instead of the standard 8 Mbps)
Complex live action (Stranger Things):
  1080p: 8 Mbps
Result: better quality at lower bandwidth consumption
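The per-title idea can be sketched as ladder construction: for each resolution, keep the cheapest candidate encode whose measured VMAF clears a quality target. The candidate shape and the default target of 93 are illustrative assumptions, not Netflix's actual thresholds:

```javascript
// Build a per-title bitrate ladder from trial encodes scored with VMAF.
// candidates: [{ resolution, bitrateKbps, vmaf }, ...]
function buildLadder(candidates, targetVmaf = 93) {
  const ladder = {};
  for (const c of candidates) {
    if (c.vmaf < targetVmaf) continue;            // quality gate failed
    const current = ladder[c.resolution];
    // Keep the cheapest encode that still clears the quality bar
    if (!current || c.bitrateKbps < current.bitrateKbps) {
      ladder[c.resolution] = c;
    }
  }
  return ladder;
}
```

For a simple animated title, the 2 Mbps trial encode clears the bar and wins over the 8 Mbps one, which is exactly the saving described above.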
Data Model
-- Content catalog
CREATE TABLE titles (
  title_id     UUID PRIMARY KEY,
  title        VARCHAR(200),
  description  TEXT,
  release_year INT,
  genres       TEXT[],        -- PostgreSQL array type
  rating       VARCHAR(10),
  duration_ms  BIGINT,
  created_at   TIMESTAMP
);
-- Video assets
CREATE TABLE video_assets (
  asset_id     UUID PRIMARY KEY,
  title_id     UUID REFERENCES titles(title_id),
  codec        VARCHAR(10) CHECK (codec IN ('h264', 'h265', 'av1')),
  resolution   VARCHAR(20),   -- '1080p', '720p', etc.
  bitrate      INT,
  manifest_url TEXT,
  cdn_prefix   TEXT,
  created_at   TIMESTAMP
);
-- Playback state (for resume feature)
CREATE TABLE playback_state (
  user_id     UUID,
  title_id    UUID,
  position_ms BIGINT,          -- playback position in milliseconds
  device_id   UUID,
  updated_at  TIMESTAMP,
  PRIMARY KEY (user_id, title_id)
);

-- Watch history
CREATE TABLE watch_history (
  user_id         UUID,
  title_id        UUID,
  watched_at      TIMESTAMP,
  percent_watched FLOAT,       -- 0.0 to 1.0
  PRIMARY KEY (user_id, title_id, watched_at)
);
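The `playback_state` table supports a simple last-write-wins resume flow across devices. A sketch with an in-memory Map standing in for the database; the heartbeat cadence and helper names are assumptions:

```javascript
// Track resume positions: clients report position periodically, and the
// newest update wins regardless of which device sent it.
class PlaybackTracker {
  constructor() {
    this.state = new Map(); // "userId:titleId" -> { positionMs, updatedAt }
  }

  report(userId, titleId, positionMs, updatedAt) {
    const key = `${userId}:${titleId}`;
    const prev = this.state.get(key);
    if (prev && prev.updatedAt > updatedAt) return prev; // ignore stale writes
    const row = { positionMs, updatedAt };
    this.state.set(key, row);
    return row;
  }

  resumePosition(userId, titleId) {
    const row = this.state.get(`${userId}:${titleId}`);
    return row ? row.positionMs : 0; // start from the beginning if never watched
  }
}
```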
Recommendation System
Netflix's recommendation system drives 80% of content discovery.
Collaborative Filtering: "Users who watched X also watched Y"
Content-Based Filtering: "This is similar to titles you liked"
Matrix Factorization: Learn latent factors from watch history
Data pipeline:
Watch history โ Feature extraction โ Model training โ Predictions โ A/B test โ Serve
// Simplified recommendation service
class RecommendationService {
  async getRecommendations(userId, limit = 20) {
    // Check cached recommendations first
    const cached = await redis.get(`recs:${userId}`);
    if (cached) return JSON.parse(cached);

    // Get user's watch history and ratings
    const watchHistory = await getWatchHistory(userId);
    const userPreferences = await getUserPreferences(userId);

    // Get pre-computed recommendations from ML model
    const recommendations = await mlService.predict({
      userId,
      watchHistory: watchHistory.slice(0, 100), // most recent 100 items
      preferences: userPreferences,
      context: { time: new Date().getHours(), device: 'tv' }
    });

    // Filter out already-watched content
    const watchedIds = new Set(watchHistory.map(w => w.titleId));
    const filtered = recommendations
      .filter(r => !watchedIds.has(r.titleId))
      .slice(0, limit);

    // Cache for 1 hour
    await redis.setex(`recs:${userId}`, 3600, JSON.stringify(filtered));
    return filtered;
  }
}
Interview Follow-up Questions
Q: How do you handle startup time?
Pre-buffer the first few seconds of the video before starting playback. Choose a lightweight codec and low bitrate for the initial segment, then switch up quality as the buffer fills.
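The ramp-up can be sketched as a tiny policy function; the 5-second buffer threshold and one-rung-per-chunk ramp are illustrative assumptions:

```javascript
// Fast-start quality policy: first chunk at the lowest bitrate so playback
// begins quickly, then ramp up one quality rung per chunk as the buffer grows.
function startupQuality(chunkIndex, bufferSeconds, maxIndex) {
  if (chunkIndex === 0) return 0;        // first chunk: lowest quality, fastest arrival
  if (bufferSeconds < 5) return 0;       // buffer still filling: stay low
  return Math.min(chunkIndex, maxIndex); // ramp up, capped at the top rung
}
```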
Q: How do you implement DRM (Digital Rights Management)?
Use industry standards: Widevine (Google/Chrome), FairPlay (Apple), PlayReady (Microsoft). Encrypt content keys, deliver via license server with user authentication. Encrypted Media Extensions (EME) in browsers handle decryption.
Q: How do you detect and prevent account sharing?
Track login location, device fingerprints, and concurrent stream count. Flag accounts with > N concurrent streams from different locations. Challenge with re-authentication when suspicious patterns are detected.
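The concurrent-stream limit can be sketched with a per-account session set. Heartbeat-based session expiry is simplified to an explicit `stop()` here, and the in-memory Map stands in for a shared store:

```javascript
// Enforce a per-account concurrent stream limit.
class StreamLimiter {
  constructor(maxStreams) {
    this.maxStreams = maxStreams;
    this.active = new Map(); // accountId -> Set of streaming deviceIds
  }

  start(accountId, deviceId) {
    const devices = this.active.get(accountId) || new Set();
    // A device already streaming may continue; new devices hit the limit
    if (!devices.has(deviceId) && devices.size >= this.maxStreams) return false;
    devices.add(deviceId);
    this.active.set(accountId, devices);
    return true;
  }

  stop(accountId, deviceId) {
    this.active.get(accountId)?.delete(deviceId);
  }
}
```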
Q: How would you scale the transcoding service?
Use a job queue (e.g., AWS SQS) to distribute transcoding jobs. Each job processes one "chunk" of video. Use spot instances for cost efficiency. Scale workers up during new content releases and down during off-peak hours.
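The chunk-level queue can be sketched with an in-memory array standing in for SQS; the job shape and 4-second default chunk size are illustrative:

```javascript
// Split a title into per-chunk transcode jobs; workers pull jobs and
// report completion, and the title is done when every chunk finishes.
class TranscodeQueue {
  constructor() {
    this.jobs = [];
    this.done = 0;
    this.total = 0;
  }

  enqueueTitle(titleId, durationSec, chunkSec = 4) {
    this.total = Math.ceil(durationSec / chunkSec);
    for (let i = 0; i < this.total; i++) {
      this.jobs.push({ titleId, chunkIndex: i, startSec: i * chunkSec });
    }
  }

  pull() {
    return this.jobs.shift();        // a worker takes the next job
  }

  complete() {
    this.done++;
    return this.done === this.total; // true once the whole title is transcoded
  }
}
```

Because each job covers only a few seconds of video, a spot-instance worker being reclaimed loses little work, which is what makes spot pricing viable here.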