Interview-focused guide to streaming platform caching architecture: browser and service worker caches, CDN edge strategy, origin shielding, cache invalidation, and metrics to optimize startup time, rebuffering, and infrastructure cost.
Streaming Platform Caching Architecture: CDN, HTTP Cache, and Frontend Delivery
Definition
Designing caching for a streaming platform means deciding where each byte should be served from: browser cache, service worker, CDN edge, regional shield, or origin. The goal is not only speed; it is the right balance of startup latency, rebuffering rate, origin load, and delivery cost. A strong interview answer explains what each layer holds and the invalidation policy that keeps content correct at scale.
Core mental model
Treat content delivery as a pyramid: serve from the closest valid cache first, and only fall back to expensive upstream layers on miss. For video, use immutable segmented media; for metadata, use short-lived cache + revalidation.
| Layer | What should live there | Why |
|---|---|---|
| Browser cache | Player assets, thumbnails, static JS/CSS | Zero network on repeat views |
| Service worker | App shell + selected API payloads | Offline/fast reload and controlled staleness |
| CDN edge | Video segments and static assets | Low latency and large origin offload |
| Regional shield cache | Hot objects near origin | Reduces thundering-herd origin traffic |
| Origin | Source-of-truth media + metadata APIs | Correctness and backfill on misses |
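The fall-through order in the table can be sketched as a tiered lookup. This is a minimal in-memory model, not a real CDN: the `Tier` class, the `lookup` function, and the segment path are hypothetical names, and the sketch ignores TTLs and validation for brevity.

```javascript
// Hypothetical in-memory model of the cache tiers; real tiers are separate systems.
class Tier {
  constructor(name) {
    this.name = name;
    this.store = new Map();
  }
}

// Look up `key` tier by tier, nearest first. On a hit, backfill the tiers that
// missed so the next request is served closer to the client.
function lookup(tiers, origin, key) {
  const missed = [];
  for (const tier of tiers) {
    if (tier.store.has(key)) {
      const value = tier.store.get(key);
      for (const t of missed) t.store.set(key, value);
      return { value, servedBy: tier.name };
    }
    missed.push(tier);
  }
  // Every cache missed: fetch from the source of truth and fill all tiers.
  const value = origin(key);
  for (const t of missed) t.store.set(key, value);
  return { value, servedBy: 'origin' };
}

const tiers = [new Tier('browser'), new Tier('edge'), new Tier('shield')];
const origin = (key) => `bytes-for-${key}`;

console.log(lookup(tiers, origin, '/seg/001.m4s').servedBy); // origin
console.log(lookup(tiers, origin, '/seg/001.m4s').servedBy); // browser
```

The first request pays the full origin cost; the second is served from the nearest tier because the miss path backfilled every layer on the way out.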
Runnable example #1: cache policy by asset class
    # immutable versioned assets
    Cache-Control: public, max-age=31536000, immutable
    # metadata API (safe revalidation)
    Cache-Control: public, max-age=30, stale-while-revalidate=60
    ETag: "movie-842-v17"
    # personalized API
    Cache-Control: private, no-store
This split prevents stale personalized responses while keeping static media aggressively cacheable.
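One way to apply this split in application code is a single function that maps a request path to a policy. The sketch below assumes a hypothetical URL layout (`/static/`, `/media/`, `/api/catalog`); the header values are the ones from the example above.

```javascript
// Hypothetical URL layout: /static/ and /media/ hold versioned files,
// /api/catalog is shared metadata, everything else is treated as personalized.
function cacheControlFor(pathname) {
  if (pathname.startsWith('/static/') || pathname.startsWith('/media/')) {
    return 'public, max-age=31536000, immutable';
  }
  if (pathname.startsWith('/api/catalog')) {
    return 'public, max-age=30, stale-while-revalidate=60';
  }
  // Default to the safest policy so an unclassified route is never shared.
  return 'private, no-store';
}

console.log(cacheControlFor('/media/842/seg-001.m4s')); // public, max-age=31536000, immutable
console.log(cacheControlFor('/api/account/profile'));   // private, no-store
```

Defaulting to `private, no-store` means a newly added route has to opt in to shared caching rather than opt out.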
Runnable example #2: service worker stale-while-revalidate for metadata
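    self.addEventListener('fetch', (event) => {
      const url = new URL(event.request.url);
      // Only handle GET requests for the catalog API; the Cache API rejects other methods.
      if (event.request.method !== 'GET' || !url.pathname.startsWith('/api/catalog')) return;
      event.respondWith((async () => {
        const cache = await caches.open('catalog-v1');
        const cached = await cache.match(event.request);
        // Always start a network fetch so the cache is refreshed.
        const network = fetch(event.request).then((res) => {
          // Store a copy without delaying the response; waitUntil keeps the worker alive for the write.
          if (res.ok) event.waitUntil(cache.put(event.request, res.clone()));
          return res;
        });
        if (cached) {
          // Serve the stale copy now and let the refresh finish in the background.
          event.waitUntil(network.catch(() => {}));
          return cached;
        }
        // No cached copy: the network result (or its error) is the response.
        return network;
      })());
    });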
| Invalidation case | Pattern | Reason |
|---|---|---|
| New media encode | Versioned URL (content hash) | Avoids unsafe overwrite of cached blobs |
| Catalog metadata update | Short TTL + ETag revalidation | Keeps updates fresh with lower bandwidth |
| Emergency takedown | CDN purge by surrogate key | Removes stale/legal-sensitive content quickly |
| Deploy rollback | Asset manifest pinning | Prevents mixed old/new client bundles |
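The versioned-URL row can be illustrated with a small Node.js helper that embeds a content hash in the file name. This is a sketch under assumptions: the function name `versionedUrl`, the 12-character hash prefix, and the `name.hash.ext` layout are illustrative choices, not a convention the pattern requires.

```javascript
const { createHash } = require('node:crypto');

// Build a versioned URL by embedding a short content hash in the file name.
// The 12-character prefix and the name.hash.ext layout are illustrative choices.
function versionedUrl(path, content) {
  const hash = createHash('sha256').update(content).digest('hex').slice(0, 12);
  const slash = path.lastIndexOf('/');
  const dot = path.lastIndexOf('.');
  // Only treat a dot as an extension separator if it is in the file name itself.
  if (dot <= slash) return `${path}.${hash}`;
  return `${path.slice(0, dot)}.${hash}${path.slice(dot)}`;
}

const v1 = versionedUrl('/media/842/seg-001.m4s', Buffer.from('encode A'));
const v2 = versionedUrl('/media/842/seg-001.m4s', Buffer.from('encode B'));
console.log(v1 !== v2); // true
```

Because a new encode produces a new URL, the old object can stay cached as immutable for a year and never needs to be overwritten or purged for correctness.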
Common pitfalls
- Caching personalized responses at CDN level without proper Vary/auth partitioning.
- Using one cache policy for all resources (video segments and account APIs need different policies).
- No cache-key versioning, causing stale assets after deploy.
- Ignoring observability: no edge hit ratio, no startup/rebuffering measurement.
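For the observability pitfall, edge hit ratio can be derived from delivery logs. The sketch below assumes a hypothetical log record shape with `cacheStatus` (`'HIT'` or `'MISS'`) and `bytes` fields; real CDN log formats differ. It reports both a request hit ratio and a byte hit ratio, since for video the bytes served from cache are often what drives origin load and cost.

```javascript
// Compute request-level and byte-level hit ratios from delivery log records.
// The record fields `cacheStatus` and `bytes` are assumed for illustration.
function hitRatios(records) {
  let hits = 0;
  let hitBytes = 0;
  let totalBytes = 0;
  for (const r of records) {
    totalBytes += r.bytes;
    if (r.cacheStatus === 'HIT') {
      hits += 1;
      hitBytes += r.bytes;
    }
  }
  return {
    requestHitRatio: records.length ? hits / records.length : 0,
    byteHitRatio: totalBytes ? hitBytes / totalBytes : 0,
  };
}

const sample = [
  { cacheStatus: 'HIT', bytes: 1000 },
  { cacheStatus: 'HIT', bytes: 1000 },
  { cacheStatus: 'HIT', bytes: 1000 },
  { cacheStatus: 'MISS', bytes: 5000 },
];
console.log(hitRatios(sample)); // { requestHitRatio: 0.75, byteHitRatio: 0.375 }
```

In this sample, 75% of requests hit the edge but only 37.5% of bytes did, which is why a request-only hit ratio can overstate origin offload.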
When to use / when not to use
Use aggressive immutable caching for versioned static assets and video segments with clear URL versioning. Use conservative caching for personalized or payment-sensitive APIs. Do not ship service-worker caching for critical data unless invalidation and rollback behavior are explicitly tested.
Interview follow-ups
Q1: Which metric tells you caching is working? A: CDN edge hit ratio plus improved time-to-first-frame and lower rebuffering.
Q2: How do you protect origin during traffic spikes? A: Origin shielding plus request coalescing (also called collapsed forwarding), so concurrent misses for the same object result in a single origin fetch.
Q3: How do you deploy safely? A: Versioned asset URLs, manifest pinning, and targeted purge keys for rollback.
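The request coalescing mentioned in Q2 can be sketched in a few lines. This is a minimal single-process illustration, assuming a hypothetical `fetchFromOrigin` function supplied by the caller; a shield cache applies the same idea across many client connections.

```javascript
// Coalesce concurrent misses: callers asking for the same key while a fetch is
// in flight share that one promise instead of each hitting the origin.
function coalesce(fetchFromOrigin) {
  const inFlight = new Map();
  return function get(key) {
    if (inFlight.has(key)) return inFlight.get(key);
    const promise = Promise.resolve()
      .then(() => fetchFromOrigin(key))
      // Clear the entry on success or failure so later requests fetch again.
      .finally(() => inFlight.delete(key));
    inFlight.set(key, promise);
    return promise;
  };
}

let originCalls = 0;
const get = coalesce(async (key) => {
  originCalls += 1;
  return `bytes-for-${key}`;
});

Promise.all([get('/seg/001.m4s'), get('/seg/001.m4s'), get('/seg/001.m4s')])
  .then(() => console.log(originCalls)); // 1
```

Three concurrent requests for the same segment produce one origin call. A failed fetch rejects for every waiting caller, so a production version would likely add retry or negative-caching rules on top.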
Implementation checklist / takeaway
Define cache policy per resource class, use immutable versioned media, instrument hit/miss and playback metrics, and treat invalidation as a first-class design problem. Strong answers show both architecture and operational safeguards.