Streaming Platform Caching Architecture: CDN, HTTP Cache, and Frontend Delivery

Quick Answer

Interview-focused guide to streaming platform caching architecture: browser and service worker caches, CDN edge strategy, origin shielding, cache invalidation, and metrics to optimize startup time, rebuffering, and infrastructure cost.

Answer

Definition

Designing caching for a streaming platform means deciding where each byte should be served from: browser cache, service worker, CDN edge, regional shield, or origin. The goal is not only speed but the right balance of startup latency, rebuffering rate, origin load, and delivery cost. A strong interview answer explains each layer and the invalidation policy that keeps content correct at scale.

Core mental model

Treat content delivery as a pyramid: serve from the closest valid cache first, and only fall back to expensive upstream layers on miss. For video, use immutable segmented media; for metadata, use short-lived cache + revalidation.

| Layer | What should live there | Why |
| --- | --- | --- |
| Browser cache | Player assets, thumbnails, static JS/CSS | Zero network on repeat views |
| Service worker | App shell + selected API payloads | Offline/fast reload and controlled staleness |
| CDN edge | Video segments and static assets | Low latency and large origin offload |
| Regional shield cache | Hot objects near origin | Reduces thundering-herd origin traffic |
| Origin | Source-of-truth media + metadata APIs | Correctness and backfill on misses |

Nearest valid cache wins; origin is the last resort.
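The "nearest valid cache wins" rule can be sketched as a tiered lookup. This is a minimal illustration, not a real CDN client: `createTieredCache`, the tier names, and the in-memory `Map` stores are all hypothetical, and freshness checks (TTL, revalidation) are deliberately left out so the fallback and backfill order is easy to see.

```javascript
// Hypothetical tiers, ordered nearest-first. Each tier is an in-memory Map
// standing in for a real cache; originFetch is the source of truth.
function createTieredCache(tierNames, originFetch) {
  const tiers = tierNames.map((name) => ({ name, store: new Map() }));

  return async function get(key) {
    for (let i = 0; i < tiers.length; i++) {
      if (tiers[i].store.has(key)) {
        const value = tiers[i].store.get(key);
        // Backfill the nearer tiers that missed so the next read is closer.
        for (let j = 0; j < i; j++) tiers[j].store.set(key, value);
        return { value, servedFrom: tiers[i].name };
      }
    }
    // Every tier missed: pay the origin cost once, then populate all tiers.
    const value = await originFetch(key);
    for (const tier of tiers) tier.store.set(key, value);
    return { value, servedFrom: 'origin' };
  };
}
```

With tiers `['browser', 'edge', 'shield']`, the first request for a segment is served from origin and repeat requests are served from the browser tier without touching origin again.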

Runnable example #1: cache policy by asset class

HTTP
# immutable versioned assets
Cache-Control: public, max-age=31536000, immutable

# metadata API (safe revalidation)
Cache-Control: public, max-age=30, stale-while-revalidate=60
ETag: "movie-842-v17"

# personalized API
Cache-Control: private, no-store

This split prevents stale personalized responses while keeping static media aggressively cacheable.
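One way to apply this split on the server is a small lookup from request path to `Cache-Control` value. The sketch below assumes a hypothetical URL layout (`/assets/`, `/segments/`, `/api/catalog`, `/api/me`); only the header values come from the example above, and the `no-store` fallback for unknown paths is an assumed safe default.

```javascript
// Hypothetical path prefixes; the header values mirror the three asset classes above.
const POLICIES = [
  { prefix: '/assets/', header: 'public, max-age=31536000, immutable' },
  { prefix: '/segments/', header: 'public, max-age=31536000, immutable' },
  { prefix: '/api/catalog', header: 'public, max-age=30, stale-while-revalidate=60' },
  { prefix: '/api/me', header: 'private, no-store' },
];

function cacheControlFor(pathname) {
  const match = POLICIES.find((p) => pathname.startsWith(p.prefix));
  // Unknown paths get the safest policy instead of being cached by accident.
  return match ? match.header : 'no-store';
}
```

Keeping the policy in one table makes it harder for a new endpoint to inherit an aggressive policy by accident.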

Runnable example #2: service worker stale-while-revalidate for metadata

JAVASCRIPT
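self.addEventListener('fetch', (event) => {
  const url = new URL(event.request.url);
  // Only handle catalog reads; cache.put() rejects non-GET requests.
  if (event.request.method !== 'GET') return;
  if (!url.pathname.startsWith('/api/catalog')) return;

  event.respondWith((async () => {
    const cache = await caches.open('catalog-v1');
    const cached = await cache.match(event.request);

    // Always start a refresh so the next read sees newer data.
    const network = fetch(event.request).then(async (res) => {
      if (res.ok) await cache.put(event.request, res.clone());
      return res;
    });

    if (cached) {
      // Serve stale now; keep the worker alive until the refresh settles.
      event.waitUntil(network.catch(() => {}));
      return cached;
    }
    // Nothing cached yet: the network result (or its error) is the response.
    return network;
  })());
});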

| Invalidation case | Pattern | Reason |
| --- | --- | --- |
| New media encode | Versioned URL (content hash) | Avoids unsafe overwrite of cached blobs |
| Catalog metadata update | Short TTL + ETag revalidation | Keeps updates fresh with lower bandwidth |
| Emergency takedown | CDN purge by surrogate key | Removes stale/legal-sensitive content quickly |
| Deploy rollback | Asset manifest pinning | Prevents mixed old/new client bundles |

Invalidation strategy is where most systems fail in production.
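The first two invalidation patterns can be sketched in a few lines of Node.js. Both helpers are hypothetical: `versionedUrl` assumes the hash is inserted before the file extension, and `isNotModified` is a simplified If-None-Match check (weak comparison, comma-split list) rather than a full HTTP implementation.

```javascript
const crypto = require('node:crypto');

// Versioned URL: a new encode yields new bytes, hence a new URL, so cached
// copies of the old object are never overwritten in place.
function versionedUrl(basePath, bytes) {
  const hash = crypto.createHash('sha256').update(bytes).digest('hex').slice(0, 12);
  const slash = basePath.lastIndexOf('/');
  const dot = basePath.lastIndexOf('.');
  if (dot <= slash) return `${basePath}.${hash}`; // no file extension
  return `${basePath.slice(0, dot)}.${hash}${basePath.slice(dot)}`;
}

// ETag revalidation: true means the server can answer 304 with no body.
// Simplification: assumes entity tags do not themselves contain commas.
function isNotModified(ifNoneMatch, currentEtag) {
  if (!ifNoneMatch) return false;
  if (ifNoneMatch.trim() === '*') return true;
  const strip = (tag) => tag.trim().replace(/^W\//, '');
  return ifNoneMatch.split(',').some((tag) => strip(tag) === strip(currentEtag));
}
```

The versioned URL pairs naturally with the long-lived `immutable` policy, while the ETag check is what lets a short-TTL metadata response be refreshed without resending the payload.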

Common pitfalls

• Caching personalized responses at CDN level without proper Vary/auth partitioning.
• Using one cache policy for all resources (video segments and account APIs need different policies).
• No cache-key versioning, causing stale assets after deploy.
• Ignoring observability: no edge hit ratio, no startup/rebuffering measurement.
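For the observability pitfall, a useful starting point is computing hit ratio both by request count and by bytes, since a few large segment misses can dominate origin egress even when the request hit ratio looks healthy. The record shape (`cacheStatus`, `bytes`) is an assumption for illustration; real CDN log fields vary by provider.

```javascript
// Assumed log record shape: { cacheStatus: 'HIT' | 'MISS', bytes: number }.
function summarizeEdgeLogs(records) {
  let hits = 0;
  let hitBytes = 0;
  let totalBytes = 0;
  for (const r of records) {
    totalBytes += r.bytes;
    if (r.cacheStatus === 'HIT') {
      hits += 1;
      hitBytes += r.bytes;
    }
  }
  return {
    // Share of requests answered at the edge.
    requestHitRatio: records.length ? hits / records.length : 0,
    // Share of bytes answered at the edge; tracks origin egress more closely.
    byteHitRatio: totalBytes ? hitBytes / totalBytes : 0,
  };
}
```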

When to use / when not to use

Use aggressive immutable caching for versioned static assets and video segments with clear URL versioning. Use conservative caching for personalized or payment-sensitive APIs. Do not ship service-worker caching for critical data unless invalidation and rollback behavior are explicitly tested.

Interview follow-ups

Q1: Which metric tells you caching is working? A: CDN edge hit ratio plus improved time-to-first-frame and lower rebuffering.
Q2: How do you protect origin during traffic spikes? A: Origin shielding plus request coalescing (also called collapsed forwarding), so concurrent misses for the same object trigger a single origin fetch.
Q3: How do you deploy safely? A: Versioned asset URLs, manifest pinning, and targeted purge keys for rollback.
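If asked to whiteboard the request coalescing from Q2, a minimal single-flight wrapper is usually enough. This is a sketch under the assumption of a single process; `coalesce` and `fetchFromOrigin` are hypothetical names, and a real shield tier would add timeouts and negative caching.

```javascript
// Wraps an origin fetcher so concurrent requests for the same key share
// one in-flight promise instead of each hitting origin.
function coalesce(fetchFromOrigin) {
  const inFlight = new Map();

  return function get(key) {
    if (inFlight.has(key)) return inFlight.get(key);

    const promise = Promise.resolve()
      .then(() => fetchFromOrigin(key))
      // Clear the entry on success or failure so later requests retry.
      .finally(() => inFlight.delete(key));

    inFlight.set(key, promise);
    return promise;
  };
}
```

Three simultaneous requests for the same segment result in one origin call; once it settles, the next request triggers a fresh fetch, which is where a cache in front of this wrapper would normally take over.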

Implementation checklist / takeaway

Define cache policy per resource class, use immutable versioned media, instrument hit/miss and playback metrics, and treat invalidation as a first-class design problem. Strong answers show both architecture and operational safeguards.
