CDN Image Delivery Under 50ms
Explain how a CDN serves images to users worldwide in under 50ms.
A CDN (Content Delivery Network) places copies of your images on hundreds of edge servers distributed across the globe. When a user requests an image, DNS routes them to the nearest edge location, often called a Point of Presence (PoP), usually close enough that the network round trip takes only a few milliseconds. If that edge server already has the image cached, it serves it directly from memory or local SSD; 1-5ms of server processing plus the short round trip to the nearby PoP keeps the total well under 50ms.

If the edge does not have the image (a cache miss), it fetches it from a regional mid-tier cache or the origin server, caches the response, and then serves it. The key insight is that only the first request in a region pays the miss penalty; every subsequent user there gets the cached copy. CDNs also use anycast routing, where many servers advertise the same IP address and the network layer routes each packet to the closest one, and they maintain persistent TCP/TLS connections between PoPs and to the origin to minimize connection-setup overhead.
This question tests whether candidates understand the physical constraints of network latency and how CDNs work around them by moving data closer to users. Light in fiber travels at roughly 200,000 km/s, so a round trip from New York to London (roughly 5,500 km each way, 11,000 km total) takes at least 55ms from physics alone. No amount of server optimization can beat the speed of light, which is why geographic distribution is the only way to achieve global sub-50ms delivery. Interviewers want to see that you grasp the caching hierarchy, cache invalidation challenges, and the tradeoffs between freshness and speed.
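The speed-of-light bound above is easy to verify yourself. A minimal sketch (distances are approximate):

```python
# Minimum round-trip time imposed by light propagating through fiber.
# 200,000 km/s is the commonly cited speed in fiber (~2/3 of c in vacuum).
FIBER_SPEED_KM_PER_S = 200_000

def min_rtt_ms(one_way_km: float) -> float:
    """Best-case round-trip time in milliseconds, ignoring all processing."""
    return 2 * one_way_km / FIBER_SPEED_KM_PER_S * 1000

print(min_rtt_ms(5_500))  # New York -> London, ~5,500 km one way: 55.0 ms
print(min_rtt_ms(100))    # A nearby PoP ~100 km away: 1.0 ms
```

This is why a cross-ocean fetch can never meet a 50ms budget, while a nearby PoP leaves tens of milliseconds of headroom for TLS, server processing, and transfer.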
CDN caching hierarchy and request flow
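The edge/regional/origin flow described in the answer can be sketched as a toy pull-through cache. Everything here (paths, cache names) is illustrative; real CDNs add TTLs, eviction, and request coalescing:

```python
# Toy sketch of a CDN pull-through cache hierarchy: edge -> regional -> origin.
origin = {"/images/hero.jpg": b"<jpeg bytes>"}  # authoritative store
regional_cache: dict[str, bytes] = {}           # mid-tier, shared per region
edge_cache: dict[str, bytes] = {}               # per-PoP cache

def fetch(path: str) -> tuple[bytes, str]:
    """Return (body, where it was served from), filling caches on the way back."""
    if path in edge_cache:
        return edge_cache[path], "edge (cache hit)"
    if path in regional_cache:
        body = regional_cache[path]
    else:
        body = origin[path]           # miss all the way through to the origin
        regional_cache[path] = body
    edge_cache[path] = body           # populate the edge for the next user
    return body, "filled from regional/origin"

print(fetch("/images/hero.jpg")[1])  # first request: filled from regional/origin
print(fetch("/images/hero.jpg")[1])  # second request: edge (cache hit)
```

Only the first request in a region traverses the full hierarchy; subsequent requests stop at the edge.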
Nginx edge cache configuration
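A minimal sketch of what an edge cache looks like with Nginx's `proxy_cache` module. The hostnames, paths, zone name (`img_cache`), sizes, and TTLs are placeholder assumptions to adapt:

```nginx
# Placeholder edge-cache sketch; adjust paths, sizes, and TTLs for real use.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=img_cache:100m
                 max_size=10g inactive=7d use_temp_path=off;

upstream origin_servers {
    server origin.example.com:443;
}

server {
    listen 443 ssl;
    server_name cdn.example.com;

    location /images/ {
        proxy_cache img_cache;
        proxy_cache_key $scheme$host$request_uri;
        proxy_cache_valid 200 301 7d;            # cache successful responses for a week
        proxy_cache_valid 404 1m;                # negative-cache misses briefly
        proxy_cache_use_stale error timeout updating;  # serve stale if origin struggles
        add_header X-Cache-Status $upstream_cache_status;
        proxy_pass https://origin_servers;
    }
}
```

The `X-Cache-Status` header exposes `HIT`, `MISS`, or `STALE` per request, which is useful when debugging the cache behavior described above.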
Inspect CDN cache headers
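You can see whether a CDN served a request from cache by inspecting the response headers. The host below is a placeholder, header names vary by provider (`x-cache`, `cf-cache-status`, `X-Cache-Status`, ...), and the sample response shown is illustrative:

```shell
# Fetch only the headers for a CDN-served image (placeholder host):
#   curl -sI https://cdn.example.com/images/hero.jpg
# A typical cached response looks like the sample below; the grep pulls out
# the headers that reveal cache behavior.
printf '%s\n' \
  'HTTP/2 200' \
  'content-type: image/jpeg' \
  'cache-control: public, max-age=86400' \
  'age: 512' \
  'x-cache: HIT' |
grep -Ei '^(x-cache|age|cache-control):'
```

`x-cache: HIT` means the edge served it without contacting the origin, `age` is how long the copy has sat in cache, and `cache-control` sets the TTL the CDN honors.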
- Not understanding that network latency is bounded by the speed of light, so geographic proximity is the primary optimization
- Confusing CDN caching with browser caching or application-level caching
- Ignoring cache invalidation, which is the hardest part of running a CDN
- Thinking all requests hit the origin server and the CDN just provides 'faster pipes'
- Forgetting about cache warming and the cold-start problem when deploying to new PoPs
- How do you invalidate or purge cached content across all edge servers when the original image changes?
- What is the difference between push-based and pull-based CDN strategies?
- How would you handle personalized or user-specific content that cannot be cached at the edge?
- What role does HTTP/2 and HTTP/3 (QUIC) play in reducing latency for CDN-served content?
- How do you decide on TTL values for different types of content?