CDN Image Delivery Under 50ms

mid
intermediate
System Design
Question

Explain how a CDN serves images to users worldwide in under 50ms.

Answer

A CDN (Content Delivery Network) keeps copies of your images on edge servers in hundreds of locations around the globe. When a user requests an image, DNS routes them to a nearby edge location (often called a Point of Presence, or PoP), usually in the same metro area or region as the user. If that edge server already has the image cached, it serves it directly from memory or local SSD. That takes roughly 1-5ms of server processing plus the network round-trip to the nearby PoP, which together stay well under 50ms.

If the edge does not have the image (a cache miss), it fetches the image from a regional mid-tier cache or from the origin server, caches the response, and then serves it. The first request in a region pays the full cost; subsequent users in that region get the cached copy until it expires or is evicted.

CDNs also use anycast routing, where multiple servers share the same IP address and the network layer routes packets to the closest one. They keep persistent TCP/TLS connections open between PoPs, mid-tier caches, and the origin so that a cache miss does not also pay for connection setup.
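The hit and miss paths described in the answer can be sketched as a small simulation. This is a minimal illustration, not how any particular CDN is implemented: the `CacheTier` class, the tier names, the image path, and the latency figures are all hypothetical values chosen to make the arithmetic easy to follow.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class CacheTier:
    """One layer of the hierarchy: an edge PoP, a mid-tier cache, or the origin."""
    name: str
    latency_ms: float                      # hypothetical cost of reaching this tier
    parent: Optional["CacheTier"] = None   # None means this tier is the origin
    store: Dict[str, bytes] = field(default_factory=dict)

    def get(self, key: str) -> Tuple[bytes, float, str]:
        """Return (body, total latency in ms, name of the tier that had the object)."""
        if key in self.store:
            return self.store[key], self.latency_ms, self.name
        if self.parent is None:
            raise KeyError(f"{key} not found at origin")
        # Cache miss: fetch from the next tier up, then keep a copy locally.
        body, upstream_ms, served_by = self.parent.get(key)
        self.store[key] = body
        return body, self.latency_ms + upstream_ms, served_by


# Assumed latencies: 8 ms to the edge, 25 ms edge to mid-tier, 120 ms mid-tier to origin.
origin = CacheTier("origin", latency_ms=120.0, store={"/img/cat.jpg": b"<jpeg bytes>"})
mid_tier = CacheTier("mid-tier", latency_ms=25.0, parent=origin)
edge = CacheTier("edge", latency_ms=8.0, parent=mid_tier)

_, first_ms, first_source = edge.get("/img/cat.jpg")    # cold: edge -> mid-tier -> origin
_, second_ms, second_source = edge.get("/img/cat.jpg")  # warm: served by the edge
print(first_source, first_ms)    # origin 153.0
print(second_source, second_ms)  # edge 8.0
```

Under these assumed numbers, only the first request in the region pays for the trip to the origin. A real cache would also track TTLs and evict entries, which this sketch leaves out.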

Why This Matters

This question tests whether candidates understand the physical constraints of network latency and how CDNs work around them by moving data closer to users. Light in fiber travels at roughly 200,000 km/s, so a round-trip between New York and London (about 5,500 km each way, or 11,000 km in total) takes at least 55ms from propagation delay alone, before any routing or server time is added. No amount of server optimization can beat the speed of light, which is why geographic distribution is the only way to deliver content in under 50ms globally. Interviewers want to see that you grasp the caching hierarchy, cache invalidation challenges, and the tradeoffs between freshness and speed.
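The speed-of-light bound above is simple arithmetic, sketched here under the same approximation of 200,000 km/s in fiber (200 km per millisecond). The function names are illustrative, and the 5,500 km one-way distance is the rounded figure implied by the 11,000 km round trip.

```python
FIBER_KM_PER_MS = 200  # approx. speed of light in fiber: 200,000 km/s = 200 km/ms


def min_round_trip_ms(one_way_km: float) -> float:
    """Lower bound on round-trip time over fiber, ignoring routing and processing."""
    return 2 * one_way_km / FIBER_KM_PER_MS


def max_one_way_km(budget_ms: float) -> float:
    """Farthest a server can be if propagation alone must fit within budget_ms."""
    return budget_ms * FIBER_KM_PER_MS / 2


print(min_round_trip_ms(5_500))  # 55.0 (New York to London and back)
print(max_one_way_km(50))        # 5000.0
```

Even the theoretical 5,000 km radius overstates what is achievable, because real paths are longer than the straight-line distance and the budget must also cover connection setup and server processing.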

Code Examples

CDN caching hierarchy and request flow

text

Nginx edge cache configuration

nginx

Inspect CDN cache headers

bash
Common Mistakes
  • Not understanding that network latency is bounded by the speed of light, so geographic proximity is the primary optimization
  • Confusing CDN caching with browser caching or application-level caching
  • Ignoring cache invalidation, which is the hardest part of running a CDN
  • Thinking all requests hit the origin server and the CDN just provides 'faster pipes'
  • Forgetting about cache warming and the cold-start problem when deploying to new PoPs
Follow-up Questions
Interviewers often ask these as follow-up questions
  • How do you invalidate or purge cached content across all edge servers when the original image changes?
  • What is the difference between push-based and pull-based CDN strategies?
  • How would you handle personalized or user-specific content that cannot be cached at the edge?
  • What roles do HTTP/2 and HTTP/3 (QUIC) play in reducing latency for CDN-served content?
  • How do you decide on TTL values for different types of content?
Tags
system-design
cdn
caching
networking
performance
latency