CDN Image Delivery Under 50ms
Explain how a CDN serves images to users worldwide in under 50ms.
A CDN (Content Delivery Network) places copies of your images on hundreds of edge servers distributed across the globe. When a user requests an image, DNS routes them to the nearest edge location, often called a Point of Presence (PoP), usually close enough that the network round trip takes only a few milliseconds. If that edge server already has the image cached, it serves it directly from memory or local SSD; 1-5ms of server processing plus the short round trip to the nearby PoP keeps the total well under 50ms.

If the edge does not have the image (a cache miss), it fetches it from a regional mid-tier cache or the origin server, caches the response, and then serves it. The key insight is that only the first request in a region pays the miss penalty; every subsequent user there gets the cached copy. CDNs also use anycast routing, where many servers advertise the same IP address and the network layer routes each packet to the closest one, and they maintain persistent TCP/TLS connections between PoPs and to the origin to minimize connection-setup overhead.
This question tests whether candidates understand the physical constraints of network latency and how CDNs work around them by moving data closer to users. Light in fiber travels at roughly 200,000 km/s, so a round trip from New York to London (roughly 5,500 km each way, 11,000 km total) takes at least 55ms from physics alone. No amount of server optimization can beat the speed of light, which is why geographic distribution is the only way to achieve global sub-50ms delivery. Interviewers want to see that you grasp the caching hierarchy, cache invalidation challenges, and the tradeoffs between freshness and speed.
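The speed-of-light bound above is easy to verify yourself. A minimal sketch (distances are approximate):

```python
# Minimum round-trip time imposed by light propagating through fiber.
# 200,000 km/s is the commonly cited speed in fiber (~2/3 of c in vacuum).
FIBER_SPEED_KM_PER_S = 200_000

def min_rtt_ms(one_way_km: float) -> float:
    """Best-case round-trip time in milliseconds, ignoring all processing."""
    return 2 * one_way_km / FIBER_SPEED_KM_PER_S * 1000

print(min_rtt_ms(5_500))  # New York -> London, ~5,500 km one way: 55.0 ms
print(min_rtt_ms(100))    # A nearby PoP ~100 km away: 1.0 ms
```

This is why a cross-ocean fetch can never meet a 50ms budget, while a nearby PoP leaves tens of milliseconds of headroom for TLS, server processing, and transfer.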
CDN caching hierarchy and request flow
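The edge/regional/origin flow described in the answer can be sketched as a toy pull-through cache. Everything here (paths, cache names) is illustrative; real CDNs add TTLs, eviction, and request coalescing:

```python
# Toy sketch of a CDN pull-through cache hierarchy: edge -> regional -> origin.
origin = {"/images/hero.jpg": b"<jpeg bytes>"}  # authoritative store
regional_cache: dict[str, bytes] = {}           # mid-tier, shared per region
edge_cache: dict[str, bytes] = {}               # per-PoP cache

def fetch(path: str) -> tuple[bytes, str]:
    """Return (body, where it was served from), filling caches on the way back."""
    if path in edge_cache:
        return edge_cache[path], "edge (cache hit)"
    if path in regional_cache:
        body = regional_cache[path]
    else:
        body = origin[path]           # miss all the way through to the origin
        regional_cache[path] = body
    edge_cache[path] = body           # populate the edge for the next user
    return body, "filled from regional/origin"

print(fetch("/images/hero.jpg")[1])  # first request: filled from regional/origin
print(fetch("/images/hero.jpg")[1])  # second request: edge (cache hit)
```

Only the first request in a region traverses the full hierarchy; subsequent requests stop at the edge.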
Nginx edge cache configuration
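A minimal sketch of what an edge cache looks like with Nginx's `proxy_cache` module. The hostnames, paths, zone name (`img_cache`), sizes, and TTLs are placeholder assumptions to adapt:

```nginx
# Placeholder edge-cache sketch; adjust paths, sizes, and TTLs for real use.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=img_cache:100m
                 max_size=10g inactive=7d use_temp_path=off;

upstream origin_servers {
    server origin.example.com:443;
}

server {
    listen 443 ssl;
    server_name cdn.example.com;

    location /images/ {
        proxy_cache img_cache;
        proxy_cache_key $scheme$host$request_uri;
        proxy_cache_valid 200 301 7d;            # cache successful responses for a week
        proxy_cache_valid 404 1m;                # negative-cache misses briefly
        proxy_cache_use_stale error timeout updating;  # serve stale if origin struggles
        add_header X-Cache-Status $upstream_cache_status;
        proxy_pass https://origin_servers;
    }
}
```

The `X-Cache-Status` header exposes `HIT`, `MISS`, or `STALE` per request, which is useful when debugging the cache behavior described above.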
Inspect CDN cache headers
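You can see whether a CDN served a request from cache by inspecting the response headers. The host below is a placeholder, header names vary by provider (`x-cache`, `cf-cache-status`, `X-Cache-Status`, ...), and the sample response shown is illustrative:

```shell
# Fetch only the headers for a CDN-served image (placeholder host):
#   curl -sI https://cdn.example.com/images/hero.jpg
# A typical cached response looks like the sample below; the grep pulls out
# the headers that reveal cache behavior.
printf '%s\n' \
  'HTTP/2 200' \
  'content-type: image/jpeg' \
  'cache-control: public, max-age=86400' \
  'age: 512' \
  'x-cache: HIT' |
grep -Ei '^(x-cache|age|cache-control):'
```

`x-cache: HIT` means the edge served it without contacting the origin, `age` is how long the copy has sat in cache, and `cache-control` sets the TTL the CDN honors.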
- Not understanding that network latency is bounded by the speed of light, so geographic proximity is the primary optimization
- Confusing CDN caching with browser caching or application-level caching
- Ignoring cache invalidation, which is the hardest part of running a CDN
- Thinking all requests hit the origin server and the CDN just provides 'faster pipes'
- Forgetting about cache warming and the cold-start problem when deploying to new PoPs
- How do you invalidate or purge cached content across all edge servers when the original image changes?
- What is the difference between push-based and pull-based CDN strategies?
- How would you handle personalized or user-specific content that cannot be cached at the edge?
- What role does HTTP/2 and HTTP/3 (QUIC) play in reducing latency for CDN-served content?
- How do you decide on TTL values for different types of content?