senior
advanced
Architecture

Disaster Recovery Planning

Question

How do you design a disaster recovery strategy? Explain RPO, RTO, and different DR approaches.

Answer

RPO (Recovery Point Objective) is the maximum acceptable data loss, measured in time: how old the most recent recoverable copy of the data may be when a failure occurs. RTO (Recovery Time Objective) is the maximum acceptable downtime: how quickly service must be restored after a failure.

DR strategies, from cheapest to most expensive:
  • Backup and Restore: high RPO/RTO, low cost
  • Pilot Light: minimal always-on resources, with the rest provisioned during recovery
  • Warm Standby: scaled-down duplicate environment that is always running
  • Multi-Site Active-Active: near-zero RPO/RTO, highest cost

Choose based on business requirements and the criticality of each workload.
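To make the two definitions concrete, here is a minimal Python sketch that measures the data-loss window and the downtime for one incident and compares them with the targets. The timeline, the 4-hour targets, and the function names are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta


def data_loss_window(last_recoverable_copy: datetime, failure_time: datetime) -> timedelta:
    """Writes made after the last recoverable copy are lost; compare this to the RPO."""
    return failure_time - last_recoverable_copy


def downtime(failure_time: datetime, service_restored: datetime) -> timedelta:
    """Time during which the service was unavailable; compare this to the RTO."""
    return service_restored - failure_time


# Hypothetical incident timeline (assumed values, not from a real system).
last_backup = datetime(2024, 1, 10, 2, 0)
failure = datetime(2024, 1, 10, 9, 30)
restored = datetime(2024, 1, 10, 13, 0)

# Hypothetical business targets.
rpo_target = timedelta(hours=4)
rto_target = timedelta(hours=4)

loss = data_loss_window(last_backup, failure)
outage = downtime(failure, restored)

print(f"data loss window: {loss}, within RPO: {loss <= rpo_target}")
print(f"downtime: {outage}, within RTO: {outage <= rto_target}")
```

In this made-up timeline the nightly backup leaves a 7.5-hour data-loss window, which misses the RPO even though the 3.5-hour recovery meets the RTO. The two objectives are independent and need to be checked separately.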

Why This Matters

Disaster recovery is essential for business continuity. Senior engineers must understand the trade-offs between cost, complexity, and recovery capabilities. The right strategy depends on the cost of downtime versus the cost of redundancy.
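The comparison between the cost of downtime and the cost of redundancy can be sketched as simple expected-value arithmetic. Every figure below (incident rate, outage lengths, hourly cost, extra DR spend) is an assumed placeholder; a real analysis would use the organization's own estimates.

```python
def expected_annual_downtime_cost(
    incidents_per_year: float,
    hours_down_per_incident: float,
    cost_per_hour: float,
) -> float:
    """Expected yearly loss from outages under one DR strategy."""
    return incidents_per_year * hours_down_per_incident * cost_per_hour


# Hypothetical inputs.
cost_per_hour = 50_000.0      # revenue and productivity lost per hour of outage
incidents_per_year = 0.5      # one disaster every two years on average

# Assumed recovery times: 24 h with backup and restore, 1 h with a warm standby.
backup_restore_loss = expected_annual_downtime_cost(incidents_per_year, 24, cost_per_hour)
warm_standby_loss = expected_annual_downtime_cost(incidents_per_year, 1, cost_per_hour)

savings = backup_restore_loss - warm_standby_loss
extra_dr_spend = 200_000.0    # assumed extra yearly cost of running the warm standby

print(f"expected loss avoided per year: {savings:,.0f}")
print(f"warm standby pays for itself: {savings > extra_dr_spend}")
```

This only models direct downtime cost. Data loss, contractual penalties, and reputational damage would push the break-even point further toward the more expensive strategies.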

Code Examples

DR Strategy Comparison
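The following is a sketch, in Python, of the four strategies as data plus a helper that picks the cheapest one meeting given targets. The RPO/RTO figures are rough illustrative assumptions (actual values depend on the workload and implementation), and `DRStrategy` and `cheapest_strategy` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class DRStrategy:
    name: str
    typical_rpo: timedelta   # assumed, order-of-magnitude only
    typical_rto: timedelta   # assumed, order-of-magnitude only
    relative_cost: int       # 1 = cheapest, 4 = most expensive


# Ordered from cheapest to most expensive.
STRATEGIES = [
    DRStrategy("Backup and Restore", timedelta(hours=24), timedelta(hours=24), 1),
    DRStrategy("Pilot Light", timedelta(minutes=30), timedelta(hours=1), 2),
    DRStrategy("Warm Standby", timedelta(minutes=5), timedelta(minutes=15), 3),
    DRStrategy("Multi-Site Active-Active", timedelta(seconds=5), timedelta(seconds=30), 4),
]


def cheapest_strategy(rpo_target: timedelta, rto_target: timedelta) -> Optional[DRStrategy]:
    """Return the lowest-cost strategy whose typical RPO and RTO both meet the targets."""
    candidates = [
        s for s in STRATEGIES
        if s.typical_rpo <= rpo_target and s.typical_rto <= rto_target
    ]
    return min(candidates, key=lambda s: s.relative_cost, default=None)


for s in STRATEGIES:
    print(f"{s.name:<26} RPO ~{s.typical_rpo}  RTO ~{s.typical_rto}  cost {s.relative_cost}/4")

choice = cheapest_strategy(timedelta(minutes=10), timedelta(minutes=30))
print("chosen:", choice.name if choice else "no strategy meets the targets")
```

With a 10-minute RPO and a 30-minute RTO, the helper skips Pilot Light (its assumed RPO is too high) and selects Warm Standby, the cheapest option that satisfies both targets.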

Common Mistakes
  • Not testing the DR plan until a real disaster occurs
  • Planning DR only for infrastructure and ignoring application-level recovery
  • Underestimating the complexity of keeping data consistent across regions
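As an illustration of why an untested plan is risky, here is a minimal Python sketch that totals the step timings recorded during one DR drill and compares them with the RTO. The steps, durations, and 2-hour target are hypothetical.

```python
from datetime import timedelta

# Hypothetical timings recorded during one DR drill.
drill_steps = [
    ("detect failure and declare disaster", timedelta(minutes=20)),
    ("restore database from latest backup", timedelta(minutes=95)),
    ("redeploy application services", timedelta(minutes=30)),
    ("switch DNS and verify traffic", timedelta(minutes=15)),
]
rto_target = timedelta(hours=2)  # assumed target

# Sum the durations; timedelta() is the zero start value for sum().
total = sum((duration for _, duration in drill_steps), timedelta())
slowest_step, slowest_duration = max(drill_steps, key=lambda step: step[1])

print(f"measured recovery time: {total} (target {rto_target})")
if total > rto_target:
    print(f"RTO missed by {total - rto_target}; slowest step: {slowest_step} ({slowest_duration})")
else:
    print("RTO met")
```

In this made-up drill the measured recovery time exceeds the target by 40 minutes, and the database restore is the step to optimize first. A gap like this only becomes visible when the plan is exercised.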
Follow-up Questions
Interviewers often ask these as follow-up questions
  • How do you test your disaster recovery plan?
  • What is the difference between disaster recovery and high availability?
  • How do you handle data replication across regions for DR?
Tags
disaster-recovery
business-continuity
rpo
rto
senior