Disaster Recovery Planning
How do you design a disaster recovery strategy? Explain RPO, RTO, and different DR approaches.
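RPO (Recovery Point Objective) is the maximum acceptable data loss, measured in time: how far back is the most recent recoverable copy of the data allowed to be? RTO (Recovery Time Objective) is the maximum acceptable downtime: how quickly must service be restored after a failure? The common DR strategies, from cheapest to most expensive, are:
- Backup and Restore: data is backed up and the environment is rebuilt after a disaster (high RPO/RTO, low cost)
- Pilot Light: only minimal core resources are always on; the rest is provisioned during recovery
- Warm Standby: a scaled-down duplicate environment runs continuously and is scaled up on failover
- Multi-Site Active-Active: multiple sites serve traffic at the same time (near-zero RPO/RTO, highest cost)
Choose based on business requirements and the criticality of each workload.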
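To make the two definitions concrete, here is a minimal sketch that scores one incident (or DR drill) against RPO and RTO targets. The `Incident` class, the function names, and the timestamps are all hypothetical and chosen only for illustration. The sketch assumes that everything written after the last good backup is lost.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical timeline of a single outage or DR drill."""
    last_good_backup: datetime   # most recent recoverable copy of the data
    failure: datetime            # moment the primary became unavailable
    service_restored: datetime   # moment users could work again


def achieved_rpo(incident: Incident) -> timedelta:
    """Data-loss window: writes made after the last good backup are lost."""
    return incident.failure - incident.last_good_backup


def achieved_rto(incident: Incident) -> timedelta:
    """Downtime window: time from failure until service was restored."""
    return incident.service_restored - incident.failure


def meets_targets(incident: Incident, rpo_target: timedelta,
                  rto_target: timedelta) -> bool:
    """True only if both the data-loss and downtime windows are within target."""
    return (achieved_rpo(incident) <= rpo_target
            and achieved_rto(incident) <= rto_target)


# Made-up drill: nightly backup at 02:00, failure at 14:30, restored at 18:00.
drill = Incident(
    last_good_backup=datetime(2024, 1, 10, 2, 0),
    failure=datetime(2024, 1, 10, 14, 30),
    service_restored=datetime(2024, 1, 10, 18, 0),
)

print(achieved_rpo(drill))  # 12:30:00
print(achieved_rto(drill))  # 3:30:00
print(meets_targets(drill, timedelta(hours=1), timedelta(hours=4)))  # False
```

In this made-up drill the 4-hour RTO target is met, but the 1-hour RPO target is missed by a wide margin. A nightly backup cannot deliver a 1-hour RPO, however quickly the restore runs.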
Disaster recovery is essential for business continuity. Senior engineers must understand the trade-offs between cost, complexity, and recovery capabilities. The right strategy depends on the cost of downtime versus the cost of redundancy.
DR Strategy Comparison
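One way to reason about the comparison is to treat each strategy as a row with a typical RPO, a typical RTO, and a relative cost, then pick the cheapest row that satisfies the business targets. The sketch below does that. The hour figures and cost ranks are illustrative assumptions, not benchmarks, and `Strategy` and `cheapest_strategy` are hypothetical names. Real values depend on data volume, replication method, and how well the runbook is rehearsed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Strategy:
    name: str
    rpo_hours: float     # assumed typical data-loss window
    rto_hours: float     # assumed typical time to recover
    relative_cost: int   # 1 = cheapest, 4 = most expensive


# Illustrative placeholder figures only; measure your own environment.
STRATEGIES = [
    Strategy("Backup and Restore", rpo_hours=24.0, rto_hours=24.0, relative_cost=1),
    Strategy("Pilot Light", rpo_hours=1.0, rto_hours=4.0, relative_cost=2),
    Strategy("Warm Standby", rpo_hours=0.25, rto_hours=0.5, relative_cost=3),
    # "Near-zero" RPO/RTO is rounded to 0.0 here for simplicity.
    Strategy("Multi-Site Active-Active", rpo_hours=0.0, rto_hours=0.0, relative_cost=4),
]


def cheapest_strategy(rpo_target_hours: float,
                      rto_target_hours: float) -> Optional[Strategy]:
    """Return the lowest-cost strategy meeting both targets, or None."""
    candidates = [
        s for s in STRATEGIES
        if s.rpo_hours <= rpo_target_hours and s.rto_hours <= rto_target_hours
    ]
    return min(candidates, key=lambda s: s.relative_cost, default=None)


print(cheapest_strategy(24, 48).name)    # Backup and Restore
print(cheapest_strategy(2, 6).name)      # Pilot Light
print(cheapest_strategy(1, 1).name)      # Warm Standby
print(cheapest_strategy(0.1, 0.1).name)  # Multi-Site Active-Active
```

Tightening either target pushes the choice toward a more expensive row, which is the cost-versus-recovery trade-off in miniature.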
Common Mistakes
- Never testing the DR plan until a real disaster occurs
- Planning DR only for infrastructure and ignoring application-level recovery
- Underestimating the complexity of keeping data consistent across regions
Follow-up Questions
- How do you test your disaster recovery plan?
- What is the difference between disaster recovery and high availability?
- How do you handle data replication across regions for DR?