Recovery procedures restore system availability after disruptions
Even the most resilient systems can be disrupted. Availability commitments require not just preventing downtime, but recovering from it within defined timeframes. This criterion requires that backup and recovery procedures are in place, tested, and capable of restoring service within the recovery time objectives committed to customers.
Implementation steps
1. Define and document recovery objectives (RTO and RPO)
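Recovery Time Objective (RTO) is the maximum time you may take to restore service after a disruption. Recovery Point Objective (RPO) is the maximum amount of data loss you can accept, expressed as a window of time before the disruption. Document both objectives for each critical system. Your backup frequency and replication strategy must be capable of meeting your RPO: with a one-hour RPO, for example, backups or replication must capture changes at least hourly. Your incident response and recovery procedures must be capable of meeting your RTO.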
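One way to keep documented objectives honest is to record them next to the backup interval and the last measured restore time, then check them mechanically. The following is a minimal sketch, not a prescribed method: the `RecoveryObjective` class, the `gaps` function, and all system names and durations are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjective:
    """Documented recovery targets for one system (all values illustrative)."""
    system: str
    rto: timedelta               # maximum tolerated time to restore service
    rpo: timedelta               # maximum tolerated window of data loss
    backup_interval: timedelta   # how often a backup or snapshot is taken
    measured_restore: timedelta  # restore duration observed in the last test


def gaps(obj: RecoveryObjective) -> list[str]:
    """Return the reasons, if any, why the current setup cannot meet its targets."""
    problems = []
    # Worst case, a failure happens just before the next backup runs,
    # so up to one full backup interval of data can be lost.
    if obj.backup_interval > obj.rpo:
        problems.append(
            f"{obj.system}: backup interval {obj.backup_interval} exceeds RPO {obj.rpo}"
        )
    # The last measured restore is the best available evidence for the RTO.
    if obj.measured_restore > obj.rto:
        problems.append(
            f"{obj.system}: measured restore {obj.measured_restore} exceeds RTO {obj.rto}"
        )
    return problems


if __name__ == "__main__":
    objectives = [
        RecoveryObjective("billing-db", timedelta(hours=4), timedelta(hours=1),
                          timedelta(minutes=15), timedelta(hours=2)),
        RecoveryObjective("reports-db", timedelta(hours=1), timedelta(hours=1),
                          timedelta(hours=24), timedelta(hours=3)),
    ]
    for objective in objectives:
        for problem in gaps(objective):
            print(problem)
```

An empty result means the recorded backup interval and restore time are consistent with the documented targets; it does not prove that the backups themselves are restorable.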
confluence notion google-docs
2. Implement automated backups for all critical data
Configure automated backups for all databases and data stores that contain customer data or data required for service operation. Backups should run at least daily, and more frequently for systems with a short RPO, because the interval between backups bounds how much data can be lost. Store backups in a separate location from production (a separate region or account). Retain backups according to your documented retention policy and any regulatory requirements.
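As an illustration of a daily schedule, a retention period, and a copy to a separate location, the sketch below builds a plan document in the shape accepted by the AWS Backup `create_backup_plan` API. The `build_backup_plan` helper, the plan and vault names, the example ARN, and the 35-day retention are assumptions made for this example, not recommended values.

```python
def build_backup_plan(name: str, vault: str, copy_vault_arn: str,
                      schedule: str = "cron(0 5 ? * * *)",
                      retention_days: int = 35) -> dict:
    """Build an AWS Backup plan document with one daily rule.

    The default schedule runs once a day at 05:00 UTC. `copy_vault_arn`
    should point at a vault in another region or account so that backups
    are stored separately from production.
    """
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    rule = {
        "RuleName": f"{name}-daily",
        "TargetBackupVaultName": vault,
        "ScheduleExpression": schedule,
        # Recovery points in the primary vault expire after the retention period.
        "Lifecycle": {"DeleteAfterDays": retention_days},
        # Copy each recovery point to the separate vault with the same retention.
        "CopyActions": [
            {
                "DestinationBackupVaultArn": copy_vault_arn,
                "Lifecycle": {"DeleteAfterDays": retention_days},
            }
        ],
    }
    return {"BackupPlanName": name, "Rules": [rule]}


if __name__ == "__main__":
    plan = build_backup_plan(
        name="critical-data",          # hypothetical plan name
        vault="primary-vault",         # hypothetical vault in the production region
        copy_vault_arn="arn:aws:backup:us-west-2:111122223333:backup-vault:dr-vault",
    )
    print(plan)
    # With AWS credentials configured, the plan could be submitted with boto3:
    #   import boto3
    #   boto3.client("backup").create_backup_plan(BackupPlan=plan)
```

Creating the plan does not by itself protect anything: resources still have to be assigned to it (in AWS Backup, through a backup selection), and other providers use different configuration formats.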
aws-backup aws-rds google-cloud-backup azure-backup
3. Test recovery procedures at least annually
Backups and recovery procedures that have never been tested often fail when they are needed most. At least annually, restore from backup in a non-production environment and verify data integrity and completeness. Document the test results, including the actual recovery time. Use the test to validate or update your RTO/RPO targets.
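A restore test can record its own evidence by timing the restore and comparing the restored data with a reference copy taken at backup time. The sketch below assumes a hypothetical `restore` callable that performs the restore in a non-production environment and returns the restored rows; `table_fingerprint` and `run_restore_test` are illustrative names, and real data sets would usually be compared with database-native checksums or counts rather than in memory.

```python
import hashlib
import time
from datetime import timedelta


def table_fingerprint(rows) -> str:
    """Order-independent SHA-256 fingerprint of an iterable of row tuples."""
    digests = sorted(hashlib.sha256(repr(row).encode()).hexdigest() for row in rows)
    return hashlib.sha256("".join(digests).encode()).hexdigest()


def run_restore_test(restore, source_rows, rto: timedelta) -> dict:
    """Time a restore and compare the restored rows with the reference rows.

    `restore` is a hypothetical callable that restores from backup and
    returns the restored rows. `source_rows` is the reference copy of the
    data as it was when the backup was taken.
    """
    started = time.monotonic()
    restored_rows = list(restore())
    elapsed = timedelta(seconds=time.monotonic() - started)
    source_rows = list(source_rows)
    return {
        "actual_recovery_time": elapsed,
        "rto": rto,
        "within_rto": elapsed <= rto,
        "row_count_matches": len(restored_rows) == len(source_rows),
        "data_matches": table_fingerprint(restored_rows) == table_fingerprint(source_rows),
    }


if __name__ == "__main__":
    reference = [(1, "alice"), (2, "bob")]
    # Stand-in for a real restore: returns the same rows in a different order.
    result = run_restore_test(lambda: reversed(reference), reference,
                              rto=timedelta(hours=4))
    print(result)
```

The returned dictionary can be saved with the test report so that the measured recovery time and the integrity checks are documented together.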
aws-backup aws-rds confluence notion
Evidence required
Backup configuration and retention settings
Evidence that automated backups are configured for critical systems.
- AWS RDS automated backup configuration showing retention period
- Database backup schedule and retention policy
- AWS Backup plan configuration
Recovery test results
Evidence that backups and recovery procedures have been tested.
- Backup restoration test report with RTO measurement
- Disaster recovery exercise notes
- Database restore test showing successful data validation
RTO and RPO documentation
Documented recovery objectives for critical systems.
- Business continuity plan with RTO and RPO targets
- Disaster recovery plan with recovery objectives per system
- Service level commitments referencing recovery targets