Disaster Recovery - SaaS
RunMyJobs supports cross-region disaster recovery (DR) by default.
To understand how cross-region DR works with RunMyJobs, you need a basic understanding of the RunMyJobs SaaS architecture. You also need to understand RPO (Recovery Point Objective: the maximum amount of data loss that can be tolerated, usually expressed as a period of time) and RTO (Recovery Time Objective: the maximum acceptable amount of time it takes to recover from a disruptive event).
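As a rough illustration of how the two metrics are measured, the following sketch checks a single incident against an RPO and an RTO. The function name, the timestamps, and the 15-minute and 1-hour targets are hypothetical values chosen for the example; they are not RunMyJobs figures.

```python
from datetime import datetime, timedelta


def meets_objectives(last_sync, failure, restored, rpo, rto):
    """Return (data_loss, downtime, rpo_met, rto_met) for one incident."""
    # Data written after the last successful sync is lost at failure time.
    data_loss = max(failure - last_sync, timedelta(0))
    # Downtime runs from the failure until service is restored.
    downtime = restored - failure
    return data_loss, downtime, data_loss <= rpo, downtime <= rto


# Hypothetical incident and targets, for illustration only.
last_sync = datetime(2024, 1, 1, 11, 55)
failure = datetime(2024, 1, 1, 12, 0)
restored = datetime(2024, 1, 1, 13, 30)

data_loss, downtime, rpo_met, rto_met = meets_objectives(
    last_sync, failure, restored,
    rpo=timedelta(minutes=15), rto=timedelta(hours=1),
)
print(data_loss, downtime, rpo_met, rto_met)
```

In this made-up incident, five minutes of data is lost, which is within the 15-minute RPO, but the 90-minute downtime exceeds the 1-hour RTO.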
RunMyJobs SaaS Architecture
A RunMyJobs instance is tied to a particular AWS region. Within that region, it runs in a containerized, clustered environment.
Failover Within an AWS Region
An AWS region can include multiple Availability Zones (AZs), and RunMyJobs typically uses three AZs per region to ensure high availability within that region. If the AZ in which RunMyJobs is running goes down, the instance is automatically switched over to a different AZ in the same region. This failover has an RPO of zero and a minimal RTO, which consists of the time it takes to start RunMyJobs in the new AZ.
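The selection step in an in-region failover can be pictured with the minimal sketch below. It is a conceptual model only, not Redwood's implementation: the function name, the AZ names, and the idea of a "healthy" set are all assumptions made for the example.

```python
def pick_failover_az(zones, current, healthy):
    """Return the first healthy AZ other than the current one, or None."""
    for az in zones:
        if az != current and az in healthy:
            return az
    # No healthy AZ is left in this region.
    return None


zones = ["az-a", "az-b", "az-c"]  # hypothetical AZ names
target = pick_failover_az(zones, current="az-a", healthy={"az-b", "az-c"})
print(target)  # az-b
```

In this model, a result of `None` corresponds to a region-wide outage, which is the case that cross-regional failover addresses.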
Cross-Regional Failover
Every AWS region in which RunMyJobs runs has a designated secondary (failover) region. The secondary region is determined by Redwood and cannot be changed.
| Hosting | Primary Region | Secondary Region |
|---|---|---|
| European | Dublin | Paris |
| USA/Americas | Oregon | Ohio |
| USA/Americas | Ohio | Oregon |
| Germany | Frankfurt | Zurich |
| Asia Pacific | Sydney | Melbourne |
| Asia Pacific | Singapore | Sydney |
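For scripts or runbooks that need to know where an environment would fail over to, the table above can be transcribed into a simple lookup. This is only a sketch: the dictionary and function names are hypothetical, and the region pairs should be checked against the current table, because Redwood determines them.

```python
# Primary -> secondary region pairs, transcribed from the table above.
SECONDARY_REGION = {
    "Dublin": "Paris",
    "Oregon": "Ohio",
    "Ohio": "Oregon",
    "Frankfurt": "Zurich",
    "Sydney": "Melbourne",
    "Singapore": "Sydney",
}


def secondary_region(primary: str) -> str:
    """Return the designated secondary region for a primary region."""
    try:
        return SECONDARY_REGION[primary]
    except KeyError:
        raise ValueError(f"no secondary region listed for {primary!r}") from None


print(secondary_region("Frankfurt"))  # Zurich
```

Note that Oregon and Ohio back each other up, while Sydney is both a primary region and the secondary region for Singapore.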
If a disaster occurs (a sustained, region-wide AWS outage with no ETA or a long ETA), all environments are brought up in the designated secondary AWS region. Because the secondary region is dedicated, the database and files are automatically synchronized to it, which minimizes the loss of data and job processing.
For details on RTO and RPO times, refer to the Redwood Support Guide.