Automated in-AWS Failback for AWS Elastic Disaster Recovery

November 27, 2022

Steve Roberts

I first covered AWS Elastic Disaster Recovery (DRS) in a 2021 blog post. In that post, I described how DRS “enables customers to use AWS as an elastic recovery site for their on-premises applications without needing to invest in on-premises DR infrastructure that lies idle until needed. Once enabled, DRS maintains a constant replication posture for your operating systems, applications, and databases.” I’m happy to announce that, today, DRS now also supports in-AWS failback, adding to the existing support for non-disruptive recovery drills and on-premises failback included in the original release.

I also wrote in my earlier post that drills are an important part of disaster recovery since, if you don’t test, you simply won’t know for sure that your disaster recovery solution will work properly when you need it to. However, customers rarely like to test because it’s a time-consuming activity and also disruptive. Automation and simplification encourage frequent drills, even at scale, enabling you to be better prepared for disaster, and now you can use them irrespective of whether your applications are on-premises or in AWS. Non-disruptive recovery drills provide confidence that you will meet your recovery time objectives (RTOs) and recovery point objectives (RPOs) should you ever need to initiate a recovery or failback. More information on RTOs and RPOs, and why they’re important to define, can be found in the recovery objectives documentation.

The new automated support provides a simplified and expedited experience to fail back Amazon Elastic Compute Cloud (Amazon EC2) instances to the original Region, and both failover and failback processes (for on-premises or in-AWS recovery) can be conveniently started from the AWS Management Console. Also, for customers that want to customize the granular steps that make up a recovery workflow, DRS provides three new APIs, linked at the bottom of this post.

Failover vs. Failback
Failover is switching the running application to another Availability Zone, or even a different Region, should outages or issues occur that threaten the availability of the application. Failback is the process of returning the application to the original on-premises location or Region. For failovers to another Availability Zone, customers who are agnostic to the zone may continue running the application in its new zone indefinitely if so required. In this case, they will reverse the recovery replication, so the recovered instance is protected for future recovery. However, if the failover was to a different Region, its likely customers will want to eventually fail back and return to the original Region when the issues that caused failover have been resolved.

The below images illustrate architectures for in-AWS applications protected by DRS. The architecture in the image below is for cross-Availability Zone scenarios.

The architecture diagram below is for cross-Region scenarios.

Let’s assume an incident occurs with an in-AWS application, so we initiate a failover to another AWS Region. When the issue has been resolved, we want to fail back to the original Region. The following animation illustrates the failover and failback processes.

Learn more about in-AWS failback with Elastic Disaster Recovery
As I mentioned earlier, three new APIs are also available for customers who want to customize the granular steps involved. The documentation for these can be found using the links below.

The new in-AWS failback support is available in all Regions where AWS Elastic Disaster Recovery is available. Learn more about AWS Elastic Disaster Recovery in the User Guide. For specific information on the new failback support I recommend consulting this topic in the service User Guide.

— Steve

Automated in-AWS Failback for AWS Elastic Disaster Recovery

Introducing resource control policies (RCPs), a new type of authorization policy in AWS Organizations

AWS BuilderCards second edition at re:Invent 2024

AWS Weekly Roundup: 20 years of AWS News Blog, Express brokers for Amazon MSK, Windows Server 2025 images on EC2, and more (Nov 11, 2024)

Announcing new APIs for Amazon Location Service Routes, Places, and Maps

Introducing Express brokers for Amazon MSK to deliver high throughput and faster scaling for your Kafka clusters

Introducing the last cohort of AWS Heroes this year – November 2024

AWS Weekly Roundup: AWS Lambda, Amazon Bedrock, Amazon Redshift, Amazon CloudWatch, and more (Nov 4, 2024)

Fine-tuning for Anthropic’s Claude 3 Haiku model in Amazon Bedrock is now generally available

Unlock the potential of your supply chain data and gain actionable insights with AWS Supply Chain Analytics

Amazon Aurora PostgreSQL Limitless Database is now generally available

Simplify and enhance Amazon S3 static website hosting with AWS Amplify Hosting

Celebrating 10 Years of Amazon ECS: Powering a Decade of Containerized Innovation

AWS Weekly Roundup: New code editor in AWS Lambda console, Amazon Q Business analytics, Claude 3.5 upgrades, and more (October 28, 2024)

EC2 Image Builder now supports building and testing macOS images

Announcing three new capabilities for the Claude 3.5 model family in Amazon Bedrock

AWS Weekly Roundup: Agentic workflows, Amazon Transcribe, AWS Lambda insights, and more (October 21, 2024)

Amazon Aurora PostgreSQL and Amazon DynamoDB zero-ETL integrations with Amazon Redshift now generally available

AWS Weekly Roundup: What’s App, AWS Lambda, Load Balancers, AWS Console, and more (Oct 14, 2024)

Convert AWS console actions to reusable code with AWS Console-to-Code, now generally available

AWS Weekly Roundup: HIPAA eligible with Amazon Q Business, Amazon DCV, AWS re:Post Agent, and more (Oct 07, 2024)