10 Tips for Developing an AWS Disaster Recovery Plan

AWS Disaster Recovery

Even a brief interruption of service can spell disaster for an organization, costing thousands of dollars in lost data. A disaster can be caused by a security attack, a natural disaster, or human error. Business continuity is critical for any company in the cloud. A solid disaster recovery plan helps organizations stay up in the event of failure or attack.

One of the leading cloud vendors, Amazon Web Services (AWS), provides its users with features to help them build their own disaster recovery solution. In this article, I explain what a Disaster Recovery Plan (DRP) for AWS is and offer 10 tips for using the functions in your AWS console to prevent and recover from a disaster.

AWS Disaster Recovery Plan Overview

A Disaster Recovery Plan (DRP) is a structured, detailed set of instructions for recovering systems and networks in the event of failure or attack, with the aim of getting the organization operational again as fast as possible.

Deploying an on-premises disaster recovery solution usually involves high implementation and maintenance costs. Therefore, many companies leverage the disaster recovery tools and solutions provided by their cloud vendors, such as AWS or Azure. These solutions may also be offered by third-party vendors. For example, AWS partners with companies such as N2WS and Cloudberrylab that offer disaster recovery solutions tailored to AWS.

AWS users can derive several benefits from developing a recovery plan and having it ready, such as:

  •  protecting critical data by establishing replication intervals
  •  minimizing downtime
  •  using AWS cross-region disaster recovery
  •  retrieving files and data quickly, thus restoring operations sooner

10 Tips For Developing an AWS Disaster Recovery Plan

Which resources form the core of your business? A Business Impact Analysis (BIA) can show you which areas would be most affected in the event of a threat. It can also help you anticipate the potential impact of a disaster on operations.

You should know how much system downtime your organization can afford before suffering irreparable monetary losses. Therefore, calculating your recovery time objective (RTO) is critical for a successful recovery plan. You also need to calculate how much data loss your organization can absorb before incurring too much damage; that is the recovery point objective (RPO). For example, if losing 4 hours of data would cause too much damage, then you need to plan for an RPO of well under 4 hours.
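To make the RPO arithmetic concrete, here is a minimal sketch that checks whether a backup schedule fits within an RPO. It assumes a backup captures the state at the moment it starts and is unusable until it finishes; the function names and the sample durations are illustrative, not part of any AWS API.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta, backup_duration: timedelta) -> timedelta:
    """Worst case: the failure strikes just before the next backup finishes,
    so the newest usable backup is one full interval plus one backup run old."""
    return backup_interval + backup_duration


def meets_rpo(backup_interval: timedelta, backup_duration: timedelta, rpo: timedelta) -> bool:
    """True if the worst-case data loss stays within the RPO."""
    return worst_case_data_loss(backup_interval, backup_duration) <= rpo


rpo = timedelta(hours=4)
# Backing up every 4 hours is not enough for a 4-hour RPO once backup time is counted.
print(meets_rpo(timedelta(hours=4), timedelta(minutes=20), rpo))  # False
print(meets_rpo(timedelta(hours=1), timedelta(minutes=20), rpo))  # True
```

This is why the backup interval usually has to be noticeably shorter than the RPO itself.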

There are four main recovery methods you can choose from according to your organization’s requirements and preferences:

  •  Backup and restore: you can use a managed solution to back up and restore data as needed. However, the restoration can consume a lot of time and resources as the system does not keep data on standby.
  •  Pilot light: keep a core of critical applications and data running to enable quick recovery in the event of a disaster.
  •  Warm standby: this involves duplicating the system’s core elements and keeping them running on standby at all times. In the event of a disaster, this duplicate can be promoted to primary to maintain operations.
  •  Multi-site: make a full replica of the data and applications, deploying it in two or more active locations. You can then split the traffic between them, so in the event of a disaster, the system simply reroutes everything to an undamaged region.
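One way to reason about the choice is to pick the cheapest method that still meets your RTO. The sketch below does that; the "typical RTO" figures are rough, illustrative assumptions for the sake of the example, not AWS guarantees, and you should replace them with numbers measured in your own tests.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class Strategy:
    name: str
    typical_rto: timedelta  # illustrative assumption, not a measured or guaranteed value


# Ordered from cheapest and slowest to most expensive and fastest.
STRATEGIES = [
    Strategy("backup and restore", timedelta(hours=24)),
    Strategy("pilot light", timedelta(hours=1)),
    Strategy("warm standby", timedelta(minutes=10)),
    Strategy("multi-site", timedelta(minutes=1)),
]


def cheapest_strategy(target_rto: timedelta) -> Optional[Strategy]:
    """Return the first (cheapest) strategy whose typical RTO fits the target,
    or None if even the fastest one is too slow."""
    for strategy in STRATEGIES:
        if strategy.typical_rto <= target_rto:
            return strategy
    return None


print(cheapest_strategy(timedelta(hours=2)).name)  # pilot light
```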

For example, you can implement detective measures such as server and network monitoring software. Corrective measures, such as remediation tools, can help restore a system after a disaster.

Scheduling tests while developing your DRP can help you catch flaws before you need to implement the plan. This helps ensure your plan works smoothly before a disaster or threat occurs.

You should update your plan on a regular basis to keep up with system changes. In the aftermath of an incident, updating the plan forms part of the lessons learned, refining it to prevent further attacks or failures.

Scheduling regular backups of what you have stored on Amazon EC2 instances and EBS volumes may not be enough to face a disaster. You also need quick access to the data in the event of a disaster. A detailed and up-to-date AWS disaster recovery plan can help you recover and restore the backup data from the cloud environment with minimal downtime.

While developing your plan, you need to decide where the critical data will be stored. To avoid having your entire system knocked offline, you should distribute the data across different Availability Zones (AZs) and, to survive a regional outage, across AWS Regions in different parts of the world.

For example, you can use cross-region replication for S3. S3 duplicates the data to multiple locations within a region by default, which provides high durability. However, this does not eliminate the risk of data loss in a given region. To reduce that risk, you can use the cross-region replication option, which automates copying the data to a designated bucket in another region.
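As a rough sketch of what enabling S3 cross-region replication can look like with boto3: the helper below builds a replication configuration and applies it through a client you pass in (for example `boto3.client("s3")`). The bucket names, role ARN, and rule ID are hypothetical placeholders. It assumes the destination bucket already exists in another region with versioning enabled, and that the IAM role allows S3 to replicate on your behalf.

```python
def build_replication_config(role_arn: str, destination_bucket: str) -> dict:
    """Build a replication configuration that copies every object to the destination bucket."""
    return {
        "Role": role_arn,
        "Rules": [
            {
                "ID": "dr-replicate-all",  # hypothetical rule name
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix matches all objects
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": f"arn:aws:s3:::{destination_bucket}"},
            }
        ],
    }


def enable_replication(s3_client, source_bucket: str, role_arn: str, destination_bucket: str) -> None:
    """Turn on versioning for the source bucket, then attach the replication rule.

    Replication needs versioning on both buckets; this sketch only handles the source
    and assumes the destination was prepared separately.
    """
    s3_client.put_bucket_versioning(
        Bucket=source_bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )
    s3_client.put_bucket_replication(
        Bucket=source_bucket,
        ReplicationConfiguration=build_replication_config(role_arn, destination_bucket),
    )
```

Passing the client in as an argument keeps the helper easy to try against a stub before pointing it at a real account.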

You can also use global tables in DynamoDB to deploy a multi-region, multi-master database. Changes written to a table in one region are propagated to its replica tables in the other regions. Since the data is distributed across different regions, the risk of data loss is minimized.
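For DynamoDB, adding a replica region to an existing table can be sketched as below. This assumes the current version of global tables, where replicas are managed through `update_table`, and that the table already meets the prerequisites for replication (check the DynamoDB documentation for your table's settings). The table and region names in the usage comment are hypothetical.

```python
def add_replica_region(dynamodb_client, table_name: str, replica_region: str) -> dict:
    """Ask DynamoDB to create a replica of the table in another region."""
    return dynamodb_client.update_table(
        TableName=table_name,
        ReplicaUpdates=[{"Create": {"RegionName": replica_region}}],
    )


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("dynamodb", region_name="us-east-1")
#   add_replica_region(client, "orders", "eu-west-1")
```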

Needless to say, you should keep your root passwords and credentials secure and hidden from unauthorized users, and even disable programmatic access keys once they are no longer needed, to prevent internal threats. Setting up multi-factor authentication (MFA) helps ensure that administrator and programmatic privileges don’t fall into malicious hands.
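For IAM users, one common way to enforce MFA is a policy that denies everything except MFA self-service actions whenever the request was not authenticated with MFA. The sketch below builds such a policy document as a Python dictionary; the statement ID and the exact list of exempted actions are assumptions you should adapt to your own account. Note that the root user is not governed by IAM policies, so its MFA has to be enabled separately.

```python
import json


def require_mfa_policy() -> dict:
    """Build an IAM policy that blocks most actions unless the caller signed in with MFA."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAllExceptListedIfNoMFA",  # illustrative statement ID
                "Effect": "Deny",
                # These actions stay allowed so a user can still set up an MFA device.
                "NotAction": [
                    "iam:CreateVirtualMFADevice",
                    "iam:EnableMFADevice",
                    "iam:GetUser",
                    "iam:ListMFADevices",
                    "iam:ListVirtualMFADevices",
                    "iam:ResyncMFADevice",
                    "sts:GetSessionToken",
                ],
                "Resource": "*",
                "Condition": {
                    "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
                },
            }
        ],
    }


print(json.dumps(require_mfa_policy(), indent=2))
```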

While it may be tempting to implement all steps of a disaster recovery plan in-house, smaller companies lacking a dedicated IT team may find it easier to use a third-party solution. Disaster-recovery-as-a-service companies help organizations develop, implement, and maintain their DRPs, enabling them to focus on growing their businesses.

AWS Disaster Recovery Options

Let’s say you migrated to the cloud using the rehosting method and you use EC2 instances for your application. There are several ways to begin leveraging AWS functions to develop a DR plan:

  •  EBS snapshots: allow you to make incremental backups of an EBS volume.
  •  Amazon Machine Image (AMI): works similarly to an EBS snapshot, but also contains metadata for the EC2 instance, which allows the entire EC2 instance to be restored.
  •  AWS Lambda: a serverless product that allows you to run code outside the application environment while still accessing AWS resources. You can use Lambda to automate tasks such as EBS snapshots.
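As an illustration of the Lambda option, here is a minimal sketch of a function that snapshots every EBS volume carrying a given tag. The tag key and value (`Backup=true`) are hypothetical conventions, the sketch ignores pagination of `describe_volumes`, and the function's role is assumed to have permission to describe volumes and create snapshots. You would typically trigger it on a schedule.

```python
from datetime import datetime, timezone


def snapshot_tagged_volumes(ec2_client, tag_key: str = "Backup", tag_value: str = "true") -> list:
    """Create a snapshot of every EBS volume that carries the given tag."""
    response = ec2_client.describe_volumes(
        Filters=[{"Name": f"tag:{tag_key}", "Values": [tag_value]}]
    )
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%MZ")
    snapshot_ids = []
    for volume in response["Volumes"]:
        snapshot = ec2_client.create_snapshot(
            VolumeId=volume["VolumeId"],
            Description=f"DR snapshot of {volume['VolumeId']} at {stamp}",
        )
        snapshot_ids.append(snapshot["SnapshotId"])
    return snapshot_ids


def lambda_handler(event, context):
    """Lambda entry point: snapshot all tagged volumes in the function's region."""
    import boto3  # imported here so the helper above can be tested without boto3

    return {"snapshots": snapshot_tagged_volumes(boto3.client("ec2"))}
```

Keeping the snapshot logic in a separate helper that takes the client as a parameter makes it possible to exercise it with a stub before deploying.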

Summary

Developing and implementing a disaster recovery plan for AWS requires a certain degree of ingenuity since AWS does not offer its own DR solution. However, the platform enables users to build a customized DR solution by repurposing some of the platform’s features and tools. In this article, I’ve aimed to give you some tips and tools to develop your own disaster recovery plan using the AWS environment.

Reference

https://medium.com/@eddies_47682/10-tips-for-developing-an-aws-disaster-recovery-plan-a708f899a442

About VTI Cloud

VTI Cloud is an Advanced Consulting Partner of AWS Vietnam with a team of more than 50 AWS certified solution engineers. With the desire to support customers in the journey of digital transformation and migration to the AWS cloud, VTI Cloud is proud to be a pioneer in consulting solutions, developing software, and deploying AWS infrastructure to customers in Vietnam and Japan.

Building safe, high-performance, flexible, and cost-effective architectures for customers is VTI Cloud’s leading mission in enterprise technology.

In addition, VTI Cloud supports building the VIET-AWS community. This group is one of the fast-growing AWS User Groups and is officially recognized by Amazon in the Asia Pacific (Vietnam) region.

VIET-AWS is a place where Solutions Architects, DevOps and SysOps engineers, and students new to the cloud computing services of Amazon Web Services (AWS) can connect and support one another. Join VTI Cloud to join VIET-AWS: https://www.facebook.com/groups/vietawscommunity