High availability is a fundamental requirement when building software solutions in a cloud environment. Traditionally, high availability has been a very costly affair, but with AWS you can combine a number of services to achieve high availability, or potentially an “always available” scenario. In this blog, I’ll share some tips for attaining high availability for an application deployed on Amazon Web Services.

Before starting, let us see how the AWS Infrastructure is organized across the globe.

1. AWS is spread across the world in multiple geographical locations called Regions. Refer to DIAG1 – REGIONS below.
2. Within each region, there are several isolated locations called Availability Zones. An Availability Zone can be thought of as one or more separate data centers within a region.

<DIAG1 – REGIONS>

TIP1: USE MULTIPLE AVAILABILITY ZONES

To build a highly available environment, the most commonly recommended approach is to spread your deployment across multiple Availability Zones.

Consider a web application that needs to be highly available. A simple deployment model is to place an Elastic Load Balancer in front of a cluster of web application servers. Refer to DIAG2 – WEB TIER below.

One important point to note is: “The cluster should be spread across multiple Availability Zones. Availability Zones are fault-isolated data centers, so a failure in one zone does not take down the others, which in turn maximizes uptime for your web application tier.”
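To make the benefit concrete, here is a minimal sketch in plain Python (no AWS calls) that spreads instances across zones round-robin and counts how many survive when one zone fails. The zone names and the helper functions `spread_across_zones` and `surviving_capacity` are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import cycle


def spread_across_zones(instance_count, zones):
    """Assign instances to Availability Zones round-robin."""
    if not zones:
        raise ValueError("at least one Availability Zone is required")
    zone_cycle = cycle(zones)
    return [next(zone_cycle) for _ in range(instance_count)]


def surviving_capacity(placement, failed_zone):
    """Count the instances still running after one zone fails."""
    return sum(1 for zone in placement if zone != failed_zone)


# Hypothetical zone names, used only for illustration.
zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
placement = spread_across_zones(6, zones)

print(Counter(placement))                           # 2 instances per zone
print(surviving_capacity(placement, "us-east-1a"))  # 4 of 6 instances remain
```

With all six instances in a single zone, the same failure would leave zero instances running, which is exactly the situation this tip is meant to avoid.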

<DIAG2 – WEB TIER>

TIP2: AVOID SINGLE POINT OF FAILURE

Let us extend the above scenario: the web application is now integrated with a database. The availability of the whole application now depends on the database. What if it goes down? The web tier may still be running, but the application as a whole becomes unavailable.

A simple way to keep the database tier highly available is to set up a failover cluster for the database, which Amazon RDS supports natively through Multi-AZ deployment. With Multi-AZ, RDS maintains a standby copy of the database in a different Availability Zone and fails over to it if the primary becomes unavailable. This way we avoid a “Single Point of Failure” scenario.
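As a rough sketch of what requesting such a deployment could look like from Python, the snippet below builds the arguments for the boto3 `create_db_instance` call with `MultiAZ` enabled. The identifier, engine, instance class, storage size, and credentials are all placeholder assumptions, and the helper names `multi_az_db_params` and `create_database` are hypothetical. The actual AWS call is kept in a separate function because it needs boto3 installed and valid AWS credentials.

```python
def multi_az_db_params(identifier, master_username, master_password):
    """Build keyword arguments for a Multi-AZ RDS instance (illustrative values)."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "mysql",                 # assumed engine
        "DBInstanceClass": "db.t3.micro",  # assumed instance class
        "AllocatedStorage": 20,            # GiB, assumed size
        "MasterUsername": master_username,
        "MasterUserPassword": master_password,
        "MultiAZ": True,  # ask RDS to keep a standby in another Availability Zone
    }


def create_database(params):
    """Send the request with boto3; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    rds = boto3.client("rds")
    return rds.create_db_instance(**params)


# Placeholder values; in practice, load credentials from a secrets store.
params = multi_az_db_params("webapp-db", "admin", "change-me")
print(params["MultiAZ"])
```

Only the `MultiAZ` flag relates to this tip; everything else would be sized for your own workload.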

<DIAG3 – Database>

Some common points of failure and their mitigation:

FAILURE POINT              AWS SERVICE/SOLUTION
DNS/Domain Services        Route53
Load Balancer              Elastic Load Balancing
Web/Application Server     Auto-Scaling
Database Servers           Redundant nodes or clustering
Authentication             Redundant nodes
Data Center Failures       Use Multiple Availability Zones
Disaster                   Use Multiple Regions

TIP3: BUILD LOOSE COUPLING

A loosely coupled architecture is more fault tolerant, as components are not directly dependent on each other and the failure of one component does not bring the whole system down. Some of the key tricks for building loose coupling are:

• The application should be built from small individual modules. Each module should be a black box, fairly independent of the others.
• Use queues to pass messages between these black box components. Use AWS services like Simple Workflow Service (SWF), Simple Queue Service (SQS), Simple Notification Service (SNS), and CloudWatch.
• Decouple the components by putting a load balancer between clusters. Use AWS services like Elastic Load Balancer, Route53, etc.
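The queue idea can be sketched with Python's standard library. Here an in-process `queue.Queue` stands in for a service such as SQS, and the `drain` and `bill` functions and the order data are hypothetical. The point is that the producer never calls the consumer directly, and a message whose processing fails is put back on the queue for a later retry instead of being lost.

```python
import queue


def drain(work_queue, handler, max_attempts=3):
    """Process every message; put failed ones back for a later retry."""
    processed, dead = [], []
    while True:
        try:
            body, attempts = work_queue.get_nowait()
        except queue.Empty:
            break
        try:
            processed.append(handler(body))
        except Exception:
            if attempts + 1 < max_attempts:
                work_queue.put((body, attempts + 1))  # retry later
            else:
                dead.append(body)  # give up, keep for inspection
    return processed, dead


# Producer side: only knows about the queue, not about the consumer.
orders = queue.Queue()
for order_id in (1, 2, 3):
    orders.put((order_id, 0))

failures = {2: 1}  # hypothetical: order 2 fails once, then succeeds


def bill(order_id):
    """Consumer side: a stand-in for a downstream billing module."""
    if failures.get(order_id, 0) > 0:
        failures[order_id] -= 1
        raise RuntimeError("billing service unavailable")
    return f"billed order {order_id}"


processed, dead = drain(orders, bill)
print(processed)
print(dead)
```

Order 2 fails on its first attempt, goes back on the queue, and is billed after order 3, so a temporary failure in one module delays a message without bringing down the rest of the flow.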

TIP4: IMPLEMENT ELASTICITY

Once we achieve a great level of loose coupling, we should plan for failures of any individual component of the overall system.

We must accept that these components will go down sooner or later. Some of the key tricks for recovering from these failures are:

• Use AWS services like Elastic Compute Cloud (EC2), Auto-Scaling, CloudWatch, and Elastic Load Balancer to quickly start new EC2 nodes under high load, so that the overall application does not go down.
• Use Amazon Machine Images, bootstrapping scripts, and CloudFormation to quickly build a new environment.
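To illustrate the kind of decision an elastic setup makes, here is a simplified proportional scaling rule in plain Python. It is a sketch of the general idea only, not the exact algorithm AWS Auto-Scaling uses. The function name `desired_capacity`, the 50% CPU target, and the fleet bounds are assumptions made for this example.

```python
import math


def desired_capacity(current, avg_cpu, target_cpu=50.0, minimum=2, maximum=10):
    """Size the fleet so average CPU moves toward the target, within bounds."""
    if current < 1:
        return minimum
    wanted = math.ceil(current * avg_cpu / target_cpu)
    return max(minimum, min(maximum, wanted))


print(desired_capacity(current=4, avg_cpu=90.0))  # 8: scale out under load
print(desired_capacity(current=4, avg_cpu=10.0))  # 2: scale in, but keep the minimum
print(desired_capacity(current=8, avg_cpu=95.0))  # 10: capped at the maximum
```

Keeping the minimum at two or more means that even when the fleet is scaled in, the instances can still be placed in different Availability Zones.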

I hope these suggestions go a long way toward helping you set up highly available systems on AWS.

I would appreciate your feedback!