Finding the optimal Availability Zone for your Microsoft Azure subscription

@20aman    Dec 12, 2021

When you start designing your solutions in Microsoft Azure for high availability, you think of Availability Zones. Within a region, these are physically separate locations that are tolerant to local failures. For your business-critical workloads, availability zones help you achieve resiliency and reliability.

Need for identifying the right numbers

The biggest caveat with Availability Zones is that the zone numbers don't correspond to a fixed physical location within a region. The numbers are assigned when you create a subscription and are different for each subscription (even within the same tenant). Even within a subscription, the mapping is different for each region.

Let's look at an example. Assume you have two subscriptions that both use the "East US" region: one for Dev and another for Prod. Generally, you will see three availability zones, numbered 1, 2, and 3. The location referred to as number 1 in the Dev subscription may or may not be the same location that is referred to as number 1 in the Prod subscription.

Therefore, it is important to find the optimal zones separately for every subscription and every region.
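To make the numbering problem concrete, here is a minimal sketch. The mapping below is entirely hypothetical (the labels "physical-A", "physical-B" and "physical-C" are made up for illustration and are not real Azure identifiers). It only shows that the same zone number can point to different physical locations in two subscriptions.

```python
# Hypothetical logical-to-physical zone mappings for two subscriptions
# in the same region. The physical labels are invented for illustration.
zone_map = {
    "dev":  {"1": "physical-A", "2": "physical-B", "3": "physical-C"},
    "prod": {"1": "physical-C", "2": "physical-A", "3": "physical-B"},
}


def same_physical_zone(sub_a, zone_a, sub_b, zone_b):
    """Return True if two logical zones resolve to the same physical location."""
    return zone_map[sub_a][zone_a] == zone_map[sub_b][zone_b]


# Zone 1 in Dev and zone 1 in Prod are different places in this made-up mapping.
print(same_physical_zone("dev", "1", "prod", "1"))
# Zone 1 in Dev happens to match zone 2 in Prod.
print(same_physical_zone("dev", "1", "prod", "2"))
```

With this made-up mapping, a workload pinned to "zone 1" in both subscriptions would actually sit in two different physical locations.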

How to define optimal Availability Zones

The optimal Availability Zone is the one that scores best on the following two factors:

  1. Latency - It should have the lowest latency
  2. Bandwidth - It should have the highest bandwidth

Note that these factors are important when deploying the solution in highly available designs. Another concept related to HA design is identifying the primary and the secondary zones.

  • Selecting the primary Availability Zone - The zone with the lowest latency and highest bandwidth should become your primary Availability Zone.
  • Selecting the secondary Availability Zone - The next best availability zone becomes the secondary zone.

Determining the Optimal Availability Zones

To determine the optimal Availability Zones, deploy 3 VMs in your subscription, each in a different Availability Zone. Then run the latency and bandwidth tests between each pair of VMs. Also, make sure to run the tests in both directions, e.g. from VM1 to VM2 and then from VM2 to VM1.
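As a rough sketch of the test plan described above, the snippet below lists every directed VM pair that needs to be measured. The VM names and zone assignments are placeholders, and the actual latency and bandwidth tools are not shown here.

```python
from itertools import permutations

# Hypothetical VM names mapped to the zone each one is deployed in.
vms = {"vm1": 1, "vm2": 2, "vm3": 3}

# permutations(..., 2) yields ordered pairs, so each pair appears in
# both directions: (vm1, vm2) and (vm2, vm1) are separate tests.
test_plan = list(permutations(vms, 2))

for src, dst in test_plan:
    print(f"{src} (zone {vms[src]}) -> {dst} (zone {vms[dst]})")
```

Three VMs give six directed tests, which is why the report has a value for each direction between every two zones.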

Automated Script

The whole process for finding the optimal availability zones has been automated. This is now as easy as running the script from the below source:

The script should be run from a VM within the Virtual Network of the subscription (for which you are trying to identify the optimal zones). This script automatically creates 3 VMs in 3 different availability zones and then runs the latency and bandwidth tests. In the end, it deletes the VMs and also provides you with a detailed report.

The report looks something like this:

Test Report

The top table shows latency in microseconds. The lower the number, the better. As you can see, zone 3 to zone 1 has the lowest latency at 56.1 microseconds. Bandwidth is shown in the bottom table and is measured in MB transferred per second. The higher the number, the better. As you can see, zone 1 to zone 3 and zone 3 to zone 2 both show the highest value of 478 MB/sec. If you take both latency and bandwidth into consideration, then zone 1 and zone 3 are the best combination for this subscription.
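One possible way to combine the two tables is sketched below. The numbers are illustrative placeholders, not the values from the report, and the ranking rule (average both directions, sort by lowest latency, break ties by highest bandwidth) is an assumption. The script itself may weigh the factors differently.

```python
from itertools import combinations

# Illustrative placeholder measurements keyed by (source_zone, destination_zone).
latency_us = {
    (1, 2): 70.0, (2, 1): 72.0,
    (1, 3): 58.0, (3, 1): 56.0,
    (2, 3): 66.0, (3, 2): 64.0,
}
bandwidth_mb_per_s = {
    (1, 2): 450.0, (2, 1): 455.0,
    (1, 3): 478.0, (3, 1): 470.0,
    (2, 3): 460.0, (3, 2): 478.0,
}


def pair_summary(a, b):
    """Average both directions for one unordered zone pair."""
    lat = (latency_us[(a, b)] + latency_us[(b, a)]) / 2
    bw = (bandwidth_mb_per_s[(a, b)] + bandwidth_mb_per_s[(b, a)]) / 2
    return lat, bw


def rank_pairs(zones):
    """Sort zone pairs by lowest latency, breaking ties by highest bandwidth."""
    summaries = {pair: pair_summary(*pair) for pair in combinations(zones, 2)}
    return sorted(summaries, key=lambda p: (summaries[p][0], -summaries[p][1]))


ranking = rank_pairs([1, 2, 3])
print("Best pair:", ranking[0])
```

With these placeholder values the best pair comes out as zones 1 and 3, followed by zones 2 and 3.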

Also, this test should be run at least 3 times and at different times of the day to get more in-depth and accurate results.
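If you do repeat the test, you need some way to combine the runs. Below is a minimal sketch that takes the median per link, so a single noisy run does not skew the result. The sample values are made up, and using the median is a choice made here rather than something the script does.

```python
from statistics import median

# Made-up latency samples (microseconds) from three separate runs,
# keyed by (source_zone, destination_zone).
runs = [
    {(1, 3): 58.0, (3, 1): 56.0},
    {(1, 3): 61.0, (3, 1): 57.5},
    {(1, 3): 57.0, (3, 1): 90.0},  # one outlier on the 3 -> 1 link
]


def median_per_link(runs):
    """Combine several runs into one value per link using the median."""
    links = runs[0].keys()
    return {link: median(run[link] for run in runs) for link in links}


summary = median_per_link(runs)
print(summary)
```

The outlier of 90.0 on the zone 3 to zone 1 link has no effect on the median, whereas a plain average would have been pulled up by it.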

Changing an availability zone later is a tedious task. Hopefully, this post helps you determine the right availability zones from the start and lay a good foundation for your environment.




