Secret revealed! Who is faster: AWS or Azure?

source: http://www.freeimages.com/photo/rollercoaster-2-1458004

Until now, there has been no simple way to compare one public cloud service with another: who gives you more bang for the buck, and who won’t pull a bait and switch on you? The former refers to the instantaneous performance of a server, and the latter to its performance over time, which can feel like a roller coaster ride.

Enter Meghafind’s scouts, which compare one server with another, anywhere in a private or public cloud, using three simple metrics: CPU, memory and storage performance. Furthermore, since the jobs running on a server in a multi-tenant public cloud keep changing, there is no guarantee that a server that is faster now will remain so in the near future. We have already reported such fluctuations in a cloudy world in previous blogs.

We first ran megaApp in measurement mode on a low-end T2 Micro server in AWS EC2, and compared its output with that of the lowest-end server available in Azure. To compare apples to apples, both machines ran Ubuntu, on the free VM rentals that both cloud vendors offer for a limited period. To our amazement, AWS came out 2X ahead. We repeated the test several times just to be sure, and the results remained the same, as shown in Figure 1 below.

Figure 1: Meghafind Scout scores, higher is better.

Then, after establishing a performance baseline on Azure’s virtual machine, we decided to observe its performance over time. Meghafind enables this with a monitor_loop option that puts negligible load on the server; no other job was run. As shown in Figure 2 below, performance fluctuated wildly. The reason is the noisy neighbor problem, wherein another user’s job running in a separate VM on the same physical server uses shared resources, causing unexpected latency for our test VM running megaApp scouts.

Figure 2: Noisy neighbor on an Azure Linux VM.
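The idea of watching a server over time with a lightweight probe can be illustrated with a small sketch. This is not Meghafind's monitor_loop implementation, whose internals are not described here; the names `cpu_probe`, `sample_probe` and `slowdown` are hypothetical, and the probe is just a small fixed CPU workload timed repeatedly and compared with a baseline.

```python
import time


def cpu_probe(iterations=200_000):
    """Hypothetical probe: a small, fixed CPU workload."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total


def sample_probe(probe, samples=5, interval_s=1.0,
                 clock=time.perf_counter, sleep=time.sleep):
    """Run `probe` `samples` times and return the elapsed seconds of each run.

    `clock` and `sleep` are injectable so the loop can be exercised
    without real waiting.
    """
    readings = []
    for _ in range(samples):
        start = clock()
        probe()
        readings.append(clock() - start)
        sleep(interval_s)
    return readings


def slowdown(readings, baseline_s):
    """Express each reading as a multiple of the baseline time (1.0 = no slowdown)."""
    return [r / baseline_s for r in readings]


if __name__ == "__main__":
    readings = sample_probe(cpu_probe, samples=5, interval_s=0.5)
    baseline = min(readings)  # assumption: the fastest run approximates an idle server
    print(slowdown(readings, baseline))
```

On an otherwise idle VM, slowdown factors well above 1.0 that come and go between samples would be consistent with contention from other tenants, although this sketch alone cannot prove the cause.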

Variations became worse when maximum readings are considered instead of averages, as in Figure 3. As you can see, the performance issues are accentuated. Ideally, the scout performance line should be flat and green when no load is running on the server; otherwise, your job will get less performance because of noisy neighbors.

Figure 3: Worst case scout readings on an Azure VM.
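To see why maximum readings look worse than averages, here is a minimal sketch that groups readings into time windows and reports both statistics. It assumes readings are elapsed times, so higher means slower; the function name `summarize_windows` and the sample numbers are hypothetical and are not Meghafind data.

```python
from statistics import mean


def summarize_windows(readings, window):
    """Split readings into consecutive windows; return (average, maximum) per window."""
    summary = []
    for start in range(0, len(readings), window):
        chunk = readings[start:start + window]
        summary.append((mean(chunk), max(chunk)))
    return summary


if __name__ == "__main__":
    # Hypothetical readings in seconds: one spike in the second window.
    readings = [1.0, 1.0, 1.0, 5.0, 1.0, 1.0]
    for avg, worst in summarize_windows(readings, window=3):
        print(f"average={avg:.2f}s  worst={worst:.2f}s")
```

A single slow reading is diluted in the window average but fully visible in the maximum, which is why a worst-case plot accentuates the spikes.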

To diagnose the cause, we looked at the data tables in Meghafind’s repository, where individual scout readings are collected for later analysis and reporting. These show the individual resource loading levels and the slowdown for each type of scout.

Table 1: Details of Scout slow-downs on an Azure server

If you are interested in conducting a similar study for your private or public cloud servers, or need any other cloud performance analysis, please contact us here.