Tuesday, July 24, 2012

Why Network Latency Matters for Virtualized Applications

My colleague Russell Skingsley has an interesting take on the effects of latency on virtualized application performance. The purpose of virtualization is to optimize resource utilization, and as Russell pointed out, this isn't just an academic conversation: for the cloud hosting provider it's about revenue maximization. Network latency has a direct impact on virtualized application performance, and therefore on revenue for the service provider. Your choice of network infrastructure will affect your business, but it can be difficult to see how investing in high-performance networking will improve it. I will try to connect the dots.

Don't Sell It Just Once
It’s a service provider axiom that you don’t want to waste your valuable assets by selling them to a single customer, even at a premium. Take dark fiber, for instance: every SP has learned that selling a fiber to a single customer will never bring scalable revenues. Adding DWDM to the service mix lets you scale up the number of customers and is an improvement over selling dark fiber, but it is limited by the number of lambdas that fit in the spectrum of a single fiber. The scaling is linear. A similar situation exists for shared cloud computing resources.
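To make the linear-scaling point concrete, here is a rough back-of-the-envelope sketch in Python. The prices and channel count are illustrative assumptions, not quoted figures:

# Illustrative only: assumed monthly prices and a typical 40-channel
# C-band DWDM system. The point is the shape of the curve, not the numbers.
dark_fiber_monthly = 5_000        # assumed premium for one dedicated fiber
dwdm_channel_monthly = 1_000      # assumed price per wavelength service
channels_per_fiber = 40           # assumed lambda count for the system

dark_fiber_revenue = dark_fiber_monthly                     # one customer, sold once
dwdm_revenue = dwdm_channel_monthly * channels_per_fiber    # scales with channels

print(f"Dark fiber: ${dark_fiber_revenue:,}/month from one customer")
print(f"DWDM:       ${dwdm_revenue:,}/month across {channels_per_fiber} customers")

Revenue grows linearly with the channel count, and the fiber's usable spectrum caps how far that line can go.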

From Colo to Cloud
While many providers still seem content to sell co-location services from their data centers, they are effectively selling parking space for servers. They sell rack units and provide cooling, power, and network connectivity. Selling rack space like this is the equivalent of selling dark fiber: you sell it once and never see the benefits of incremental technology updates that enable selling a shared infrastructure. Once service providers see how they can increase their revenues by adopting virtualization, they look for ways to maximize the returns.

[Figure: potential revenue per rack unit, colo vs. virtualized services (revenueperrack.png)]

The way to do this is to maximize the number of virtual machines you can run on your infrastructure. If we graph the potential revenue per rack unit, the picture is rather startling and speaks for itself: no one should be selling rack space as colo services if they can sell premium virtualized services instead. However, in order to achieve these high revenues it is essential to have network underpinnings that will support premium services. A legacy multi-tier network might be fine for colo services, where there isn't virtualization and simple applications run one per server, but it will not cut it for virtual data center services. I'll explain why.
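As a hypothetical illustration of what that graph captures (all prices and consolidation ratios here are invented for the example):

rack_units = 42               # one standard rack
colo_price_per_ru = 50        # assumed $/RU/month for colo space
servers_per_rack = 40         # assumed 1U servers, leaving 2 RU for switching
vms_per_server = 20           # assumed consolidation ratio
vm_price = 100                # assumed $/VM/month

colo_revenue = rack_units * colo_price_per_ru
cloud_revenue = servers_per_rack * vms_per_server * vm_price

print(f"Colo:  ${colo_revenue:,}/rack/month")
print(f"Cloud: ${cloud_revenue:,}/rack/month ({cloud_revenue / colo_revenue:.0f}x)")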
The East-West Traffic Problem
To understand the importance of the network to all of this, let's take a look at a typical web-based multi-tenant application. Even a relatively simple application has many layers: a web server layer, an application layer, middleware processing, and a database server layer. Each of these layers runs processes that communicate with the others, which creates volumes of so-called East-West traffic before the data goes out to the users in the North-South direction. In a multi-tenant hosting data center model, each of these layers runs on virtual machines.

[Figure: a multi-tier web application and its East-West traffic (application.png)]


While it might look simple from the customer's point of view, the processes and the physical elements can be complex. vCloud, for example, will move virtual machines to load-balance physical server utilization without much regard for the network, so it is possible that all of the VMs in this simple application end up on different physical servers. This forces the various elements of the application to communicate across the network, where latency comes into play. Since there can be six strikes of the LAN during the processing of one transaction, the impact of latency can be a major hit on application performance.
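Here is a sketch of what one transaction looks like when vCloud has scattered the tiers across physical servers. The exact call chain is an assumption, but it shows where the six strikes come from:

# Each leg that leaves a physical server is one "strike of the LAN".
call_chain = [
    ("web", "app"),            # web tier calls the application tier
    ("app", "middleware"),     # application tier calls middleware
    ("middleware", "db"),      # middleware queries the database
    ("db", "middleware"),      # results flow back out
    ("middleware", "app"),
    ("app", "web"),
]

# Worst case: every tier landed on a different physical server,
# so every leg crosses the network.
lan_strikes = len(call_chain)
print(f"LAN strikes per transaction: {lan_strikes}")   # -> 6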

QFabric is the Platform
The need to minimize latency is what makes QFabric ideal for delivering virtualized applications. With its one-tier architecture, QFabric creates a flat network where any layer of the application can talk to any other layer in one hop, even across server racks. The diagram below shows the virtual application described above distributed across the data centre and connected by QFabric. Because QFabric's latency is low and deterministic, we can expect a maximum of 5 microseconds of latency under typical workloads for each strike of the LAN. That works out to a typical total of 30 microseconds of network overhead to deliver one transaction from this web application.

[Figure: the virtual application distributed across the data centre, connected by QFabric (qfabric.png)]
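The arithmetic behind that 30-microsecond figure is simply the per-strike latency multiplied out:

lan_strikes = 6              # from the transaction walk-through above
qfabric_strike_us = 5        # max per-strike latency under typical workloads

network_overhead_us = lan_strikes * qfabric_strike_us
print(f"Network overhead per transaction: {network_overhead_us} us")   # 30 us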


Challenges with the TRILL Pile
If you look at a typical competing system, the picture is not so great. Other data center solutions have an order of magnitude greater end-to-end latency than QFabric because of their multi-tier architecture based on TRILL: applications must communicate up and down the network tiers. Using the typical latency claims of our nearest competitor, such a design is roughly 20 times slower. This has a significant impact on the data center's overall performance in delivering virtualized applications: application performance suffers, fewer transactions can be delivered per second, and that ultimately places a ceiling on the number of VMs that can be sold from the data center.


[Figure: a multi-tier TRILL-based design, with traffic traversing the tiers (trillpile.png)]
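Putting the two designs side by side with the figures from the text, and treating network overhead as the serial bottleneck (a deliberate simplification), shows the throughput ceiling:

qfabric_overhead_us = 30                      # 6 strikes x 5 us, as above
trill_overhead_us = qfabric_overhead_us * 20  # ~20x slower per the claim above

for name, overhead_us in [("QFabric", qfabric_overhead_us),
                          ("Multi-tier TRILL", trill_overhead_us)]:
    tps_ceiling = 1_000_000 / overhead_us     # transactions/sec per serial flow
    print(f"{name}: {overhead_us} us -> ~{tps_ceiling:,.0f} transactions/sec")

Fewer transactions per second per flow means each VM does less useful work, which is exactly the ceiling on sellable VMs described above.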

Our Recommendation
1. Build a Juniper QFabric data center network.
2. Populate it with the maximum number of virtual servers.
3. Serve many more customers with high-performance applications.
4. Watch your revenues and profits increase.

As Russell says, “A Juniper DC network solution will enable orders of magnitude more revenue per rack unit for your data centre.” So the message is simple: give us the chance to POC it and we will prove it.

For more information see the QFabric page, link.

This article first appeared on my Juniper blog, see link.
