Clustering And Operational Cost

Posted by Optimation Editor on 11 May 2012

Clustering is a great deployment topology if your primary goal is to avoid service disruption. Effectively, it gives you two or more copies of not only your application but also (depending on your application and server) the customer state. This allows a customer to continue operating on another instance if the instance they were using becomes unavailable.

This reliability of service comes with some significant costs:

Increased hardware costs
Increased licensing costs
Increased network traffic
Increased operational management costs

The first three items are well understood in many circles and are therefore not part of the scope of this blog entry. The operational cost is one that is frequently overlooked.

Operating machines in a cluster can in fact reduce operational cost. Being able to take a node out of the cluster to perform maintenance on its hardware or OS without impacting customers gives you great flexibility to maintain systems during normal business hours. The problem comes when a new application or database version is to be deployed.

If you wish to be able to deploy a new version of your application when operating in a clustered environment, you'll need to carefully consider:

Whether the shared state between your instances is compatible between the old and new version
Whether the application server (if one is used) allows either parallel deployment of versions or deployment of a version to a single node at a time
Whether your release contains any disruptive database changes

Let's look at these three in detail.

With shared state between instances, the likelihood of problems arising at deployment time between the old and new versions depends on:

How much state is replicated
How volatile the structure of that state is
Whether you can maintain backwards compatibility of that state in your application code

With careful forethought and a backwards compatible implementation, this problem can be engineered around.
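As a rough illustration of what a backwards-compatible implementation of replicated state might look like (the field names, version tags, and default values here are all invented for the example), tagging the state with a version lets new code read state written by old nodes, filling in defaults for fields the old version never set:

```python
# Hypothetical example: replicated session state modelled as a dict,
# with a version tag so v2 code can read state written by v1 nodes.

OLD_STATE = {"version": 1, "cart_items": ["widget"]}  # written by a v1 node

def upgrade_state(state):
    """Accept session state written by any known version and return
    it in the current (v2) shape. Unknown versions fail loudly."""
    if state.get("version") == 1:
        # v1 never tracked a currency; supply a default rather than fail.
        return {"version": 2,
                "cart_items": list(state["cart_items"]),
                "currency": "NZD"}
    if state.get("version") == 2:
        return dict(state)
    raise ValueError(f"unsupported state version: {state.get('version')}")

session = upgrade_state(OLD_STATE)
print(session["currency"])  # an old session keeps working on the new version
```

The key design choice is that the new version owns the translation: old nodes never need to understand new state, which matters because they are the ones you cannot change.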

When it comes to deploying the new code, if your application server or runtime environment supports parallel deployment (e.g. WebLogic side-by-side deployment), you can run the new and old versions in parallel and direct new customer sessions to the new version. The old version gradually drains of sessions, becoming unused and removable. If, on the other hand, your runtime environment only allows deploying a new version across the entire cluster in one go, there is no way to avoid a brief service outage while the old version is undeployed and the new version is deployed and started.
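The "drain the old version" behaviour can be sketched as follows (this router and its version names are purely illustrative, not any real application server's API): existing sessions stick to the version that created them, new sessions land on the new version, and the old version is removable once no sessions remain on it:

```python
class VersionRouter:
    """Illustrative sticky-session router for side-by-side deployment."""

    def __init__(self, old_version, new_version):
        self.old, self.new = old_version, new_version
        self.sessions = {}  # session_id -> version serving that session

    def route(self, session_id):
        # Existing sessions stay where they are; new sessions always
        # go to the new version, so the old version slowly drains.
        if session_id not in self.sessions:
            self.sessions[session_id] = self.new
        return self.sessions[session_id]

    def end_session(self, session_id):
        self.sessions.pop(session_id, None)

    def old_version_removable(self):
        return self.old not in self.sessions.values()

router = VersionRouter("v1", "v2")
router.sessions["s1"] = "v1"           # a session that predates the rollout
print(router.route("s2"))              # new session -> "v2"
print(router.old_version_removable())  # False: s1 is still on v1
router.end_session("s1")
print(router.old_version_removable())  # True: v1 can now be undeployed
```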

Finally, care must be taken around database changes. If you expect to change your database structure often, a layer of abstraction between the physical schema and your code is a good idea. Views and/or stored procedures give you extra flexibility between your application code and the physical database, which in turn can make it possible for the old version to operate on an updated schema. This does, of course, come with a development complexity cost, and it makes the database less portable between DBMSs.
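As a small sketch of the view idea (the table and column names are invented for illustration, using SQLite for brevity), a physical column can be renamed under the new schema while a view preserves the shape the old application version still queries:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# New physical schema: the column was renamed from 'name' to 'full_name'.
conn.execute("CREATE TABLE customer_t (id INTEGER, full_name TEXT)")
conn.execute("INSERT INTO customer_t VALUES (1, 'Alice Example')")
# The view preserves the old shape, so the old application version's
# queries still work against the updated physical schema.
conn.execute("CREATE VIEW customer AS "
             "SELECT id, full_name AS name FROM customer_t")

# Query exactly as the old code version would have written it:
row = conn.execute("SELECT name FROM customer WHERE id = 1").fetchone()
print(row[0])  # -> Alice Example
```

The same technique works in most DBMSs, though view syntax and updatability rules differ, which is part of the portability cost mentioned above.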

So in conclusion, there is no hard and fast rule about whether clustering is a good idea. It depends on the application architecture and on what the business hopes to achieve through deploying the cluster. It is important to consider these tradeoffs before embarking on any software deployment, whether it is a greenfield build or an infrastructure upgrade. Ideally, consider them before a software package is designed or purchased.