When the performance of a web application starts to degrade, IT managers often jump to the conclusion that more hardware is required. That extra infrastructure can be expensive and may not even solve the underlying problem. Let’s talk about tuning as a route to higher scalability. It is debatable whether tuning is a science or an art, but we can all agree on the goal: alleviate bottlenecks so that the web application can scale to a higher workload. Tuning is more efficient and cost-effective than adding more hardware to your deployment, but you need the right tools and expertise in place to tune an environment successfully.

Hardware servers are restricted to their physical resources (I/O, memory, CPU, etc.). With OS tuning, there are certain buffers and network settings you can open up to increase overall capacity. Software servers are even more configurable because they manage their own thread pools, caches, memory, connection pools, and so on. All of these software resources ultimately run on a hardware server, but tuning a software server controls how much, or how little, advantage it takes of the hardware underneath it.
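
To make “software resources” concrete, here is a minimal sketch in plain Java (an assumed language choice, with invented numbers): a bounded worker pool whose thread counts, queue depth, and keep-alive time are exactly the kind of knobs a software server exposes on top of whatever hardware it runs on.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class TunableWorkerPool {
        public static void main(String[] args) {
            // Every number below is a tuning knob, not a recommendation:
            // more threads hand more of the CPU to this server, and a deeper
            // queue holds more requests in memory before rejecting new work.
            int coreThreads = 8;       // threads kept alive even when idle
            int maxThreads = 32;       // upper bound under load
            int queueDepth = 200;      // requests buffered before rejection
            long keepAliveSec = 60;    // how long surplus threads linger

            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                    coreThreads, maxThreads, keepAliveSec, TimeUnit.SECONDS,
                    new ArrayBlockingQueue<>(queueDepth));

            // Simulated requests; in a real server these arrive from sockets.
            for (int i = 0; i < 50; i++) {
                final int id = i;
                pool.submit(() -> System.out.println("handled request " + id
                        + " on " + Thread.currentThread().getName()));
            }
            pool.shutdown();
        }
    }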

Tuning can allow or restrict throughput to an environment. Imagine a funnel: the large opening is on top, the small opening is on the bottom. A properly tuned environment has more processing capacity on the front end (the large end of the funnel) and less dedicated processing at the back end. This approach is used for a couple of reasons: the front end of a deployment (the web server) typically does lighter-weight processing, while the back end (the database) does more CPU-intensive work against shared resources. You want to let as many requests as possible be processed on the front end without flooding the back end.
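
A minimal sketch of the funnel shape, again in plain Java with invented numbers: the front end accepts work on a wide thread pool, while a small semaphore gates access to a hypothetical database call so the back end is never flooded.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Semaphore;

    public class FunnelShapedCapacity {
        // Wide end of the funnel: many lightweight front-end workers.
        private static final ExecutorService frontEnd = Executors.newFixedThreadPool(100);
        // Narrow end: only a handful of requests may touch the database at once.
        private static final Semaphore dbPermits = new Semaphore(10);

        public static void main(String[] args) {
            for (int i = 0; i < 500; i++) {
                final int id = i;
                frontEnd.submit(() -> handle(id));
            }
            frontEnd.shutdown();
        }

        static void handle(int requestId) {
            // Cheap front-end work (parsing, caching, templating) runs freely...
            try {
                dbPermits.acquire();           // ...but the shared back end is gated
                try {
                    queryDatabase(requestId);  // placeholder for the expensive call
                } finally {
                    dbPermits.release();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        static void queryDatabase(int requestId) {
            // Stand-in for CPU-intensive, lock-heavy database work.
        }
    }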

Most people do not know where to begin the tuning process, so here are some tips to point you in the right direction. First, you will need a load-test solution that generates a realistic load and has embedded monitoring for your entire deployment. For starters, don’t change a single thing! Simply identify and document all the configurable settings for each software server in the environment. Pay particular attention to the settings that affect throughput by giving the server more or less processing power: for example, a web server’s worker threads, or the memory, threads, pools, and buffers of the database and application servers. Once you have all the settings documented, start executing your load tests and monitoring all of the configurable resources. Run the tests up to the point of degradation; don’t try to identify saturation points while the test is running, as that is far too complicated. Once degradation occurs, stop the test and graph your monitors. Identify the first major bottleneck, make a change to alleviate it, and rerun your test.
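
You would normally use a dedicated load-test product for this, but the sketch below shows the bare mechanics in plain Java, assuming a hypothetical local endpoint: fire a fixed number of concurrent requests, record each response time, and keep the raw samples so they can be graphed against your resource monitors after the run rather than analyzed while it is still going.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.List;
    import java.util.concurrent.CopyOnWriteArrayList;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class MinimalLoadTest {
        public static void main(String[] args) throws Exception {
            String target = "http://localhost:8080/";  // hypothetical application under test
            int concurrentUsers = 50;
            int requestsPerUser = 20;

            HttpClient client = HttpClient.newHttpClient();
            ExecutorService users = Executors.newFixedThreadPool(concurrentUsers);
            List<Long> latenciesMs = new CopyOnWriteArrayList<>();

            for (int u = 0; u < concurrentUsers; u++) {
                users.submit(() -> {
                    for (int r = 0; r < requestsPerUser; r++) {
                        try {
                            HttpRequest req = HttpRequest.newBuilder(URI.create(target)).GET().build();
                            long start = System.nanoTime();
                            client.send(req, HttpResponse.BodyHandlers.discarding());
                            latenciesMs.add((System.nanoTime() - start) / 1_000_000);
                        } catch (Exception e) {
                            // A failed request is itself a data point once degradation begins.
                            latenciesMs.add(Long.MAX_VALUE);
                        }
                    }
                });
            }
            users.shutdown();
            users.awaitTermination(10, TimeUnit.MINUTES);
            System.out.println("collected " + latenciesMs.size() + " samples; graph them after the run");
        }
    }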

Keep to a methodical approach and change only one variable at a time. It also helps greatly if the tool has a built-in comparison engine that visually shows how that one change affected the scalability of your test. Use the load test to validate every change. When you are tuning, you essentially need to watch two statistics: throughput and response time. This is where the art comes in. You need to tune the environment for maximum capacity without allocating more “work” than the underlying hardware can handle. If you make this mistake, and we all do, you will see response times increase. Remember, there are limits to any infrastructure; nothing is infinite. Tune until your workload is using most of the hardware resources while still delivering acceptable response times.
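
If your tool does not chart the comparison for you, both statistics are easy to derive from the raw samples. The sketch below (plain Java, hypothetical numbers) reduces two runs, one before and one after a single configuration change, to throughput and 95th-percentile response time so the effect of that one variable is visible.

    import java.util.Arrays;

    public class RunComparison {
        public static void main(String[] args) {
            // Response times (ms) from two otherwise identical test runs; invented values.
            long[] baseline = { 120, 135, 140, 150, 160, 180, 210, 250, 400, 900 };
            long[] afterOneChange = { 110, 115, 120, 125, 130, 140, 150, 160, 200, 300 };
            double testDurationSec = 10.0;

            report("baseline         ", baseline, testDurationSec);
            report("after one change ", afterOneChange, testDurationSec);
        }

        static void report(String label, long[] samples, double durationSec) {
            long[] sorted = samples.clone();
            Arrays.sort(sorted);
            double throughput = samples.length / durationSec;              // requests per second
            long p95 = sorted[(int) Math.ceil(0.95 * sorted.length) - 1];  // 95th-percentile latency
            System.out.printf("%s throughput=%.1f req/s  p95=%d ms%n", label, throughput, p95);
        }
    }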

Tuning can save time, save money, and save the planet (greener deployments use less electricity). The costs of more hardware are not just in dollars; they also include administration and maintenance, and each new piece of equipment is another potential point of failure (we’ve all seen machines go down for myriad reasons). Tuning is typically the best first response to any performance degradation. The expertise is in knowing where to look!


3 Responses to “Reach Higher Scalability from your Web Applications: Tuning vs. Adding Additional Hardware”

  1. Sreedhar says:

    Hi – I agree with most of the points mentioned above. But one thing I would like to bring out is that the amount of tuning you can do on a server, in terms of thread sizing, connection pooling, etc., is more or less dependent on the OS you are running and the kind of processor you are using.

    So tuning the application to some level is possible by setting/configuring the parameters mentioned above, but at some stage there will also be a need to increase the hardware capacity of the server.

    This is because if the application needs memory, it needs memory by all means. We can minimize that requirement to some extent by optimizing the code or tuning the application, but if the performance requirements are still not met after doing all this, we obviously need to increase the hardware capacity of the server.

    So if we look at tuning an application for high scalability, we need to rely on increasing the hardware capacity as well, alongside the tips/suggestions mentioned above in this article.

  2. SD says:

    In my own experience, performance is generally a “hardware problem”. Many times have I seen teams tackle performance issues by tweaking and optimizing application code and application container settings. These people (almost always experienced senior IT specialists) used the same technique you describe and cannot in any way be described as people who don’t know what they are doing.

    The fact is that the time taken to optimize and tweak things introduces risk (both the risk of not gaining enough to justify the effort and the risk of breakage) and incurs costs (in human resources, longer time to market, and lost opportunity when people could be assigned elsewhere).

    The lines I usually draw are that application code (i.e. the actual text written by people) is for humans. The application code must be readable, maintainable and functionally correct before being “high performance”. If there is a trade-off between performance and “readability”, I usually tip the balance towards “readability”.

    As for application containers, it is up to the operations teams to come up with standard settings that fit most uses.

    Basically, what I usually tell the people I work with is that they should write “reasonable” code running on application servers with “reasonable” settings. If this is not enough, calculate the amount of money it would take to upgrade the hardware immediately to achieve the desired effect and spend 10% of that amount on optimization effort. After that amount is spent, buy / upgrade the hardware.

  3. WS says:

    @Sreedhar: There are very few situations in which a process requires *a lot* of memory, unless you are processing, say, 10,000+ user requests in parallel. You can instead rely on streaming, filtering, and limiting the data presented to a user, and process data where it fits best (i.e. often inside the database).

    Once you have done all of that, then you may suggest hardware investments (in the context of Web Applications!).

    @SD: A very strange approach, the 10% to optimize and then buy hardware.

    I can understand that human resources cost money, but imagine the performance problem comes from a *scalability issue* such as a locking mechanism.

    Throwing more hardware at it will certainly not solve the problem and may take more than 10% of “the amount of money” required to reach the goal; in fact, I even wonder how you can perform that calculation. It may turn out that refactoring some piece of code costs, say, 70% of that estimated amount and achieves results 50x better than buying the new hardware would.

    One very interesting thing I discovered last year was an innovation from LMAX, the Disruptor: http://code.google.com/p/disruptor/

    I just can’t imagine the hardware required to achieve the very same results without this “tuning”.

    My 2 cents
