Performance Monitoring Fixes: To Go Please!

America runs on quick fixes and convenience. For many of us, the faster something gets done, the better. It began with fast food restaurants and continues today with self-checkout lines at the grocery store, instant social media feedback, and Uber. Instant gratification has become the norm.

This same attitude applies to performance testing. When you spot poor performance in your environment, sometimes all you want to know is exactly how to quickly fix your application, web app or ecommerce site. Waiting around for things to sort themselves out is hardly an option. In the age of short-term deadlines, there’s only so long you can push things to the back burner. You’ll need to whip out your little book of quick tricks.

While poor system performance occurs for any number of reasons (poor code, understaffed teams, inadequate legacy systems), this week’s post should help you quickly diagnose and fix a few common problems, while setting yourself up for a more stable future at the same time.

Problem 1: Clients Behaving Badly

Modern application frameworks make it easy to build not only powerful back-ends but also rich, web-based user interfaces that are pushed out to the client in real time. Often this means transferring a lot of data to the workstation or mobile device. The client’s JavaScript then has to process all of it, whether that is image manipulation, rows of data, or complex logic, which can cause pauses of several seconds before the display updates. Mobile makes this worse: devices vary so widely that the client often has to do a good deal of processing just to figure out how to render itself properly.

Remember that the response time of the server is not the same as the response time of the app from the user’s perspective. You must measure the true user experience by incorporating techniques for mobile emulation and cloud testing into both load testing and simulated user testing across the QA and production environments.
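To make the gap between server time and user time concrete, here is a minimal Python sketch. The `PageTiming` class, its field names, and every number in it are illustrative assumptions, not measurements from a real system. It simply adds up where a user's wait goes.

```python
from dataclasses import dataclass


@dataclass
class PageTiming:
    """Timings in milliseconds for one page load (hypothetical field names)."""
    server_ms: float   # time the server spent producing the response
    network_ms: float  # time spent transferring data to the device
    client_ms: float   # JavaScript processing and rendering on the device

    @property
    def user_perceived_ms(self) -> float:
        # What the user actually waits for is the sum of all three stages.
        return self.server_ms + self.network_ms + self.client_ms


def client_share(t: PageTiming) -> float:
    """Fraction of the wait that happens after the server has finished."""
    return (t.network_ms + t.client_ms) / t.user_perceived_ms


# Made-up numbers: the same server response, seen from two different devices.
desktop = PageTiming(server_ms=200, network_ms=100, client_ms=100)
phone = PageTiming(server_ms=200, network_ms=300, client_ms=1500)

for label, timing in (("desktop", desktop), ("phone", phone)):
    print(label, timing.user_perceived_ms, round(client_share(timing), 2))
```

In this invented example the server looks identical in both cases, yet most of the phone user's wait happens where server-side monitoring never sees it.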

Problem 2: Overly Complex Servers

Poor performance is often the result of problems in the application layer. Modern systems can be highly complex, built from multiple tiers of information systems and business logic. This is especially true if they connect to third-party web services or legacy systems on the back-end.

To serve lots of end users, you’ll often have multiple distributed databases communicating with load-balanced front-end servers and shared application servers. Throw in DNS servers, caching systems, message brokers, and the like, and you’ve got a pretty complicated order.

To control the chaos, you’ll want to coordinate as much as possible between your QA and Ops teams. Ensure that the right internal monitors are in place, and build simple dashboards so you know where to focus. Take a look at simulated users and other tools that can point to troublesome transactions, helping you find problems and figure them out before they affect real users.
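As a rough illustration of what a simulated user does, the Python sketch below times a handful of named transactions and reports the ones that exceed a threshold. The function `run_simulated_user`, the threshold, and the `fake_login` and `fake_checkout` stand-ins are all hypothetical. A real check would drive your actual application instead of sleeping.

```python
import time
from typing import Callable, Dict, List


def run_simulated_user(transactions: Dict[str, Callable[[], None]],
                       threshold_s: float) -> List[str]:
    """Run each transaction once and return the names that were too slow."""
    slow = []
    for name, action in transactions.items():
        start = time.perf_counter()
        action()
        elapsed = time.perf_counter() - start
        if elapsed > threshold_s:
            slow.append(name)
    return slow


# Stand-ins for real user journeys; time.sleep only imitates latency.
def fake_login() -> None:
    time.sleep(0.01)


def fake_checkout() -> None:
    time.sleep(0.2)


troublesome = run_simulated_user(
    {"login": fake_login, "checkout": fake_checkout},
    threshold_s=0.1,
)
print("Slow transactions:", troublesome)
```

A list of slow transaction names like this is the kind of thing a simple dashboard can show, so both QA and Ops know where to look first.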

Problem 3: Disorganized Databases

An application that is created in a perfect environment with a small amount of data can run perfectly. But as we all know (and can validate through countless stories), this means nothing until the application achieves scale.

Application servers can deadlock a database, bringing the app to its knees. Other problems include poor query optimization, where one application function triggers multiple data requests and transfers of data. Or a database may take too long to return results due to poor indexing.

In each of these cases, if you know the problem then the fix can be straightforward. You just have to know. An accurate and solid performance monitoring system with well-designed test cases can help you find and fix problems fast.
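Here is a small sketch of the "one function, many data requests" problem, using Python's built-in `sqlite3` module. The `customers` and `orders` tables, the sample rows, and the index name are invented for illustration. The chatty version issues one query per customer, the joined version gets the same answer in a single query, and the last statement adds an index on the column used for lookups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")


def totals_chatty(conn):
    """One query for the customers, then one more query per customer."""
    queries = 1
    result = {}
    customers = conn.execute(
        "SELECT id, name FROM customers ORDER BY id").fetchall()
    for customer_id, name in customers:
        row = conn.execute(
            "SELECT SUM(total) FROM orders WHERE customer_id = ?",
            (customer_id,)).fetchone()
        queries += 1
        result[name] = row[0]
    return result, queries


def totals_joined(conn):
    """The same totals from a single JOIN with GROUP BY."""
    rows = conn.execute("""
        SELECT c.name, SUM(o.total)
        FROM customers c JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
    """).fetchall()
    return dict(rows), 1


# An index on the lookup column spares the database a full scan of orders.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

print(totals_chatty(conn))
print(totals_joined(conn))
```

With two customers the difference is invisible. With two hundred thousand, the chatty version makes two hundred thousand and one round trips, which is exactly the kind of thing that only shows up at scale.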

Problem 4: Unexpected Traffic

Here comes that dreaded spike in user traffic! It can bring an entire system down, just when you need it to be up most. Call it the curse of success. Traffic spikes are when you should be capturing new users and revenue, but if your site is down then that’s not going to happen. We hate to say ‘we told you so,’ but this is exactly why you need a comprehensive load testing plan in the first place.

Doing so will help you prepare for spikes – including spotting when they are happening, and knowing how your system will behave when they occur. You should also coordinate with your marketing team to know when grand-scale advertising activities are taking place. Additionally, set up a good structure for auto-scaling your environment when it reaches peak capacity. Make sure you’ve done your testing in production, so you know that the real environment can handle what you’ve thrown at it. If you’ve done all this, grab your cape. You’re now ready to capture all that traffic and be a hero!
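Spotting a spike while it is happening can start out very simple. The sketch below is one possible approach, not a recommendation of specific settings: it compares each new requests-per-minute reading against a rolling average of recent readings. The `SpikeDetector` class, the window size, and the multiplier are all assumptions you would tune against your own traffic.

```python
from collections import deque
from statistics import mean


class SpikeDetector:
    """Flags a reading that is far above the recent rolling average."""

    def __init__(self, window: int = 5, factor: float = 3.0):
        self.history = deque(maxlen=window)  # recent normal readings
        self.factor = factor                 # how far above average counts as a spike

    def observe(self, requests_per_min: float) -> bool:
        # Only judge once the window is full, so there is a baseline to compare to.
        is_spike = (len(self.history) == self.history.maxlen
                    and requests_per_min > self.factor * mean(self.history))
        if not is_spike:
            # Keep spikes out of the baseline so they don't hide the next one.
            self.history.append(requests_per_min)
        return is_spike


detector = SpikeDetector(window=3, factor=2.0)
readings = [100, 110, 90, 105, 400, 100]  # made-up traffic numbers
flags = [detector.observe(r) for r in readings]
print(flags)
```

A flag like this could feed an alert or an auto-scaling trigger, though production systems usually want something more robust than a plain moving average.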

Problem 5: Shared Systems

Today, chances are your apps are virtualized at some level. Virtualization has become a mainstay of modern application environments, due to its benefits for performance optimization, security, scaling, backup, configuration, and privacy.

However, some problems that arise in a virtualized environment can be difficult to troubleshoot. For example, if you are running multiple apps on a shared system, then one app can easily affect another. In a cloud-based or hosted environment, the app that’s causing problems may not even be yours. You probably have very little visibility into how problems on a shared server are impacting your app, but good simulated user testing and other monitoring systems can at least give you a heads up. If you can’t explain a performance problem that you are experiencing on a shared server, be sure to investigate this with your Operations team or hosting provider.

Think Quickly But Aim For High Quality

We’ve all been there. You take a shortcut, it ends up costing more time than you planned, and you are left with an inefficient, poor-quality result. Don’t fall into the same trap with your performance monitoring. Plan ahead by using load testing and application performance monitoring products. When a problem strikes, take time to review the situation before jumping right in. A few seconds of consideration can save you hours of work down the line.

Photo by ebru
