#NeotysPAC – Tableausaurus Rex: Analysing JMeter results using Tableau Desktop, by Stephen Townshend

Thanks again to Neotys for putting on an outstanding event. This was my sixth PAC event, and I really enjoyed presenting a mostly hands-on demonstration this time.

Over the previous five PAC events both Stijn Schepers and I have spoken frequently about a tool called Tableau. Given the theme of Jurassic PAC, I wanted to go back to the beginning and introduce what Tableau is and how to use it.

What is Tableau? 

As far as I’m aware, a performance expert named Richard Leeke was the first person to use Tableau for analyzing performance data. I worked with Richard in a previous role, and I have used Tableau nearly every working day since.

Tableau is a company that builds data visualization software. They have several products on the market, but the one I am mostly going to talk about is Tableau Desktop, which is a desktop application you use to connect to data sources to analyze and explore them.

Tableau Desktop was not built for performance analysis. It is used primarily in the business intelligence/data analytics space. It just so happens to be very good for exploring performance data.

Tableau Desktop is a commercial tool (there is a license fee to use it). There is a free trial license you can use to test it out, but it’s worth checking if your organization already has licenses you might be able to leverage (through your data/analytics team).

Why use Tableau for load test analysis?

So we’ve established what Tableau is . . . but why use it for load test analysis? Why not just use the out-of-the-box results analysis built into our load testing tool? For me there are two reasons.

Firstly, you’ve heard me say before how important it is to look at raw data. It’s very difficult to know what the system behavior really is if you only look at aggregated data (e.g., averages, percentiles). Most load testing tools do not provide raw data analysis (e.g., scatter plots). This is reason enough for me to seek out an external tool for this purpose.

Secondly, as a performance engineer I want to interrogate data in order to investigate an issue. I want to be able to zoom in on areas of interest, filter out just the data I’m interested in, and combine different data fields to see the relationship between them. Tableau is very good at this kind of ad-hoc or exploratory analysis. I’ve used many load testing tools over the years, and in general they do not provide the level of analysis I need.

Connecting to your data

So that’s Tableau and why I use it. Without any further delay, I will explain step by step how to connect to and analyze an out-of-the-box JMeter JTL results file.

In case you’re not familiar with JMeter, a JTL file is just a CSV (comma-separated values) file that contains column headings in the first line, followed by rows of data for every sample recorded during your test.
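If you want to poke at a JTL file programmatically before loading it into Tableau, here is a minimal Python sketch using the pandas library (pandas is not part of the original workflow, and “results.jtl” is a hypothetical file name):

import pandas as pd

# "results.jtl" is a hypothetical file name - substitute the path to your
# own JTL file. The first row of the file holds the column headings.
df = pd.read_csv("results.jtl")

# Typical JMeter columns include timeStamp, elapsed, label, responseCode,
# success and dataType.
print(df.columns.tolist())
print(df.head())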

When you open Tableau Desktop, the first thing you want to do is connect to a data source. Tableau can connect to a huge variety of data sources, including file types, databases, and even APIs like Google Analytics. You can find a list of currently supported data sources here.

In our case, we just want to connect to a Text file data source (our JTL file), so go ahead and click that.

Navigate to your JTL file and select it (you will have to change the file filter to All Files unless you rename your file as a .csv). On the next screen hit the Update Now button, and you’ll see a table of your JMeter results data.

Here you can see each field and the different kinds of values recorded during your test.

Scatter plots of raw data

Now that you have connected to the data, it’s time to start our first worksheet. Right-click the Sheet 1 worksheet tab down the bottom, go to Rename, and call this worksheet “Transaction Scatter.”

Before we build the chart let’s take a quick moment to learn some key Tableau terminology. Firstly, you’ll see on the left-hand side two lists under the headings Dimensions and Measures. For now, it’s not important what the difference is between a dimension and a measure. What you do need to know is that all the columns from your JTL results file will be listed here under one of those two headings (plus a few extras). Shortly we are going to be dragging and dropping these onto the workspace to create graphs.

Tableau has the concept of a shelf. Shelves are where we drag and drop fields to create our graphs. The Columns and Rows shelves at the top are the most obvious shelves but there is also a Color shelf where you can change the color of what you are graphing based on the value of a field, or the Size shelf to modify the size of things based on a field, as well as several others. This will make sense shortly once we start building our graphs.

Lastly, Tableau has the concept of a filter. You can filter anything at all to focus your graph. For example, say you had a chart which shows the 90th percentile response time. You could apply a filter to only include samples that did not receive an error, which is a pretty common use case.

The first chart we want to build is a scatter plot of the raw Transaction Controller response times. The way I built my JMeter test here is that I have Transaction Controllers representing each user action I’m simulating, and these have one or more HTTP Requests as children.

I also use the suffix “_tx” as a naming convention to identify a Transaction Controller. What we want to do is plot the raw response time of just the Transaction Controllers, not the HTTP Requests.

To start building our scatter plot, we need to build a timeline of all the Transaction Controller samples collected during the test. The JTL file has a field called timeStamp which we will use. This is a 13-digit epoch timestamp — in other words, the number of milliseconds that have passed since midnight (UTC) on 1/1/1970.
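If you want to see what that conversion means in plain code, here is a rough Python sketch (the timestamp value below is made up purely for illustration):

from datetime import datetime, timezone

# A made-up 13-digit timeStamp value (milliseconds since 1/1/1970 UTC).
epoch_ms = 1561939200000

# Divide by 1000 to get seconds since the epoch, then convert to a UTC
# datetime - the same idea as the Tableau calculated field we build shortly.
print(datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc))
# 2019-07-01 00:00:00+00:00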

If you right-click and drag the timeStamp field onto the Columns shelf and pick the first option (the raw data) you will see a timeline of samples:

This is a start, but it’s not ideal. I can’t see what day the test was run, or what time. What I want to do is convert this timestamp to a Tableau Datetime format.

Firstly, for readability let’s rename timeStamp to “Sample Start Timestamp (UTC)” because JMeter timestamps are recorded in UTC time. You can rename the field by right-clicking it and selecting Rename.

Next, we want to create something called a calculated field. A calculated field adds a new field to the workbook by applying some logic to one or more existing fields. Right-click your freshly renamed field Sample Start Timestamp (UTC) and go to Create > Calculated Field…

Name this new field “Sample Start Datetime (UTC)” and enter the following formula:

DATEADD('second', INT([Sample Start Timestamp (UTC)] / 1000), #1970-01-01#)

What this formula does is convert the 13-digit timestamp to a Tableau Datetime type. Click OK and you’ll see your calculated field is now in our list of dimensions:

Right-click your new field and select Convert to Continuous as this is a continuous timeline of dates, not a discrete list. If you go back to the Data Source tab down the bottom and click Update Now, you will now see your new field in the table, which is already looking more human-readable than the 13-digit timestamp:

Go back to your Transaction Scatter worksheet. We are almost there, but we have one more step to take. I ran this test in New Zealand, which at the time was 12 hours ahead of UTC. So, let’s create another calculated field.

Right-click Sample Start Datetime (UTC) and go to Create > Calculated Field… and this time name your new field “Sample Start Datetime (NZ)” and give it the following formula:

[Sample Start Datetime (UTC)]+(12/24)

The value “1” in a Tableau Datetime field represents one day, so we add half a day to convert the time zone (12 hours / 24 hours). Convert your new calculated field to continuous as you did with the previous one.
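For anyone who prefers to see the same adjustment in code, here is a rough Python sketch (the sample value is made up, and the fixed +12 hour offset is specific to this test):

from datetime import datetime, timedelta, timezone

# Made-up sample start time in UTC (see the earlier conversion sketch).
sample_start_utc = datetime(2019, 7, 1, 0, 0, 0, tzinfo=timezone.utc)

# Apply the same fixed +12 hour offset as the Tableau formula. A fixed
# offset ignores daylight saving changes, exactly as the formula does.
sample_start_nz = sample_start_utc + timedelta(hours=12)
print(sample_start_nz)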

Drag the existing field off the Columns shelf and right-click and drag Sample Start Datetime (NZ) onto it instead (picking the first option for raw data). We now have a human-readable timeline of when the test was run:

We now have the timeline of events, but this currently includes all samples captured during the test – both Transaction Controllers and HTTP Requests. Luckily, JMeter’s JTL result files contain a field called dataType. We can use this to identify whether a sample is a transaction or a request.

Right-click the dataType field under Dimensions and pick Show Filter. This will add a couple of checkboxes on the right-hand side. You can tick or un-tick these to instantly filter the data you are displaying.

If the sample is an HTTP Request, it will have a dataType of “text.” If the dataType is blank (Null), then it is a Transaction Controller. We want to only look at Transaction Controllers so un-tick “text” to exclude HTTP Requests.
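If you ever need to reproduce this split outside Tableau, a rough pandas sketch of the same idea (assuming the hypothetical results.jtl file from earlier) looks like this:

import pandas as pd

df = pd.read_csv("results.jtl")  # hypothetical file name

# Transaction Controllers have a blank dataType (read in as NaN), while
# HTTP Requests have dataType == "text".
transactions = df[df["dataType"].isna()]
requests = df[df["dataType"] == "text"]

print(len(transactions), "transaction samples")
print(len(requests), "request samples")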

Next, we want to plot how long each transaction took during the test. The JTL results have a field called elapsed which is the response time of each sample in milliseconds. Rename elapsed as “Response Time (ms).”

We could use this to create our scatter plot, but in this context I prefer to show the response time in seconds. Let’s use another calculated field.

Right-click on Response Time (ms) and go to Create > Calculated Field… and call this new field “Response Time (s).” Enter the following formula to convert milliseconds to seconds:

[Response Time (ms)]/1000

Now right-click and drag Response Time (s) onto the Rows shelf and select the first option. You suddenly have a basic scatter plot showing the system behavior:

We are getting somewhere, but we can make this better. I prefer changing the shape to a “+” marker, so click on the Shape shelf and select it:

Next, let’s add some color. There is a field called label. This is the sample label, the name you gave each sampler in JMeter. Rename it to “Sample Label.” Drag this field onto the Color shelf and suddenly we see a lot more information:

If you hover your cursor over any point you can see which user action or API was being triggered. You can use any field you like to inform the color. There is another field called success, which indicates whether a sample succeeded or experienced an error. Drag that onto the Color shelf.

The default colors aren’t ideal so right-click this…

…and choose Edit Colors. Let’s make “False” a dark red and “True” a light green. Many people are color blind and red/green colorblindness is the most common. If you do use red and green in your charts, make sure they have different darkness/brightness:

Now we are starting to piece together some patterns. There is a six-minute window near the start of the test where we see a lot of failed transactions and not many successful ones. Let’s find out more about that.

Let’s make a new worksheet. Up at the top in the toolbar, click the duplicate worksheet button. Rename this one “Request Scatter.” Because this time we want to see requests only, switch the dataType filter so that “text” is included and “Null” is excluded:

There is another field in our JTL file called responseCode, which is the HTTP Status Code returned by the server. Rename it to “HTTP Status Code” and drag it onto the Color shelf.

It doesn’t look right yet, so we’re going to modify the color scheme. Right-click this…

…and choose Edit Colors… Manually select blues or greens for the 2xx responses, oranges for the 4xx responses, reds for the 5xx responses, and dark grey for “Null.” 

What is a “Null” response code in JMeter anyway? It means that no response came back from the server. Either a connection could not be established, or a connection was terminated mid-transaction, or maybe JMeter itself experienced an error. My color scheme looked like this when I was done:

To make the chart as clear as possible, we want to change the order in which the points are rendered so that the failed requests are drawn on top (so we can focus on the issues). Right-click on the colors again and select Sort…

Tick “Descending” to reverse the order, but we are still left with “Null” hidden behind everything. Pick “Manual” from the Sort by… drop-down list and manually move “Null” to the top by clicking the up-arrow button. The scatter we have now is very interesting:

During that six-minute window where things didn’t go so well, we can see that there were a lot of HTTP-503 (service unavailable) errors, with a few “Null” (connection errors) scattered amongst them. Because of this chart, we can describe very clearly what the system behavior was at this time.

At this point, with the two charts we have already built alone, you are probably doing better load test analysis than 99% of performance testers. Why? Because you’re viewing and analyzing raw data and applying useful filters to understand system behavior in a way that most load testing tools cannot.

Percentile Response Time

Another very common metric I need to report on is percentile response time. This is often how we report on user experience, and percentiles are very commonly defined in non-functional requirements.

Let’s create a fresh new worksheet. Click the New Worksheet button in the toolbar. Rename this one to “Percentile Response Time” down at the bottom.

Drag Sample Label onto the Rows shelf. Then right-click dataType and select “Show Filter.” Un-tick “text” so we are only looking at Transaction Controllers.

Let’s start by plotting the average response time. Right-click and drag Response Time (s) onto the Columns shelf – but this time pick AVG(Response Time (s)):

Already we have something of value, a chart that shows the average response time for each transaction:

The most common percentile metric I report on is the 90th percentile, so let’s use that. To get the percentile response time, all you need to do is click the little down arrow on AVG(Response Time (s)) in the Columns shelf and go to Measures > Percentile > 90:

That’s it! We now have a bar chart showing the 90th percentile response time for each transaction. We’ve achieved the basic goal, but we can make this better. What if you want to be able to freely pick the percentile to report on? Maybe on one project the NFRs are represented as 95th percentile instead, but you want to re-use this Tableau Workbook.
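As a cross-check outside Tableau, here is a rough pandas sketch of the same 90th percentile per transaction (assuming the hypothetical results.jtl file; note that pandas and Tableau may interpolate percentiles slightly differently):

import pandas as pd

df = pd.read_csv("results.jtl")  # hypothetical file name

# Transaction Controllers only, with response times converted to seconds.
transactions = df[df["dataType"].isna()].copy()
transactions["response_time_s"] = transactions["elapsed"] / 1000

# 90th percentile response time per transaction label, slowest first.
p90 = transactions.groupby("label")["response_time_s"].quantile(0.90)
print(p90.sort_values(ascending=False))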

We can do this quite easily using a parameter. A parameter in Tableau is like a user-defined variable. Right-click under the Measures list and pick “Create Parameter.”

Name your parameter “Percentile Value,” set the display format to percentage, and define a range between 0 and 1 as shown below:

Click OK. Right-click your new parameter in the list and pick Show Parameter Control… to make it appear on the right-hand pane. Click the little drop-down arrow on it and change it to Type In.

It’s not plumbed in yet, but you can type any number between 0 and 1 into this box. For example, enter “0.95” into the field and it will change it to 95%.

To plumb it into our worksheet, all you need to do is double click on the field in the Columns shelf so it turns into an equation you can type into. It will currently have the formula:

PERCENTILE([Response Time (s)],0.90)

Change it to:

PERCENTILE([Response Time (s)],[Percentile Value])

CTRL-click and drag our percentile field from the Columns shelf onto the Label shelf. This will put a number at the end of each bar, which is helpful. CTRL-clicking and dragging duplicates a field without taking it away from its original location. Our chart now looks close to done:
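As an aside, if you like the idea of a percentile you can freely change, the same parameterization can be sketched in Python as a function argument (a rough pandas sketch, entirely outside Tableau):

import pandas as pd

def percentile_report(df: pd.DataFrame, percentile: float = 0.90) -> pd.Series:
    """Percentile response time (seconds) per Transaction Controller label.

    The percentile argument plays the same role as the Tableau
    "Percentile Value" parameter - pass 0.95 for the 95th percentile.
    """
    transactions = df[df["dataType"].isna()]
    seconds = transactions["elapsed"] / 1000
    return seconds.groupby(transactions["label"]).quantile(percentile)

# Usage with a hypothetical results file:
# print(percentile_report(pd.read_csv("results.jtl"), 0.95))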

Two last things: down at the very bottom, if you hover your cursor, there is a little sort button. Click it until the rows are sorted in descending order (slowest at the top).

Lastly, I often don’t want to clutter up a report with transactions that are quick. I want to focus on the ones that take time. We can add a filter to exclude any transactions that have a percentile response time less than 1 second:

  1. Click the little down triangle on our percentile field (in the Columns shelf)
  2. Select Filter… 
  3. Go to the At Least tab and enter the value “1”
  4. Click OK 

Your chart will now only show transactions with a percentile response time of at least one second:

Throughput

Another important aspect of load test analysis is throughput. For one thing, you want to make sure you applied the right load during your test. The first chart we are going to build will tell us the transaction throughput.

You might have noticed that in the list of Measures is a field called Number of Records. This wasn’t in our JMeter result file; it’s something Tableau created for us. We can use this to count how many records were in the results:

  1. Create a new blank worksheet by clicking the icon in the toolbar 
  2. Rename the worksheet to “Transaction Throughput”
  3. Drag Sample Label onto the Rows shelf
  4. Right-click dataType under Dimensions and pick Show Filter 
  5. Un-tick “text” to only display Transaction Controllers (as we’ve done previously)

We currently have a list of transaction names, but we want to narrow this down to the key transaction of each scenario. In my example I am testing an insurance system. I have a Motor Quote scenario, and in order to measure how much load I applied during my test, I want to know how many times the final “quote” operation occurred.

This will depend on your test suite and your own discretion. To do this:

  1. Click on Sample Label in the Rows shelf and select Filter… 
  2. Click the None button to de-select everything, and then go through and tick the key step of each scenario

My worksheet now has a small list of hand-picked transactions (one per script/thread group):

To look at the throughput, simply right-click and drag Number of Records onto the Columns shelf and pick SUM(Number of Records) from the list.

Now CTRL-click and drag SUM(Number of Records) from the Columns shelf onto the Label shelf to finish it off:

It might not seem like much, but this is an extremely useful chart. This is what I compare with the workload model to make sure I’ve applied the right load.
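If you want to cross-check those counts outside Tableau, here is a rough pandas sketch (the results.jtl file name and the get_motor_quote_tx label are both made up for illustration):

import pandas as pd

df = pd.read_csv("results.jtl")  # hypothetical file name

# Count completed Transaction Controller samples per label.
transactions = df[df["dataType"].isna()]
counts = transactions["label"].value_counts()

# "get_motor_quote_tx" is a made-up label - compare the count of each
# scenario's key step against your workload model.
print(counts.get("get_motor_quote_tx", 0))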

Another kind of throughput is the request throughput per second, which is another valuable thing to see: 

  1. Create a new blank worksheet and rename it “Requests per Second”
  2. Right-click and drag Sample Start Datetime (NZ) onto the Columns shelf and pick SECOND(Sample Start Datetime (NZ)) 
  3. Right-click and drag Number of Records onto the Rows shelf and select SUM(Number of Records) 
  4. What we see at the moment is a combination of HTTP Requests and Transaction Controllers, so once again right-click dataType and select Show Filter and then un-tick “Null” so we only see requests:

This is a very interesting graph. What it tells us is that during that six-minute window when things were failing a lot, the load applied was almost 300 requests per second! That’s much higher than the load we intended.
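If you want to verify a number like that outside Tableau, a rough pandas sketch of requests per second (assuming the hypothetical results.jtl file) looks like this:

import pandas as pd

df = pd.read_csv("results.jtl")  # hypothetical file name

# HTTP Requests only (Transaction Controllers have a blank dataType).
requests = df[df["dataType"] == "text"].copy()

# Bucket each sample by the second it started in and count the samples.
# timeStamp is in milliseconds, so unit="ms"; the times come out in UTC.
requests["second"] = pd.to_datetime(requests["timeStamp"], unit="ms").dt.floor("s")
rps = requests.groupby("second").size()
print(rps.max(), "requests in the busiest second")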

In my particular test, this is partly due to an issue, and partly due to a poorly designed test script. The way you control load (pacing) in JMeter is by using the Constant Throughput Timer. In my script the users…

  1. Go to a homepage
  2. Log in
  3. Loop through x number of business steps

I have my Constant Throughput Timer inside the loop. This means if the homepage is the thing throwing an error, the test will start a new thread and try again – without any delays.

Errors and failures

We’ve already gathered a little bit of information about the failures during our test. We know there was a period of HTTP-503 errors, but what if we wanted to know which specific steps during our test failed?

  1. Create a new blank worksheet and name it “Transaction Failure Rate”
  2. Drag Sample Label onto the Rows shelf
  3. Right-click dataType and click Show Filter… and then un-tick “text” to only show Transactions
  4. Right-click and drag Number of Records onto the Columns shelf and pick SUM(Number of Records) 
  5. Drag success onto the Color shelf

What we have at the moment is a bar chart showing how many transactions passed and failed, but it’s hard to read because of how many times the homepage was called:

We can fix this by right-clicking SUM(Number of Records) in the Columns shelf and going to Quick Table Calculation > Percent of Total, and then right-clicking it again and going to Compute using… > Table (across). We now see the % success/failure rate of each transaction. You can improve it by CTRL-clicking and dragging SUM(Number of Records) onto the Label shelf:

One interesting observation here is that the homepage failed over 99% of the time.
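If you want to double-check a figure like that outside Tableau, here is a rough pandas sketch of the same pass/fail split per transaction (assuming the hypothetical results.jtl file):

import pandas as pd

df = pd.read_csv("results.jtl")  # hypothetical file name

# Pass/fail split per Transaction Controller label, as a percentage of
# that label's total samples.
transactions = df[df["dataType"].isna()]
rates = (
    transactions.groupby("label")["success"]
    .value_counts(normalize=True)
    .mul(100)
    .round(1)
)
print(rates)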

The last chart we are going to build is going to show us a timeline of what HTTP status codes occurred during our test:

  1. Create a new blank worksheet and rename it “Status Codes per Minute”
  2. Right-click and drag Sample Start Datetime (NZ) onto the Columns shelf and pick the continuous version of MINUTE(Sample Start Datetime (NZ))
  3. Right-click dataType and un-tick “Null” so we only see HTTP requests
  4. Right-click and drag Number of Records onto the Rows shelf and pick SUM(Number of Records) 
  5. Click the little drop-down and select the Bar chart type
  6. Drag HTTP Status Code onto the Color shelf
We now have a very informative chart:

So let’s talk about what happened during this test. The test starts out OK, but then every request begins receiving an HTTP-404 Not Found error for nearly a minute. After that, every request receives an HTTP-503 Service Unavailable error for the next five minutes.
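For completeness, the same per-minute status code breakdown can be approximated outside Tableau with a rough pandas sketch (again assuming the hypothetical results.jtl file):

import pandas as pd

df = pd.read_csv("results.jtl")  # hypothetical file name

# HTTP Requests only, bucketed per minute, counted per status code.
requests = df[df["dataType"] == "text"].copy()
requests["minute"] = pd.to_datetime(requests["timeStamp"], unit="ms").dt.floor("min")
per_minute = requests.groupby(["minute", "responseCode"]).size().unstack(fill_value=0)
print(per_minute)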

What actually happened is that there was a deployment (of both infrastructure and the application) during the middle of my test. Initially there was no infrastructure, which is where the 404s come from. Once the web server was deployed and running, it began to throw 503s because the application behind the scenes was not yet up.

Summary

In this blog we have used a purpose-built data analysis tool to gain a really clear understanding of the system behavior during our test. And although this is already beyond what most load testing tools offer us, we are only scratching the surface of what is possible here.

I know I promised Henrik I would demonstrate the infamous tornado scatter chart… but this is already a very long blog… instead I will post a separate blog specifically explaining tornado scatters.

As always, if you’re not looking at raw data, you do not truly understand the system behavior. If your tools don’t provide you the means to analyze raw data, find a tool that will. As Stijn said in his presentation, load testing tools are purpose-built for applying load. There’s nothing wrong with picking up a tool purpose-built for data analysis to supplement your load testing tool.

If you want to know more about Stephen’s presentation, the recording is already available here.
