I was delighted to be invited to speak at my first Virtual PAC conference – after speaking at the live event in 2019 and also joining a few of the online meetups this year, this was my chance to complete the set!
My talk this year was about one of those areas of performance testing that has always bugged me – the test scenario design. In this article I first explain my reasons for this and then describe the framework we built to automate parts of this process.
The test scenario design is an area that, in the classic model of performance testing, has always been slightly awkward and inconvenient. Now that we are in a new era where we look to automate performance tests as part of Continuous Performance Engineering, I wanted to look at what’s happening with the test scenario and to ask if we can do it better.
You can watch the recording (https://www.youtube.com/watch?v=27vHF9dnZBg) of my talk to hear my thoughts about how these are great times for performance engineers – so many of the limitations we used to run up against have been overcome, or at least reduced, over the past decade or so. Test environment provision may still be an issue, but VMs, cloud platforms, containers and service virtualization have made lives vastly better for the majority. APM tools let us analyze the code internals. And test tool vendors have kept adding useful features, not least Neotys of course (think about frameworks in NeoLoad to help maintain your scripts automatically – how many hours of my life could I have saved if I’d had this before!?).
Gradually, we’ve been able to spend less time on the boring stuff, and more time where it matters most. But the test scenario design continues more or less unchanged, like the old-fashioned uncle at your family party.
Why do we need to care about this? Because fundamentally, our test results are only as good as the validity of the test, and there are two key components to that: how realistic the test environment is, and how realistic the load we put on it is. A load test is only a simulation: the more accurate the simulation, the more value we bring.
The environment is, to a greater or lesser extent, often out of our control, and so we develop techniques to account for that. It’s the scenario design that determines the load that our test generates, so it’s natural to be obsessed with making it as close as possible to “real life” load.
(Terminology check: When I talk about “test scenario,” I mean the definition of the numbers of “virtual users” or threads that will make up a load test, with the associated test scripts, load generators, settings and so on.)
If you, dear reader, have been using performance test tools for any significant length of time, you must have come across one (probably all) of these irritations:
- You spend a long time building a test scenario with lots of different “groups/populations” of users, each of which has different properties (different script, different location, network settings, data, etc.). But then each time you want to modify the test, you have to spend hours updating the many different settings!
- During the test, something happens and you want to change the level of load, but you can’t change it after pressing Go. (Maybe you can add more users, but the test becomes unbalanced because of your clumsy manual tweaking, or the users run out of precious unique test data, or are assigned to the wrong load generators, etc.) The test tool is too inflexible to allow “on the fly” changes!
- During or after the test, you realize that you made one tiny error when setting up the scenario, hidden among the many screens of settings that you clicked on. It has made the test design incorrect, and hours of test run time are now wasted.
- You have to spend a large amount of additional effort when creating the test script to add logic and safeguards, or to change settings programmatically, to get around the limitations of the test scenario at runtime.
We should not forget one other factor: the terrible usability of the “scenario design” part of most load test tools. Some performance testers probably take comfort in being among the chosen few who know exactly where to right-click to find the hidden menu options. But I think this is naive – we need to work with developers and other teammates, and enabling them to set up tests will be a big part of that.
In fairness, I should take a second to add that some tools are now adding features that can improve things. With NeoLoad you can control some aspects of the test by making API calls to NeoLoad Web. JMeter, for its part, can run a BeanShell server (https://jmeter.apache.org/usermanual/best-practices.html#beanshell_server), which you can connect to during the test to change the scenario settings, for example to extend the test duration or add threads. I haven’t used this feature myself, though.
We set ourselves a goal to improve the flexibility of the test scenario design, with the following use cases:
- A customer asks us: “We had an issue in production last night, can you quickly run a test with the same volumes to see if we can replicate the issue?” We quickly (automatically if possible) extract those volumes and use them to set up a new test design.
- We want to store performance test metadata, test results and metric data all in one Performance Metrics Database. (By which we mean a central “performance” database that holds all the data that we care about – from transaction response times to monitoring metrics, to performance test metadata and in this case, the scenario design parameters.)
- We want to run a fully automated performance test, including the ability to dynamically change the volumes, think time and other settings – integrated into the CI pipeline – and with a dashboard capability in real time and post-test.
- We want to offer a “self-service” option so that the project team can design and run their own tests without any knowledge of the test tool.
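To make the first use case concrete, here is a minimal sketch of the arithmetic for turning production volumes into a test design. It is not our framework's code: the transaction names, the hourly volumes and the fixed 60-second pacing are all made up for illustration, and it assumes each script iteration performs exactly one business transaction.

```python
import math


def vusers_needed(transactions_per_hour: float, pacing_seconds: float) -> int:
    """Virtual users needed to hit an hourly volume at a fixed pacing.

    Assumes each virtual user starts one iteration every `pacing_seconds`
    and each iteration performs exactly one transaction.
    """
    if transactions_per_hour < 0 or pacing_seconds <= 0:
        raise ValueError("volume must be >= 0 and pacing must be > 0")
    iterations_per_user_per_hour = 3600.0 / pacing_seconds
    # Round up so the test never under-delivers the target volume.
    return math.ceil(transactions_per_hour / iterations_per_user_per_hour)


# Hypothetical hourly volumes, as they might be extracted from production.
production_volumes = {"login": 1800, "search": 5400, "checkout": 450}
PACING_SECONDS = 60  # assumed pacing: one iteration per user per minute

scenario = {
    name: vusers_needed(volume, PACING_SECONDS)
    for name, volume in production_volumes.items()
}
print(scenario)  # {'login': 30, 'search': 90, 'checkout': 8}
```

Rounding up means a group can slightly overshoot its target (8 users at this pacing deliver 480 checkouts per hour rather than 450), which is usually the safer direction for a load test.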
We went through various prototypes and ideas. We had production volumes being read into MongoDB collections. At one stage we used our in-house tool, Performance Studio. But eventually we settled on a set of tools that were readily available to us and the teams we work with:
- Jenkins – available at pretty much every organisation.
- JMeter – as a quick, easy driver of load.
- Splunk – in use at many of our clients.
The CI tool and the load test tool can be replaced by virtually any equivalent tool (and we intend to do exactly that at clients who use commercial tools). Of these three tools, only Splunk is a core part of the design.
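So our framework works as follows: the scenario settings are stored in a Splunk app, Jenkins retrieves them and updates the JMeter test scenario before execution, and a Splunk dashboard shows the test progress in real time.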
The exciting parts are driven by Splunk apps, developed by the fantastic Altersis Tunisia team – many thanks especially to Monam and Morsi for all the work on this.
To see a demo, check out the recording of my talk (the demo is about halfway through the talk, if you want to skip to that). Here are a few screenshots to whet your appetite.
Splunk App for Scenario Design
This app stores the scenario settings like the number of vusers and pacing. Later on, this info is retrieved by Jenkins and used to update the JMeter test scenario prior to execution.
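One common way to push settings like these into JMeter without editing the test plan is to pass them as properties on the command line. The sketch below shows that idea only; it is not necessarily how our Jenkins job does it, and the setting names (`vusers`, `pacing_ms`, `duration_s`) and file names are hypothetical.

```python
import shlex


def jmeter_command(test_plan: str, results_file: str, settings: dict) -> list:
    """Build a non-GUI JMeter command line.

    Each scenario setting is passed as a JMeter property using -Jname=value.
    """
    command = ["jmeter", "-n", "-t", test_plan, "-l", results_file]
    # Sort the settings so the generated command is stable between runs.
    for name, value in sorted(settings.items()):
        command.append(f"-J{name}={value}")
    return command


# Hypothetical settings, as they might be retrieved from the scenario store.
settings = {"vusers": 50, "pacing_ms": 30000, "duration_s": 1800}
command = jmeter_command("checkout.jmx", "results.jtl", settings)
print(shlex.join(command))
```

For this to have any effect, the test plan has to read the properties, for example by setting the thread count of a Thread Group to `${__P(vusers,1)}`, where the second argument is the default used when the property is not supplied.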
Jenkins Plugin to Query Prod Volumes
Jenkins will run the given Splunk query and then update the scenario design in the Splunk app based on the outcome.
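As an illustration of the "query, then update" step, the sketch below parses query results and applies them to a scenario. It is not the plugin's code. It assumes the results arrive as newline-delimited JSON with one `result` object per row, which is my understanding of what Splunk's search export endpoint returns with `output_mode=json` (check this against your Splunk version). The field names `transaction` and `count` and the scenario layout are hypothetical, and fetching the data from Splunk is left out.

```python
import json


def parse_volumes(export_text, name_field="transaction", count_field="count"):
    """Extract {transaction name: hourly count} from newline-delimited JSON rows."""
    volumes = {}
    for line in export_text.splitlines():
        line = line.strip()
        if not line:
            continue
        row = json.loads(line).get("result")
        # Skip rows that carry no result or lack the expected fields.
        if not row or name_field not in row or count_field not in row:
            continue
        # Field values are treated as strings here, so convert explicitly.
        volumes[row[name_field]] = int(row[count_field])
    return volumes


def update_scenario(scenario, volumes):
    """Return a copy of the scenario with new targets for known transactions only."""
    updated = {name: dict(group) for name, group in scenario.items()}
    for name, per_hour in volumes.items():
        if name in updated:
            updated[name]["target_per_hour"] = per_hour
    return updated


# Hypothetical query output and stored scenario, for illustration only.
sample = "\n".join([
    '{"preview": false, "offset": 0, "result": {"transaction": "login", "count": "1800"}}',
    '{"preview": false, "offset": 1, "result": {"transaction": "search", "count": "5400"}}',
    '{"preview": false, "offset": 2, "result": {"transaction": "unknown", "count": "12"}}',
])
scenario = {
    "login": {"target_per_hour": 1000, "pacing_s": 60},
    "search": {"target_per_hour": 2000, "pacing_s": 60},
}

volumes = parse_volumes(sample)
new_scenario = update_scenario(scenario, volumes)
print(new_scenario)
```

Ignoring transactions that are not already in the scenario is a deliberate choice in this sketch: a production query can easily return traffic for which no test script exists.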
Splunk Execution Dashboard
Simply a nice Splunk dashboard to see the test progress in real time.
I hope this has got some of you thinking about how you could improve your own test process, and I would love to hear about any developments in this area. What I hope comes next:
- For Altersis – we want to extend this prototype to use a range of test tools and cover more complex scenario situations.
- For anyone reading this blog – if you have ideas or questions, feel free to contact me!
- For other Neotys PAC members – hope to see you all again soon. Thanks to everyone for the great talks. I learned a lot from the ones I saw and need to re-watch a few!
- For Neotys – Henrik, Stephane and co. Thanks a million for hosting these awesome events. I massively appreciate all the effort that goes into the Virtual PAC and I know all the other speakers and viewers do too. Although honestly, 24 hours is not enough. Let’s keep going and launch this into space with our own permanent, 24/7 rolling performance TV channel!
See you all there – keep the Continuous Performance Engineering rocket flying!
If you want to know more about Alan’s presentation, the recording is already available here.