[By Joerek van Gaalen, Computest]
In November 2017, Neotys organized the PAC, the Performance Advisory Council. A great selection of senior performance engineers came together for this convention. The goal of the meeting was to exchange ideas, have discussions and learn from each other through one another’s presentations. The topics for this event were narrowed down to DevOps and shift left, which is mostly about testing earlier in the development process. So I thought it would be nice to share some ideas about what a performance engineer can actually do in, and learn from, production: the ‘right’ part of the development process.
The general role of the performance engineer in a waterfall method was to run tests in the testing or acceptance phase, right before going to production. The performance engineer sat in the middle of the phases from development to production. The tasks mostly covered setting up the tests: gathering acceptance criteria, scripting the user processes, testing and reporting back the results. In most cases this came down to basic testing.
The role of the performance tester is transitioning to that of an all-round performance specialist. The tester takes ownership of the ‘performance’ aspect of the application he or she is part of, and of its infrastructure. With that ownership and responsibility, there are a lot more things that the performance specialist can and should do. Try thinking of the performance specialist as going from “performance testing” to “performance assurance”.
The production environment
The production environment is very valuable to any performance engineer, yet in many cases performance engineers neglect it.
Production can give you valuable feedback for your tests, development, operations and the business. Examples are usage analytics, capacity management, and the monitoring of network, system, synthetic and real user data. Troubleshooting is another area where the performance specialist could be more involved. Don’t get me wrong: I’m not saying every performance engineer neglects production, but in most cases there is a certain distance.
If you want to know how the production environment is actually used by your users, be sure to use analytics to your advantage. It can help you improve your tests, for instance by making them more realistic, and it can help you evaluate the requirements and criteria that were determined for the application. In my opinion, the requirements should reflect real life, or realistic expectations that come close to it.
Usage analytics can be gathered from Google Analytics or webserver logs. Be thorough when analyzing what the users do. Example questions are: What is the distribution and hit ratio of certain pages or transactions? What are the session times and the average waiting times per page? Try to avoid using averages over a month or even a day. Look at peak hours. Are there different ‘events’? On the first day of the month, for example, users may focus on completely different transactions than on other days. When everything is averaged over a longer period of time, you misrepresent what really happens, which results in unrealistic distributions and usage patterns.
These statistics can be very helpful. If they differ significantly from what your load model assumes, analyze them and alter your load model accordingly.
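To illustrate why a peak hour tells you more than a long-term average, here is a minimal sketch in Python. It assumes you have already extracted (timestamp, path) pairs from your webserver log or analytics export; the function names and the input format are hypothetical and not tied to any particular tool.

```python
from collections import Counter, defaultdict
from datetime import datetime


def page_mix(paths):
    """Return each page's share of the given requests as a fraction."""
    counts = Counter(paths)
    total = sum(counts.values())
    return {path: n / total for path, n in counts.items()}


def peak_hour_profile(requests):
    """Summarize the busiest clock hour.

    requests: iterable of (iso_timestamp, path) tuples (assumed format).
    Returns (peak_hour, requests_in_peak_hour,
             average_requests_per_active_hour, page_mix_in_peak_hour).
    Hours without any traffic are not counted in the average.
    """
    by_hour = defaultdict(list)
    for ts, path in requests:
        # Truncate the timestamp to the start of its clock hour.
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        by_hour[hour].append(path)
    if not by_hour:
        raise ValueError("no requests given")
    peak_hour = max(by_hour, key=lambda h: len(by_hour[h]))
    average = sum(len(paths) for paths in by_hour.values()) / len(by_hour)
    return peak_hour, len(by_hour[peak_hour]), average, page_mix(by_hour[peak_hour])
```

Comparing the page mix of the peak hour with the mix over the whole day or month shows quickly whether the load model is built on the right distribution.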
Other log files, such as those for long-running database queries and application errors, can help you understand what is happening in production. Do you see the same results in your tests, or are there errors in production that you have never seen in your tests? If so, figure out where the difference comes from. It could be a difference in test data, or functionality that is not tested.
System monitoring in production is typically something for the system administrators. But as a performance engineer, it could be worthwhile to monitor these statistics too. Compare these numbers with your test results. How do they differ and why? It can help you validate your tests. If, for example, CPU usage is far off at a certain number of concurrent users, then your test might not be realistic. Or, if your test environment is scaled differently from production, the system monitoring results can help you better understand the numbers and make it easier to translate the results from test to production.
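Such a comparison can be as simple as the following sketch. It assumes you have the average CPU usage per number of concurrent users for both environments; the function name, the input format and the 20% tolerance are illustrative assumptions, not a recommended threshold.

```python
def compare_cpu(prod, test, tolerance=0.20):
    """Flag load levels where test CPU usage deviates from production.

    prod, test: dicts mapping concurrent users -> average CPU usage in percent.
    tolerance: allowed relative difference (0.20 means 20%), an assumed default.
    Returns a list of (users, prod_cpu, test_cpu, relative_difference) for
    every load level present in both dicts that exceeds the tolerance.
    """
    flagged = []
    for users in sorted(prod.keys() & test.keys()):
        p, t = prod[users], test[users]
        if p == 0:
            continue  # a relative difference is undefined for a zero baseline
        relative = (t - p) / p
        if abs(relative) > tolerance:
            flagged.append((users, p, t, relative))
    return flagged
```

A flagged load level is not a verdict by itself. It is a reason to look at whether the load model, the test data or the sizing of the test environment explains the gap.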
Another role of the performance engineer could be capacity management. Because the performance engineer knows the limits of the application (I hope…), he or she can analyze the trends in production, see how things are evolving and predict when issues could arise and act accordingly.
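As a rough illustration of such a prediction, the sketch below fits a straight line through weekly peak values and estimates how many weeks remain until a known limit is reached. The linear trend is a simplifying assumption (real growth is often seasonal or stepwise), and the function names and input format are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def weeks_until_limit(weekly_peaks, limit):
    """Estimate the number of weeks from the last sample until `limit` is hit.

    weekly_peaks: peak value per week (for example concurrent users), oldest first.
    limit: the maximum the application can handle, as found in load tests.
    Returns None when the trend is flat or decreasing, and 0.0 when the
    fitted line has already crossed the limit.
    """
    weeks = list(range(len(weekly_peaks)))
    slope, intercept = linear_fit(weeks, weekly_peaks)
    if slope <= 0:
        return None
    crossing_week = (limit - intercept) / slope
    return max(0.0, crossing_week - weeks[-1])
```

The limit should come from an actual test result, not from a guess, which is exactly where the performance engineer’s knowledge comes in.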
With the knowledge of a performance engineer, you can also help the system administrators explain certain patterns, determine what the root cause of an issue could be and find ways to improve performance. Eventually you can advise on an improved environment setup.
Synthetic monitoring (simulating a user on the application at a fixed interval) is a worthwhile form of proactive monitoring for the performance specialist. It produces results under constant conditions, such as the same browser, source network and client machine, and is therefore a perfect basis for comparing performance over time, at any time. The performance specialist should be involved in this type of monitoring. First of all, his or her expertise can be used to create and maintain the synthetic scripts. Second, he or she is the one with the expertise to analyze the results and give feedback to dev or ops.
A great way to get the most out of synthetic monitoring is to also monitor your acceptance environment. Sometimes changes to the infrastructure or setup do not pass through the continuous integration process and are therefore not validated. With constant synthetic monitoring on an acceptance environment, it is easy to spot differences at the precise moment of the impact. Changes to the application could also break the scripts. When you know this early, the scripts can be fixed before going live and be ready for production.
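At its core, a synthetic script is a fixed series of user steps that is timed on every run. The sketch below shows that idea in Python under stated assumptions: the steps are plain callables that you supply yourself (for example HTTP requests or browser actions), and `run_probe` is a hypothetical name, not part of any monitoring product. Scheduling the probe at an interval and storing the results are left out.

```python
import time


def run_probe(steps, clock=time.perf_counter):
    """Run the user steps in order and time each one.

    steps: list of (name, callable) pairs; each callable performs one step.
    clock: function returning the current time in seconds (injectable for tests).
    Returns a dict mapping step name -> result. The probe stops at the first
    failing step, because later steps usually depend on earlier ones.
    """
    results = {}
    for name, action in steps:
        start = clock()
        try:
            action()
        except Exception as exc:
            results[name] = {"ok": False, "seconds": clock() - start, "error": repr(exc)}
            break
        results[name] = {"ok": True, "seconds": clock() - start}
    return results
```

Running the same steps against acceptance and production, and keeping the timings per step, makes it easy to see both a broken script and a sudden shift in response times.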
Real User Monitoring
One of the best ways to keep track of the performance of your production environment is Real User Monitoring. The results combine business transactions and usage data with detailed response time distributions and system monitoring. Usually these tools are implemented in a production environment, but this tooling could enhance your performance tests as well, mainly for code and database profiling.
The performance specialist should be fully committed to this tooling, to analyzing the results and to gathering feedback to ensure further improvements.
It’s important to test the impact of such solutions on response times. In my experience, the increase is often more than the 5% that is advertised. Sometimes it can go up to 25%, depending on the code and implementation.
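One way to quantify that impact is to run the same test twice, once without and once with the monitoring enabled, and compare a response time percentile. The sketch below assumes two lists of response times from such runs; the function names and the choice of the 90th percentile are illustrative assumptions.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("no values given")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def overhead_percent(baseline, instrumented, pct=90):
    """Relative increase in the given percentile, in percent.

    baseline: response times measured without the monitoring tooling.
    instrumented: response times for the same test with the tooling enabled.
    """
    base = percentile(baseline, pct)
    inst = percentile(instrumented, pct)
    return (inst - base) / base * 100
```

Both runs should use the same load, test data and environment, otherwise the difference says little about the tooling itself.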
The production environment is so dynamic that issues are likely to pop up that you have never seen before. I think it’s good practice for the performance specialist to be involved in troubleshooting errors in production. As a specialist, you can probably trace the errors or characteristics back to the root cause of the problems or trends. Perhaps the problems are related to a certain load or level of concurrent usage, in which case you can probably reproduce the errors in your test environment. If so, the solution is within reach and verifiable.
Help the ops team, or even be part of it.
As a performance engineer, there is so much you can do in production. There is much more than testing. Make the best out of your production environment and benefit from it. Your role will change from tester to performance assurance specialist.
Joerek van Gaalen has been a specialist in performance testing since 2005 and has worked with many very large-scale (web) applications. Between 2011 and 2013 he worked as a technical lead on performance test projects. Since 2014 he has been CTO Performance at Computest. In that role, he seeks out new techniques and methods related to performance testing in the ever-changing environment of IT.