#NeotysPAC – SLO Validation as a Self Service with Keptn Quality Gates, by Andreas Grabner

This year marked my third time presenting at Neotys PAC (Performance Advisory Council). And like the two years before, I was in awe hearing how my peers go about solving performance engineering problems. It reminded me that, while I have been working in performance testing for 20+ years now, there is still so much I don’t know and so many exciting approaches to solving performance-related problems.

Every time I leave a conference like Neotys PAC, I ask myself three questions, and the answers to these three questions resulted in the talk I gave this year:

  • Question: What is the top use case in performance engineering that people keep solving?
  • Answer: Provide Automated Performance Feedback to Engineers!

  • Question: Why is the same use case solved multiple times by different people?
  • Answer: Huge variety & combination of tools & processes lead to custom implementations!

  • Question: How can we make it as easy as possible to increase adoption?
  • Answer: Adhere to a standard and provide implementation as a Self-Service!

In my talk, I presented Keptn, an open-source project that provides Performance as a Self-Service through its Quality Gate capability. Keptn is agnostic to the tools you may have in your organization, and because it is open source, we hope that practitioners will extend Keptn with new capabilities instead of continuing to build custom implementations. Keptn Quality Gates was inspired by many practitioners, but I want to highlight three who had the most significant impact:

Now – let’s dive into the details with SLO Validation!

Over the past years, I was fortunate to work with a dedicated team at Dynatrace to bring the open-source project Keptn to life. Keptn set out to solve problems we have seen organizations run into as they start automating their delivery pipelines and operations. If you want to learn more about Keptn and how its event-based control plane automates continuous delivery and operations, check out the website, the GitHub project, the YouTube channel, or walk through the Keptn Quickstart on GKE.

At Neotys PAC, we focused on the automation of Quality Gates and how Keptn can be used to provide Performance as a Self-Service feedback to engineers or even business stakeholders. All you need to do in preparation is:

  1. Install Keptn and set up integrations with your tools of choice, e.g., APM, Testing, Deployment
  2. Create a new Keptn Project and onboard a service
  3. Provide Test Scripts, e.g., JMeter, NeoLoad
  4. Provide a list of metrics (SLIs) & how these metrics should be evaluated (SLOs)
  5. Provide Deployment Description, e.g., Helm Charts, CI/CD Pipeline …

Once Keptn is installed, and all necessary configuration files are available, anybody can say: “Hey Keptn: Here is a new artifact! Deploy it and tell me how it holds up against my SLIs & SLOs while under load testing! Ahoy!”

Keptn enables everyone to get automated performance feedback of a new artifact or configuration change based on well-defined SLIs & SLOs

Once Keptn is done with its work, it will push the results back via, e.g., Slack, MS Teams, Webex Teams, or by calling a custom webhook. This is possible through the Keptn Notification Service.

All SLO evaluation results are also accessible through Keptn’s Bridge. The following shows a couple of the visualizations that the Bridge provides, e.g., a heatmap, the individual outcome of a run, or trends of metrics over time:

Keptn’s Bridge gives full access to any quality gate result and provides trending visualization as a chart or heatmap.

Let me sum up the benefits of Keptn’s approach, and why it solves the three questions I raised in the beginning:

  1. Keptn’s event-driven architecture allows you to integrate with any tool that can deploy, run tests, or provide data for SLI & SLO validation. Learn more here on how to build your own Keptn Service.
  2. Keptn provides a CLI and an API that allows you to integrate Keptn into your existing tools and processes, e.g., trigger Keptn from your existing CI/CD pipeline. Explore the documentation for more details on the API & CLI.
  3. Keptn is Open Source; it is built on standards such as CloudEvents, which makes it easily extensible and future-proof. Star our GitHub repo and join our community!

To fully leverage the potential of Keptn and to integrate Keptn into the right step of your existing software delivery process, let me share some more details on:

  • How Keptn Quality Gates work regarding SLIs & SLOs
  • How to integrate Keptn Quality Gates into your existing delivery pipelines such as Jenkins, GitLab Pipelines, Bitbucket, Harness or others
  • How to get started with Keptn

A core building block and capability in Keptn is the Quality Gate!

SLIs from different data providers

Keptn allows you to specify Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for each indicator. When you trigger Keptn to start an evaluation, it will reach out to the configured SLI Provider (this can be a monitoring tool such as Dynatrace or Prometheus, or a load testing tool such as NeoLoad) to query each SLI based on the SLI definition for a given timeframe or context.

Compare SLIs against SLOs

Once all SLI values are retrieved, Keptn evaluates them against your defined objectives in the SLO. In the SLOs, you have the option to set fixed or relative (compare to baseline) criteria for pass and warning. If you don’t specify criteria, Keptn will return the value of that SLI without including it in the overall score calculation!
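To make the fixed versus relative distinction concrete, here is a minimal Python sketch of how a single SLI value could be checked against pass and warning criteria. The criteria strings (`"<600"`, `"<=+10%"`), the function names `meets` and `evaluate_sli`, and the `"info"` label for SLIs without criteria are assumptions made for illustration; this is not Keptn’s actual implementation.

```python
import operator
import re

# Supported comparison operators; "<=" and ">=" are listed before "<" and ">"
# in the pattern so the two-character forms are matched first.
_OPS = {"<=": operator.le, ">=": operator.ge, "<": operator.lt,
        ">": operator.gt, "=": operator.eq}
_PATTERN = re.compile(r"^(<=|>=|<|>|=)([+-]?)(\d+(?:\.\d+)?)(%?)$")


def meets(criterion, value, baseline=None):
    """Return True if `value` satisfies a criterion such as "<600" or "<=+10%".

    A leading sign marks the criterion as relative to `baseline`
    (hypothetical syntax, chosen for this sketch).
    """
    match = _PATTERN.match(criterion.replace(" ", ""))
    if match is None:
        raise ValueError(f"unsupported criterion: {criterion!r}")
    op, sign, number, percent = match.groups()
    target = float(number)
    if sign:
        if baseline is None:
            raise ValueError("relative criterion needs a baseline value")
        delta = baseline * target / 100 if percent else target
        target = baseline + delta if sign == "+" else baseline - delta
    return _OPS[op](value, target)


def evaluate_sli(value, pass_criteria, warning_criteria, baseline=None):
    """Classify one SLI value as "pass", "warning", "fail", or "info"."""
    if not pass_criteria and not warning_criteria:
        return "info"  # no criteria: value is reported but not scored
    if pass_criteria and all(meets(c, value, baseline) for c in pass_criteria):
        return "pass"
    if warning_criteria and all(meets(c, value, baseline) for c in warning_criteria):
        return "warning"
    return "fail"
```

For example, `evaluate_sli(540, ["<600", "<=+10%"], ["<=800"], baseline=500)` passes both the fixed limit and the relative limit of 10% above the baseline.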

Total Score Objective

Once the total score (between 0 and 100%) is calculated, Keptn Quality Gate compares it against your Total Score Objective and tells you whether the overall status of the evaluated SLO is pass, warning, or fail. The following animation highlights the process of what happens when Keptn is triggered to run a Quality Gate Evaluation:

Keptn automates the querying of SLIs and evaluating them against SLOs across multiple data sources.
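The comparison against the Total Score Objective can be sketched in a few lines of Python. The threshold values (90% for pass, 75% for warning) and the inclusive `>=` comparison are assumptions picked for this example; you would set your own objective in the SLO.

```python
def overall_status(score, pass_threshold=90.0, warning_threshold=75.0):
    """Map a total score (0 to 100) to "pass", "warning", or "fail".

    The default thresholds are illustrative assumptions, not Keptn defaults.
    """
    if not 0.0 <= score <= 100.0:
        raise ValueError("score must be between 0 and 100")
    if score >= pass_threshold:
        return "pass"
    if score >= warning_threshold:
        return "warning"
    return "fail"
```

With these assumed thresholds, a score of 87.5 would yield a warning, while 95 would pass.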

Total Score Concept

A key concept with Keptn Quality Gates is that you always end up with an overall score between 0 and 100%. Every SLI listed in your SLO that has pass/warning criteria contributes to that score with a default scoring weight of 1. That weight can be customized in case some SLIs, e.g., Failure Rate or Number of SQL Calls, are more important to you than others.

Changing weights of SLIs

The following example shows that Number of SQL Calls has a weight of 2, and therefore it also scores more points if it falls within your pass criteria. If an SLI falls into the warning range, it gets half of the possible weighted points, and if it falls outside the warning range, it contributes 0 points:

Keptn calculates a total score based on weighted SLO definitions for your SLIs. The overall rating always falls between 0 and 100%, making it easy to see trends.
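The weighted scoring described above can be written down as a short Python sketch. The function name `total_score` and the example SLIs are illustrative assumptions; the rules themselves (full points for pass, half for warning, zero otherwise, scaled by weight) follow the description in this section.

```python
def total_score(results):
    """Compute the total score in percent from (status, weight) pairs.

    Each status is "pass", "warning", or "fail". A pass earns the full
    weight, a warning half of it, and a fail nothing.
    """
    points = {"pass": 1.0, "warning": 0.5, "fail": 0.0}
    max_points = sum(weight for _, weight in results)
    if max_points == 0:
        return 0.0  # nothing to score
    achieved = sum(points[status] * weight for status, weight in results)
    return 100.0 * achieved / max_points


# Hypothetical run: three SLIs, with Number of SQL Calls weighted twice.
example = [
    ("pass", 1),     # response time
    ("warning", 1),  # failure rate
    ("pass", 2),     # number of SQL calls
]
print(total_score(example))  # 3.5 of 4 possible points -> 87.5
```

Because the result is normalized by the sum of all weights, adding or reweighting SLIs never pushes the score outside the 0 to 100% range.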

In my presentation at Neotys PAC, I walked the audience through a more extended example showing Quality Gate evaluation across four different builds, explaining how the individual SLIs impact the overall total score based on the SLOs. The table visualization, as shown below, is also very similar to the heatmap visualization we have implemented in Keptn’s Bridge and is something we have seen from Stijn Schepers in his work:

The Keptn Quality Gate across four builds shows the power of the cross-test-run evaluation and scoring algorithm: regressions introduced in a new build are easy to detect.

Many users we have spoken to love the full end-to-end capability of Keptn, where Keptn can deploy, run tests, evaluate, promote the artifact to the next stage (e.g., from staging into production), and even take care of automated remediation in production.

However, many organizations have invested in their delivery pipelines over the past years and have already automated deployment and test execution. What most lack is the automatic evaluation based on SLIs & SLOs. This is where Keptn Quality Gates come in: Keptn provides a CLI and an API that make it easy to integrate Quality Gates into existing pipelines implemented with tools such as Jenkins, GitLab Pipelines, Azure DevOps, Harness, XebiaLabs or others. The following is an illustration from one early Keptn adopter, Christian Heckelmann, at eResearchTechnology. He integrated Keptn Quality Gates into his GitLab Pipelines by merely calling the Keptn API to trigger the evaluation after his pipeline had already deployed the artifact and executed the JMeter tests. What he wants Keptn to do is query the SLIs from the APM tool (Dynatrace in his case), validate the values against the SLOs, and provide a total score that allows him to either fail or pass the pipeline:

Integrating Keptn Quality Gates in existing pipelines such as GitLab through the Keptn API.

 

Christian has done a fantastic job in writing a GitLab integration that he has also published on his public GitLab project: https://gitlab.com/checkelmann/dynatrace-pipeline.

If you want to integrate Keptn into your existing pipeline, take a look at the Keptn CLI & API. In my conference slides, I also included instructions that show how to trigger an evaluation and how to wait for the final scoring, as shown here:

Integrating Keptn into your existing pipelines is easy. You can decide between the CLI or the API approach, as shown above.
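The trigger-then-wait pattern can be sketched in Python as follows. This is only an outline under stated assumptions: `trigger` and `fetch_result` are hypothetical callables that you would implement on top of whichever Keptn CLI command or API endpoint your installation provides, and the timeout and polling interval are arbitrary example values.

```python
import time


def wait_for_evaluation(trigger, fetch_result, timeout_s=300.0, interval_s=10.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Trigger an evaluation, then poll until a result arrives or time runs out.

    `trigger()` must return a context ID. `fetch_result(context_id)` must
    return a result dict, or None while the evaluation is still running.
    Both are hypothetical wrappers, not part of any Keptn library.
    """
    context_id = trigger()
    deadline = clock() + timeout_s
    while clock() < deadline:
        result = fetch_result(context_id)
        if result is not None:
            return result
        sleep(interval_s)  # wait before asking again
    raise TimeoutError(f"no evaluation result for context {context_id!r}")
```

A pipeline step could then fail the build whenever the returned result is not a pass, which mirrors how the GitLab integration described above uses the total score.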

The best way to get started is by following the tutorials we have online. While Keptn provides an installation option to only install the Quality Gate capability, I highly recommend that you install the full Keptn feature set as it allows you to see how Keptn can also deploy, run tests, promote across delivery stages and even do auto-remediation in production.

Keptn needs to be installed on Kubernetes, where we support different flavors such as GKE, EKS, AKS, OpenShift, PKS, or Minikube. If you want to give Keptn a try and don’t have a k8s cluster, you can follow my Quickstart Tutorial on GKE, where I also explain how to retrieve an extra $200 credit to cover the costs of the GKE cluster.

If you have any questions, feedback, or ideas, please contact us and let us know so that we can learn from you and drive Keptn in the right direction. The best ways for you to join the Keptn Community are:

In the end, let me say THANK YOU Neotys for hosting such a great event. Thanks for bringing people together that share their thoughts and experiences openly to improve our profession as performance engineers. Thanks for giving me the platform to spread the word about Keptn. Let’s see where we sail next.

Learn More about the Performance Advisory Council

Want to see the full conversation? Check out the presentation here.
