#NeotysPAC – Digital Performance Lifecycle: Cognitive Learning (AIOps) by Jonathon Wright

 [By Jonathon Wright]

Let us go back around four years. I was attending a DevOps Customer Advisory Board (CAB), and the discussion kept drifting back to how important the ability to “Shift Left” was for catching issues earlier within the lifecycle and failing fast. I remember agreeing at the time, but I kept suggesting that “Shift Right” was equally important.

Shift Right – Cognitive Thinking (DesignOps)

Since then the majority of my talks have focused on exploring this paradigm in further detail. This is what I started to call DesignOps back then (i.e. Business to Operations), which focuses on business ownership from Design, through Development & Test, into Operations and then back again, whilst supporting an omni-directional flow of information. Most important is the feedback loop between Think (Design) and Learn (Operations), which is the core principle of the “Shift Right” methodology.

Back in 2016, I published the Digital Manifesto. This explored my thinking around the principles behind each of the segments, along with the supporting design patterns & blueprints to help organisations implement Enterprise AIOps and to build and test Digital Capabilities such as Artificial Intelligence (AI).

Additionally, I had the opportunity to contribute to the Digital Quality Handbook on Digital Experiences (DX) and the impact that Cognitive Adoption would have on the number of Digital Interactions we would have with Digital Technologies in the future.

The hypothesis is that the introduction of Cognitive Adaptive Technology would considerably reduce the number of Digital Interactions, reducing the Digital Grind (time wasted) on repetitive tasks. One of the big drivers would be the increasing maturity and accuracy of Artificial Intelligence within business, which I had been referring to as Enterprise AI and which is the topic of a conference I will be chairing next month in London.

The supporting Digital Blueprints and Patterns for each of these Cognitive capabilities can be used to implement, for example, Machine Learning using standard stacks (i.e. SparkML or Neo4j), so it is well worth checking out the upcoming Digital Blueprints from Eran Kinsbruner, which I am currently contributing to, and his latest blog on this subject.

Enterprise AI – Cognitive Learning (AIOps)

Over the last year I worked in Silicon Valley with some household names, helping them define and implement their A.I. strategies to become Insight-Driven. I also headed up an R&D team building first wave A.I. platforms into Cognitive Adaptive Testing platforms, which both Gartner and Forrester recognised as game changing technology within the industry. I then moved into the public & private sector to help governments and companies build second wave Enterprise AI platforms that will revolutionise the industry.

Last year I also had the opportunity to speak at the TED series conference for the first time on “Cognitive Learning” with the focus of the talk around the importance of the Cognitive Adoption of AI (Thinking) and ML (Learning).

The underpinning Digital Blueprints for Cognitive Adoption of Enterprise AI capabilities such as Computer Vision, Neural Networks and Machine Learning require new design patterns for both the build and test capabilities:

“Everything is continuously evolving; the tools and techniques that worked yesterday may no longer be the correct approach for tomorrow …” (from my section in “Experiences in Test Automation”)

So if this is true, what new approaches do we need to consider when proving out platforms built on these Digital Capabilities?

Recently, I have been exploring Augmented Intelligence to understand the causes (triggers) and effects (actions) of complex toolchains (i.e. Microsoft Cognitive Services or OpenCV vs. Amazon Mechanical Turk), each requiring me to rethink how the axioms of performance engineering apply to cognitive technologies.

  • ML Performance Variations
    • Regression (baselining training dataset and then benchmarking the cross validation)
    • Classification (total correct predictions divided by the total predictions made)
    • Clustering (Percentage Split (i.e. 66% for training, 34% for performance testing))
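
The classification and percentage split measures above can be sketched in a few lines of Python. This is a minimal illustration rather than the tooling used in the talk; the function names `percentage_split` and `accuracy` and the toy data are assumptions made for the example.

```python
import random


def percentage_split(rows, train_fraction=0.66, seed=42):
    """Shuffle rows and split them, e.g. 66% for training and 34% for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(predicted, actual):
    """Total correct predictions divided by the total predictions made."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and the same length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


if __name__ == "__main__":
    train, test = percentage_split(range(100))
    print(len(train), len(test))                  # 66 34
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))   # 0.75
```

Fixing the seed means the same split can be reused as a baseline when benchmarking one model version against another.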

The best way I have discovered so far is what I have been calling Dark Canary (a combination of Canary Testing with Dark Launching): utilizing Kubernetes with Spinnaker, then measuring the performance axioms of the above performance variations for each canary or graph based ML training dataset.
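
One way to picture the Dark Canary idea is a router that mirrors each request to the canary while only ever returning the baseline response. The sketch below is illustrative only: it does not use Kubernetes or Spinnaker, and `dark_canary`, `baseline` and `canary` are hypothetical names standing in for two deployed versions of a model.

```python
import time


def dark_canary(requests, baseline, canary):
    """Serve every request from baseline; shadow it to canary and record the comparison."""
    served, report = [], []
    for request in requests:
        start = time.perf_counter()
        live = baseline(request)
        baseline_s = time.perf_counter() - start

        start = time.perf_counter()
        try:
            shadow = canary(request)  # recorded for analysis, never returned to the user
            failed = False
        except Exception:
            shadow, failed = None, True  # a canary failure must not affect live traffic
        canary_s = time.perf_counter() - start

        served.append(live)
        report.append({
            "request": request,
            "match": not failed and shadow == live,
            "canary_failed": failed,
            "baseline_s": baseline_s,
            "canary_s": canary_s,
        })
    return served, report


if __name__ == "__main__":
    # Hypothetical model versions: the canary disagrees on negative inputs.
    served, report = dark_canary([1, -2, 3], lambda x: x > 0, lambda x: abs(x) > 0)
    agreement = sum(r["match"] for r in report) / len(report)
    print(served, f"agreement={agreement:.2f}")
```

The report gives an agreement rate and per-request timings for each canary, which can then be compared against whatever threshold you decide is acceptable before promoting it.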

To measure behaviour you will at least need metrics from monitoring agents from each Kubernetes node (using something like OneAgent by Dynatrace, Prometheus or the Kubelet API).
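
If Prometheus is the metrics source, per-node values can be pulled through its HTTP API with an instant query. The sketch below is one possible shape, assuming a reachable Prometheus server; the helper names and the server address are hypothetical, and only the `/api/v1/query` endpoint and its JSON response layout come from Prometheus itself.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def parse_instant_vector(payload, label="instance"):
    """Map a label value (e.g. the node) to the sampled value of an instant vector."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected an instant vector, got {data['resultType']}")
    # Each sample is {"metric": {...labels...}, "value": [timestamp, "value-as-string"]}.
    return {s["metric"].get(label, ""): float(s["value"][1]) for s in data["result"]}


def fetch_node_metric(base_url, promql, timeout=5):
    """Run the query against a live Prometheus server (the address is an assumption)."""
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        return parse_instant_vector(json.load(resp))
```

For example, `fetch_node_metric("http://localhost:9090", "up")` would return a mapping of instance to its `up` value, assuming Prometheus is listening on that address.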

Then you will need a Digital Performance Monitoring (DPM) platform that can manage the Enterprise grade Application Performance Management (APM) solution.

This opens up another opportunity to leverage the vast amount of unstructured and structured data captured by the Dynatrace OneAgent technology to provide big data analytics powered by machine learning to help predict systemic failures before they happen.

This can be extended out to an Enterprise Server Management platform to support pervasive security by understanding the normal behaviour of the ecosystem of ecosystems, i.e. platforms, individual processes, network utilization or even data stores (NoSQL, Data Lakes).
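
As a toy illustration of "understanding normal behaviour", the sketch below flags samples that sit far from a recorded baseline using a simple z-score. This is a deliberately basic stand-in, not how any particular APM product detects anomalies; `find_anomalies` and the threshold of three standard deviations are assumptions for the example.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observed, threshold=3.0):
    """Return observed samples more than `threshold` standard deviations from the baseline mean."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: anything different is unusual.
        return [x for x in observed if x != mu]
    return [x for x in observed if abs(x - mu) / sigma > threshold]


if __name__ == "__main__":
    normal_cpu = [10, 11, 9, 10, 12, 10, 9, 11]   # hypothetical baseline readings
    print(find_anomalies(normal_cpu, [10, 11, 40]))
```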

This can be boosted by the use of Business Test Transactions (BTT) using synthetic tests to again understand the causes (triggers) and effect (actions) of A.I. platforms.

Intelligent Automation – Cognitive Adaptive Testing (DevSecOps)

Last year I found myself at the Swiss Testing Days (STD) conference chatting with the head of test for Google Assistant. It seemed like a perfect opportunity to swap modern approaches on how to test content sensitive validation of nouns and verbs consumed by a Virtual Personal Assistant (VPA), and how you would go about modelling that out using something like Model-Based Testing approaches.

The million dollar question is how to test the correlation between supervised learning (inputs, outputs or A to B mappings) and the associated computational (compute) power and data throughput measured in input/output operations per second (IOPS). This helps us better understand the machine learning performance of a neural network against three training set models using, say, VectorFlow.

Using the example (above) we can directly understand the regression performance by baselining the training dataset then increasing the training set size along with benchmarking any cross validation.

The ROC curve (receiver operating characteristic curve) is a graph based approach for measuring the performance of a classification process. It plots the True Positive Rate (TPR) against the False Positive Rate (FPR) at every classification threshold, and the AUC (Area Under the ROC Curve) summarises that curve as a single number, which helps us establish the training performance against each of the training data sets:

  • AUC is scale-invariant. It measures how well predictions are ranked, rather than their absolute values.
  • AUC is classification-threshold-invariant. It measures the quality of the model’s predictions irrespective of what classification threshold is chosen.
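
Both properties follow from the fact that AUC equals the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The sketch below computes it that way in plain Python; `roc_auc` is a name chosen for this example, and in practice a library implementation would normally be used instead.

```python
def roc_auc(labels, scores):
    """AUC as the chance a random positive outranks a random negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(labels, scores))                      # 0.75
    print(roc_auc(labels, [s * 100 for s in scores]))   # 0.75, rescaling changes nothing
```

Because only the ordering of the scores matters, rescaling them leaves the result unchanged, and no classification threshold appears anywhere in the calculation.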

There has been a recent shift from narrow A.I. (supervised learning) towards more general A.I. (unsupervised learning), which can support the transfer of learning from one activating event or system interaction to another. This knowledge training, when combined with reinforcement learning, allows advanced learning of the behaviours of complex systems.

So let us take an example of a simple MVC application that already has functional (UI) tests that execute in headless mode within a Docker container (for this example I will be using Atata) and that can easily be executed as part of your continuous testing pipeline (in this case I am using VSTS).

So the first step is to migrate the functional (UI) tests to performance engineered tests (R/R), using something like PerfDriver.io to capture the individual request / response pairs of all the messages with the browser proxy and then export them into a generic format such as HAR, JMeter or WebTest.
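
Once the traffic is in HAR format, the request / response pairs are easy to read back, because a HAR file is JSON with the pairs listed under `log.entries`. The sketch below is a minimal reader; `har_pairs` is a hypothetical helper, and the inline sample stands in for a real export.

```python
import json


def har_pairs(har_text):
    """Return (method, url, status, elapsed_ms) for each entry in a HAR document."""
    entries = json.loads(har_text)["log"]["entries"]
    return [
        (e["request"]["method"], e["request"]["url"], e["response"]["status"], e.get("time", 0))
        for e in entries
    ]


if __name__ == "__main__":
    # Hypothetical two-entry capture; a real export carries many more fields.
    sample = json.dumps({"log": {"entries": [
        {"request": {"method": "GET", "url": "http://app.local/"},
         "response": {"status": 200}, "time": 12.5},
        {"request": {"method": "POST", "url": "http://app.local/login"},
         "response": {"status": 302}, "time": 48.0},
    ]}})
    for pair in har_pairs(sample):
        print(pair)
```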

Now that these tests are ready to be executed using a performance controller (i.e. NeoLoad) or agent (i.e. Taurus), I have quickly spun up a Windows Server 2019 instance that can execute the load tests against this simple application.

For this example I am using a simple proxy to intercept all the messages sent from NeoLoad. I can then automatically forward them on to the target application, or I could easily amend the payload in both directions (request and response pairs) to allow me to recreate, say, a man in the middle or brute force attack.

This triggers the continuous security (CS) platform to detect the attack and prevent any cross site scripting or even SQL injection from bringing down the system under test. The same can be done using tools like Metasploit, Nmap or the Social Engineering Toolkit, for example creating a rootkit on a Raspberry Pi Zero to drive massive load via an IoT cannon generating a DDoS attack on a single DNS endpoint.

In this case the result is that I have forced the system to crash, producing a full stack trace of the root cause within the application server based on the manipulated request / response pairs. This allows me to be more creative with how I dynamically create requests and handle responses from the system.

So let us take the simple example above: I am able to read the response from the MVC application, which could change every time. This could be the result of testing machine learning, rather than the predictable system behaviours that could be modelled using Model-Based Testing approaches and validated using assertions.

Model-Based Testing depends on dynamic test data generation. To support these flows you will need to either synthetically generate the data required by the system or subset and crawl for valid data within the system, utilizing Business Process Automation platforms like Visual Integration Processor (VIP) by Curiosity Software (curiositysoftware.ie).

However, the test dynamically builds the format of the next request in the series to be sent to the system, based on a set of encapsulated interactions. In the example above we are creating a multi-dimensional array of the dynamic content rendered on the screen, then using these values to drive the flow (model-based testing), such as the underlying workflow based on a number of trigger events.
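
The idea of letting rendered values choose the next request can be sketched as a small state model. Everything in this example is hypothetical: the states, guards and request strings are invented for illustration, and a dictionary stands in for the multi-dimensional array of screen content.

```python
# Hypothetical model: state -> ordered list of (guard on screen values, request, next state).
MODEL = {
    "login": [
        (lambda s: s.get("error") is None, "GET /basket", "basket"),
        (lambda s: True, "POST /login", "login"),
    ],
    "basket": [
        (lambda s: s.get("items", 0) > 0, "POST /checkout", "done"),
        (lambda s: True, "POST /basket/add", "basket"),
    ],
}


def next_request(state, screen):
    """Pick the first transition whose guard matches the values rendered on screen."""
    for guard, request, next_state in MODEL[state]:
        if guard(screen):
            return request, next_state
    raise LookupError(f"no transition from {state!r} for screen {screen!r}")


def walk(screens, state="login"):
    """Drive the flow: each rendered screen decides the next request in the series."""
    sent = []
    for screen in screens:
        if state == "done":
            break
        request, state = next_request(state, screen)
        sent.append(request)
    return sent, state


if __name__ == "__main__":
    print(walk([{"error": None}, {"items": 0}, {"items": 2}]))
```

The guards play the role of trigger events: the same model produces a different series of requests whenever the system renders different content.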

Augmented Intelligence – Cognitive Adoption (RobotOps)

The rise of the robots has been a long time in the making. Since 2015 I have been talking about the impact the Enterprise of Things would have on the industry, ranging from drone interception and detection, and predictive maintenance using robots within the transportation and energy markets, to Urban 4.0 smart city projects and predictive crime (Minority Report) that I had previously been working on.

My previous experience of testing complex systems ranges from mission critical nuclear power stations to demand-response algorithms for millions of IoT endpoints for Virtual Power Stations (VPS). Lifecycle Virtualisation (Service (SV), Network (NV) or Network-Function (NFV)) is more important than ever, so tools like WireMock / MockLab or Wireshark (PCAP) are essential to our ability to test earlier within the lifecycle (as per my Gartner presentation back in 2015).

So the Robotic Process Automation (RPA) tool wars have now begun. Both sides have been building toolchains over the last few decades, and now they are coming together to provide Intelligent Automation (IA) solutions to enable continuous delivery, testing, deployment, security operations and monitoring.

This brings into question why these tools exist, which is an important question. It is easy to trace back the origin of some of the hacker tools and then the associated tools developed by operations to combat them. Equally, tools built for the purpose of automation within the development lifecycle, like the very first test automation tool I used, called XRunner, were driven by the need to test UIs. Then we can compare tools developed for operations (like UiPath) to deploy applications with developer driven RPA toolchains that enable Business Process Automation (BPA).


My recommendation is to start arming yourself with knowledge and understanding to unlock the Cognitive Knowledge GAP, create Value-Driven Delivery and help you become Insight-Driven through the Cognitive Adoption of Digital capabilities and technologies.

