Using Keptn Analyses in Argo Workflows
In this blog post we will explore how Keptn can be used within Argo Workflows to execute load tests using a `KeptnTask`, and to analyse their results afterwards using the Keptn Analysis feature. Argo Workflows is a workflow orchestration engine built for Kubernetes, which makes it a perfect fit to be used together with Keptn. We will achieve this by using the support for Executor Plugins provided by Argo Workflows. This plugin mechanism is an easy way of integrating custom functionality into an Argo Workflow and of providing further nodes in the workflow with valuable information, such as the outcome of a Keptn Analysis.
Technologies used for this example
For this, we are using the following technologies:
- Keptn: its `Analysis` resource defines goals for our metrics and evaluates them.
- Argo Workflows: the workflow engine running our example.
- Prometheus: provides monitoring data for the application.
Architecture
The overall architecture of this example is depicted in the diagram below:
Our example workflow consists of two nodes, which do the following:

- The Load Tests node creates a `KeptnTask` that executes load tests against a sample application. Once the tests have finished, the results and the time frame of the `KeptnTask` execution are reported back to Argo Workflows.
- The Analyse Results node takes the time frame of the executed load tests as an input parameter and creates a Keptn `Analysis` for that time frame. Using a `KeptnMetricsProvider`, which retrieves the metrics for the application from Prometheus, the metric values relevant for the analysis are fetched and evaluated. The result of the `Analysis` is then reported back to Argo Workflows, which can pass it on to other nodes in the workflow.
Both of the workflow nodes are executed using a simple Argo Workflow executor plugin hosted in this repo. Note, however, that the code in this repo is just a proof of concept and not intended for use in production.
Setting up the environment
In this example, we assume that both Keptn and Argo Workflows are already installed on our cluster. If this is not the case and you would like to follow along, please follow the respective installation guides for Keptn and Argo Workflows.
The next step is to install the Keptn extension for Argo Workflows (an unofficial PoC implementation). This is done by applying the `ConfigMap` that enables the plugin within Argo Workflows, as well as the RBAC configuration required for the plugin to be able to interact with Keptn resources:

```shell
kubectl apply -f https://raw.githubusercontent.com/bacherfl/argo-keptn-plugin/main/config/keptn-executor-plugin-configmap.yaml
kubectl apply -f https://raw.githubusercontent.com/bacherfl/argo-keptn-plugin/main/config/rbac.yaml
```
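For background: Argo Workflows discovers executor plugins through a `ConfigMap` that is labelled as an `ExecutorPlugin` and describes the sidecar container the plugin runs in. The real manifest lives in the plugin repo linked above; the following is only a rough sketch of that general shape, with the name, image, and port being illustrative placeholders:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  # hypothetical name; the actual manifest is in the plugin repo
  name: keptn-executor-plugin
  namespace: argo
  labels:
    # this label tells Argo Workflows to treat the ConfigMap as an executor plugin
    workflows.argoproj.io/configmap-type: ExecutorPlugin
data:
  sidecar.automountServiceAccountToken: "true"
  # the plugin runs as a sidecar next to each workflow pod that uses it
  sidecar.container: |
    name: keptn-plugin
    image: example.com/argo-keptn-plugin:latest  # illustrative image reference
    ports:
      - containerPort: 4355
```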
Defining the load test KeptnTaskDefinition
The load tests against our sample application are executed using a `KeptnTask`. The related `KeptnTaskDefinition` looks as follows:
```yaml
apiVersion: lifecycle.keptn.sh/v1
kind: KeptnTaskDefinition
metadata:
  name: loadtests
  namespace: simple-go
spec:
  # spec.container takes a single container spec, not a list
  container:
    name: loadtests
    # "curl" alone is not a resolvable image; curlimages/curl is the
    # official curl image on Docker Hub
    image: curlimages/curl
    command: ["sh", "-c"]
    args:
      - |
        for i in $(seq 1 600); do
          curl -s http://simple-go-service.simple-go:8080
          sleep 0.1
        done
```
This task simply creates some load by sending curl requests to the sample application for roughly one minute (600 requests with a 0.1 second pause in between).
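Each run of the load tests is an instance of a `KeptnTask` referencing this definition. In our setup the plugin creates these instances for us, but for illustration, a manually created `KeptnTask` might look roughly like the following sketch (the name and context values are made up, not taken from the plugin):

```yaml
apiVersion: lifecycle.keptn.sh/v1
kind: KeptnTask
metadata:
  # hypothetical name; the plugin generates names for the tasks it creates
  name: loadtests-run-1
  namespace: simple-go
spec:
  # reference to the KeptnTaskDefinition defined above
  taskDefinition: loadtests
  context:
    # illustrative context values describing the application under test
    appName: simple-go
    appVersion: "1.0.0"
    objectType: ""
    taskType: ""
    workloadName: simple-go-service
    workloadVersion: "1.0.0"
```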
Defining the AnalysisValueTemplates
Now we are going to define the queries for the metrics we would like to analyse:
- The response time of the demo service
- The error rate of the demo service
Below are the `AnalysisValueTemplate` resources, as well as the `KeptnMetricsProvider` resource that points to the Prometheus API inside our cluster. The `{{.workload}}` placeholder in the queries is resolved at runtime from the arguments passed to an `Analysis`:
```yaml
apiVersion: metrics.keptn.sh/v1
kind: KeptnMetricsProvider
metadata:
  name: my-provider
  namespace: simple-go
spec:
  targetServer: https://prometheus-k8s.monitoring.svc.cluster.local:9090
  type: prometheus
---
apiVersion: metrics.keptn.sh/v1
kind: AnalysisValueTemplate
metadata:
  name: response-time-p95
  namespace: simple-go
spec:
  provider:
    name: my-provider
  query: "histogram_quantile(0.95, sum by(le) (rate(http_server_request_latency_seconds_bucket{job='{{.workload}}'}[1m])))"
---
apiVersion: metrics.keptn.sh/v1
kind: AnalysisValueTemplate
metadata:
  name: error-rate
  namespace: simple-go
spec:
  provider:
    name: my-provider
  query: "rate(http_requests_total{status_code='500', job='{{.workload}}'}[1m]) or on() vector(0)"
```
Next, we define the `AnalysisDefinition` resource that contains the goals for the metrics mentioned above:
```yaml
apiVersion: metrics.keptn.sh/v1
kind: AnalysisDefinition
metadata:
  name: my-analysis-definition
  namespace: simple-go
spec:
  objectives:
    - analysisValueTemplateRef:
        name: response-time-p95
      keyObjective: false
      target:
        failure:
          greaterThan:
            fixedValue: 500m
        warning:
          greaterThan:
            fixedValue: 300m
      weight: 1
    - analysisValueTemplateRef:
        name: error-rate
      keyObjective: true
      target:
        failure:
          greaterThan:
            fixedValue: 0
      weight: 1
  totalScore:
    passPercentage: 60
    warningPercentage: 50
```
Note that the `AnalysisDefinition` used in this example is kept rather simple: the p95 response time objective fails above 500ms (the quantity `500m`, i.e. 0.5 seconds) and warns above 300ms, while the error rate is a key objective that fails as soon as its value exceeds zero. If you would like to learn more about the possibilities of the analysis feature of Keptn, feel free to read more about it in this blog post.
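For context, the `Analysis` resources created by the plugin during the workflow reference this `AnalysisDefinition` and supply the time frame of the load tests as well as the `workload` argument used in the query templates. A manually created instance would look roughly like this sketch (name and timestamps are illustrative):

```yaml
apiVersion: metrics.keptn.sh/v1
kind: Analysis
metadata:
  # hypothetical name; the plugin generates names for the analyses it creates
  name: analysis-loadtests-run-1
  namespace: simple-go
spec:
  analysisDefinition:
    name: my-analysis-definition
  # time frame of the load test run; illustrative timestamps
  timeframe:
    from: "2025-01-01T10:00:00Z"
    to: "2025-01-01T10:01:00Z"
  # resolves the {{.workload}} placeholder in the AnalysisValueTemplates
  args:
    workload: simple-go-service
```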
Putting it all together
Now that we have defined our Keptn resources for executing load tests and analysing the performance of our application, it is time to put everything together by defining the Argo Workflow, which looks like the following:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: test-123
  namespace: argo
  labels:
    example: 'true'
spec:
  arguments:
    parameters:
      - name: workload
        value: simple-go-service
  entrypoint: main
  templates:
    - name: main
      dag:
        tasks:
          - name: execute-load-tests
            template: keptn-loadtests
          - name: analyze
            template: keptn-analysis
            dependencies:
              - execute-load-tests
            arguments:
              parameters:
                - name: query
                  value: analysis/simple-go/my-analysis-definition/1m/workload=simple-go-service
                - name: start
                  value: "{{tasks.execute-load-tests.outputs.parameters.start}}"
                - name: end
                  value: "{{tasks.execute-load-tests.outputs.parameters.end}}"
          - name: print-result
            template: whalesay
            dependencies:
              - analyze
              - execute-load-tests
            arguments:
              parameters:
                - name: start
                  value: "{{tasks.execute-load-tests.outputs.parameters.start}}"
                - name: end
                  value: "{{tasks.execute-load-tests.outputs.parameters.end}}"
                - name: result
                  value: "{{tasks.analyze.outputs.parameters.result}}"
          - name: print-json
            template: print-json
            dependencies:
              - analyze
            arguments:
              parameters:
                - name: details
                  value: "{{tasks.analyze.outputs.parameters.details}}"
    - name: keptn-loadtests
      inputs:
        parameters:
          - name: query
            value: keptntask/simple-go/post-deployment-loadtests
      outputs:
        parameters:
          - name: start
            valueFrom:
              jsonPath: '{.output.parameters.start}'
          - name: end
            valueFrom:
              jsonPath: '{.output.parameters.end}'
      plugin:
        keptn: {}
    - name: keptn-analysis
      inputs:
        parameters:
          - name: query
          - name: start
          - name: end
      outputs:
        parameters:
          - name: result
            valueFrom:
              jsonPath: "{.output.parameters.result}"
          - name: details
            valueFrom:
              jsonPath: "{.output.parameters.details}"
      plugin:
        keptn: {}
    - name: whalesay
      inputs:
        parameters:
          - name: start
          - name: end
          - name: result
      container:
        image: docker/whalesay:latest
        command: [sh, -c]
        args: ["cowsay 'Analysis result for timeframe {{inputs.parameters.start}} - {{inputs.parameters.end}}: {{inputs.parameters.result}}'"]
    - name: print-json
      inputs:
        parameters:
          - name: details
      container:
        image: alpine:latest
        command: [sh, -c]
        args: ["echo '{{inputs.parameters.details}}' | tee /tmp/result.json"]
      outputs:
        artifacts:
          - name: result
            path: /tmp/result.json
            archive:
              none: {}
  ttlStrategy:
    secondsAfterCompletion: 300
  podGC:
    strategy: OnPodCompletion
```
This workflow contains the following steps:

- execute-load-tests: Starts a new instance of a `KeptnTask` based on the `loadtests` `KeptnTaskDefinition` we created earlier. The result of this step contains the `start` and `end` timestamps of the executed load tests.
- keptn-analysis: Runs after the previous node has completed, and accepts the reported `start` and `end` timestamps as input parameters. This interval is used to create a new instance of an `Analysis`, in which the response time and error rate during this interval are evaluated using the `AnalysisDefinition` created earlier. The overall result (i.e. whether the `Analysis` has passed or not), as well as a detailed breakdown of each objective in JSON format, is reported back to Argo in the results of the step.
- print-result: Takes both the `start` and `end` timestamps of the load tests, as well as the overall `result` of the `Analysis`, as input parameters, and prints a message containing the result.
- print-json: Takes the JSON object containing the analysis objective breakdown of the `keptn-analysis` step as an input parameter and stores the received JSON object as an artifact. This enables other nodes in a workflow to retrieve that artifact and use the information in it as input, as shown in the sketch after this list. Read more about the concept of artifacts in Argo Workflows here.
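To illustrate that last point, here is a minimal sketch of what a follow-up node consuming the stored artifact could look like; both the additional DAG task and the template are hypothetical and not part of the example workflow:

```yaml
# hypothetical additional task in the "main" DAG, wired to print-json
- name: process-analysis
  template: process-analysis
  dependencies:
    - print-json
  arguments:
    artifacts:
      - name: analysis-json
        from: "{{tasks.print-json.outputs.artifacts.result}}"

# hypothetical template that receives the artifact as a file
- name: process-analysis
  inputs:
    artifacts:
      - name: analysis-json
        path: /tmp/analysis.json
  container:
    image: alpine:latest
    command: [sh, -c]
    # any further processing of the analysis breakdown would go here
    args: ["cat /tmp/analysis.json"]
```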
Executing the workflow
The workflow is triggered by navigating to the Argo Workflows UI and choosing the Submit new Workflow option:
After some time, all steps in the workflow are completed, and both the time frame of the load test execution and the result of the `Analysis` are visible in the Argo Workflows UI:
Conclusion
In this article we have explored the potential of using Keptn to both perform tasks and analyse performance metrics within Argo Workflows, and to pass the results on to subsequent steps. Hopefully this article provides you with some inspiration for how you can make use of Keptn within Argo Workflows. If so, we would love to hear about it, and we always welcome feedback. If you have questions or run into any kind of issues, feel free to reach out on the #keptn CNCF Slack channel or by raising issues in our GitHub repository.