Using Keptn Analyses in Argo Workflows
In this blog post we will explore how Keptn can be used within Argo Workflows to execute load tests using a `KeptnTask`, and to analyse their results afterwards using the Keptn Analysis feature. Argo Workflows is a workflow orchestration engine built for Kubernetes, which makes it a perfect fit to be used together with Keptn. We will achieve this by using the support for Executor Plugins provided by Argo Workflows. This plugin mechanism is an easy way of integrating custom functionality into an Argo Workflow and of providing further nodes in the workflow with valuable information, such as the outcome of a Keptn Analysis.
Technologies used for this example
For this, we are using the following technologies:
- Keptn: its `Analysis` resource defines goals for our metrics and evaluates them.
- Argo Workflows: the workflow engine running our example.
- Prometheus: provides monitoring data for the application.
Architecture
The overall architecture of this example is depicted in the diagram below:
Our example workflow consists of two nodes, which do the following:

- The Load Tests node creates a `KeptnTask` that executes load tests against a sample application. Once the tests have finished, the results and the time frame of the `KeptnTask` execution are reported back to Argo Workflows.
- The Analyse Results node takes the time frame of the executed load tests as an input parameter and creates a Keptn `Analysis` for that time frame. Using a `KeptnMetricsProvider`, which retrieves the metrics for the application from Prometheus, the metric values relevant for the analysis are fetched and evaluated. The result of the `Analysis` is then reported back to Argo Workflows, which can pass it on to other nodes in the workflow.
Both of the workflow nodes are executed using a simple Argo Workflow executor plugin hosted in this repo. Note, however, that the code in this repo is just a proof of concept and not intended for use in production.
Setting up the environment
In this example, we assume that both Keptn and Argo Workflows are already installed on our cluster. If this is not the case and you would like to follow along, please follow the respective installation guides for Keptn and Argo Workflows.
The next step is to install the Keptn extension for Argo Workflows (an unofficial PoC implementation). This is done by applying the `ConfigMap` that enables the plugin within Argo Workflows, as well as the RBAC configuration required for the plugin to be able to interact with Keptn resources:

```shell
kubectl apply -f https://raw.githubusercontent.com/bacherfl/argo-keptn-plugin/main/config/keptn-executor-plugin-configmap.yaml
kubectl apply -f https://raw.githubusercontent.com/bacherfl/argo-keptn-plugin/main/config/rbac.yaml
```
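For background: Argo Workflows discovers executor plugins through a `ConfigMap` that is labelled as an `ExecutorPlugin` and describes the sidecar container the plugin runs in. The real manifest lives in the plugin repo linked above; the following is only a rough sketch of that general shape, with the name, image, and port being illustrative placeholders:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  # hypothetical name; the actual manifest is in the plugin repo
  name: keptn-executor-plugin
  namespace: argo
  labels:
    # this label tells Argo Workflows to treat the ConfigMap as an executor plugin
    workflows.argoproj.io/configmap-type: ExecutorPlugin
data:
  sidecar.automountServiceAccountToken: "true"
  # the plugin runs as a sidecar next to each workflow pod that uses it
  sidecar.container: |
    name: keptn-plugin
    image: example.com/argo-keptn-plugin:latest  # illustrative image reference
    ports:
      - containerPort: 4355
```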
Defining the load test KeptnTaskDefinition
The load tests against our sample application are executed using a `KeptnTask`. The related `KeptnTaskDefinition` looks as follows:
```yaml
apiVersion: lifecycle.keptn.sh/v1
kind: KeptnTaskDefinition
metadata:
  name: loadtests
  namespace: simple-go
spec:
  # spec.container takes a single container spec, not a list
  container:
    name: loadtests
    # "curl" alone is not a resolvable image; curlimages/curl is the
    # official curl image on Docker Hub
    image: curlimages/curl
    command: ["sh", "-c"]
    args:
      - |
        for i in $(seq 1 600); do
          curl -s http://simple-go-service.simple-go:8080
          sleep 0.1
        done
```
This task simply creates some load by sending curl requests to the sample application for roughly one minute (600 requests with a 0.1 second pause in between).
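Each run of the load tests is an instance of a `KeptnTask` referencing this definition. In our setup the plugin creates these instances for us, but for illustration, a manually created `KeptnTask` might look roughly like the following sketch (the name and context values are made up, not taken from the plugin):

```yaml
apiVersion: lifecycle.keptn.sh/v1
kind: KeptnTask
metadata:
  # hypothetical name; the plugin generates names for the tasks it creates
  name: loadtests-run-1
  namespace: simple-go
spec:
  # reference to the KeptnTaskDefinition defined above
  taskDefinition: loadtests
  context:
    # illustrative context values describing the application under test
    appName: simple-go
    appVersion: "1.0.0"
    objectType: ""
    taskType: ""
    workloadName: simple-go-service
    workloadVersion: "1.0.0"
```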
Defining the AnalysisValueTemplates
Now we are going to define the queries for the metrics we would like to analyse:
- The response time of the demo service
- The error rate of the demo service
Below are the `AnalysisValueTemplate` resources, as well as the `KeptnMetricsProvider` resource that points to the Prometheus API inside our cluster. The `{{.workload}}` placeholder in the queries is resolved at runtime from the arguments passed to an `Analysis`:
```yaml
apiVersion: metrics.keptn.sh/v1
kind: KeptnMetricsProvider
metadata:
  name: my-provider
  namespace: simple-go
spec:
  targetServer: https://prometheus-k8s.monitoring.svc.cluster.local:9090
  type: prometheus
---
apiVersion: metrics.keptn.sh/v1
kind: AnalysisValueTemplate
metadata:
  name: response-time-p95
  namespace: simple-go
spec:
  provider:
    name: my-provider
  query: "histogram_quantile(0.95, sum by(le) (rate(http_server_request_latency_seconds_bucket{job='{{.workload}}'}[1m])))"
---
apiVersion: metrics.keptn.sh/v1
kind: AnalysisValueTemplate
metadata:
  name: error-rate
  namespace: simple-go
spec:
  provider:
    name: my-provider
  query: "rate(http_requests_total{status_code='500', job='{{.workload}}'}[1m]) or on() vector(0)"
```
Next, we define the `AnalysisDefinition` resource that contains the goals for the metrics mentioned above:
```yaml
apiVersion: metrics.keptn.sh/v1
kind: AnalysisDefinition
metadata:
  name: my-analysis-definition
  namespace: simple-go
spec:
  objectives:
    - analysisValueTemplateRef:
        name: response-time-p95
      keyObjective: false
      target:
        failure:
          greaterThan:
            fixedValue: 500m
        warning:
          greaterThan:
            fixedValue: 300m
      weight: 1
    - analysisValueTemplateRef:
        name: error-rate
      keyObjective: true
      target:
        failure:
          greaterThan:
            fixedValue: 0
      weight: 1
  totalScore:
    passPercentage: 60
    warningPercentage: 50
```
Note that the `AnalysisDefinition` used in this example is kept rather simple: the p95 response time objective fails above 500ms (the quantity `500m`, i.e. 0.5 seconds) and warns above 300ms, while the error rate is a key objective that fails as soon as its value exceeds zero. If you would like to learn more about the possibilities of the analysis feature of Keptn, feel free to read more about it in this blog post.
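For context, the `Analysis` resources created by the plugin during the workflow reference this `AnalysisDefinition` and supply the time frame of the load tests as well as the `workload` argument used in the query templates. A manually created instance would look roughly like this sketch (name and timestamps are illustrative):

```yaml
apiVersion: metrics.keptn.sh/v1
kind: Analysis
metadata:
  # hypothetical name; the plugin generates names for the analyses it creates
  name: analysis-loadtests-run-1
  namespace: simple-go
spec:
  analysisDefinition:
    name: my-analysis-definition
  # time frame of the load test run; illustrative timestamps
  timeframe:
    from: "2025-01-01T10:00:00Z"
    to: "2025-01-01T10:01:00Z"
  # resolves the {{.workload}} placeholder in the AnalysisValueTemplates
  args:
    workload: simple-go-service
```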
Putting it all together
Now that we have defined our Keptn resources for executing load tests and analysing the performance of our application, it is time to put everything together by defining the Argo Workflow, which looks like the following:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: test-123
  namespace: argo
  labels:
    example: 'true'
spec:
  arguments:
    parameters:
      - name: workload
        value: simple-go-service
  entrypoint: main
  templates:
    - name: main
      dag:
        tasks:
          - name: execute-load-tests
            template: keptn-loadtests
          - name: analyze
            template: keptn-analysis
            dependencies:
              - execute-load-tests
            arguments:
              parameters:
                - name: query
                  value: analysis/simple-go/my-analysis-definition/1m/workload=simple-go-service
                - name: start
                  value: "{{tasks.execute-load-tests.outputs.parameters.start}}"
                - name: end
                  value: "{{tasks.execute-load-tests.outputs.parameters.end}}"
          - name: print-result
            template: whalesay
            dependencies:
              - analyze
              - execute-load-tests
            arguments:
              parameters:
                - name: start
                  value: "{{tasks.execute-load-tests.outputs.parameters.start}}"
                - name: end
                  value: "{{tasks.execute-load-tests.outputs.parameters.end}}"
                - name: result
                  value: "{{tasks.analyze.outputs.parameters.result}}"
          - name: print-json
            template: print-json
            dependencies:
              - analyze
            arguments:
              parameters:
                - name: details
                  value: "{{tasks.analyze.outputs.parameters.details}}"
    - name: keptn-loadtests
      inputs:
        parameters:
          - name: query
            value: keptntask/simple-go/post-deployment-loadtests
      outputs:
        parameters:
          - name: start
            valueFrom:
              jsonPath: '{.output.parameters.start}'
          - name: end
            valueFrom:
              jsonPath: '{.output.parameters.end}'
      plugin:
        keptn: {}
    - name: keptn-analysis
      inputs:
        parameters:
          - name: query
          - name: start
          - name: end
      outputs:
        parameters:
          - name: result
            valueFrom:
              jsonPath: "{.output.parameters.result}"
          - name: details
            valueFrom:
              jsonPath: "{.output.parameters.details}"
      plugin:
        keptn: {}
    - name: whalesay
      inputs:
        parameters:
          - name: start
          - name: end
          - name: result
      container:
        image: docker/whalesay:latest
        command: [sh, -c]
        args: ["cowsay 'Analysis result for timeframe {{inputs.parameters.start}} - {{inputs.parameters.end}}: {{inputs.parameters.result}}'"]
    - name: print-json
      inputs:
        parameters:
          - name: details
      container:
        image: alpine:latest
        command: [sh, -c]
        args: ["echo '{{inputs.parameters.details}}' | tee /tmp/result.json"]
      outputs:
        artifacts:
          - name: result
            path: /tmp/result.json
            archive:
              none: {}
  ttlStrategy:
    secondsAfterCompletion: 300
  podGC:
    strategy: OnPodCompletion
```
This workflow contains the following steps:

- execute-load-tests: Starts a new instance of a `KeptnTask` based on the `loadtests` `KeptnTaskDefinition` we created earlier. The result of this step contains the `start` and `end` timestamps of the executed load tests.
- keptn-analysis: Runs after the previous node has completed, and accepts the reported `start` and `end` timestamps as input parameters. This interval is used to create a new instance of an `Analysis`, in which the response time and error rate during this interval are evaluated using the `AnalysisDefinition` created earlier. The overall result (i.e. whether the `Analysis` has passed or not), as well as a detailed breakdown of each objective in JSON format, is reported back to Argo in the results of the step.
- print-result: Takes both the `start` and `end` timestamps of the load tests, as well as the overall `result` of the `Analysis`, as input parameters, and prints a message containing the result.
- print-json: Takes the JSON object containing the analysis objective breakdown of the `keptn-analysis` step as an input parameter and stores the received JSON object as an artifact. This enables other nodes in a workflow to retrieve that artifact and use the information in it as input, as shown in the sketch after this list. Read more about the concept of artifacts in Argo Workflows here.
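To illustrate that last point, here is a minimal sketch of what a follow-up node consuming the stored artifact could look like; both the additional DAG task and the template are hypothetical and not part of the example workflow:

```yaml
# hypothetical additional task in the "main" DAG, wired to print-json
- name: process-analysis
  template: process-analysis
  dependencies:
    - print-json
  arguments:
    artifacts:
      - name: analysis-json
        from: "{{tasks.print-json.outputs.artifacts.result}}"

# hypothetical template that receives the artifact as a file
- name: process-analysis
  inputs:
    artifacts:
      - name: analysis-json
        path: /tmp/analysis.json
  container:
    image: alpine:latest
    command: [sh, -c]
    # any further processing of the analysis breakdown would go here
    args: ["cat /tmp/analysis.json"]
```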
Executing the workflow
The workflow is triggered by navigating to the Argo Workflows UI and choosing the Submit new Workflow option:
After some time, all steps in the workflow are completed, and both the time frame of the load test execution and the result of the `Analysis` are visible in the Argo Workflows UI:
Conclusion
In this article we have explored the potential of using Keptn to both perform tasks and analyse performance metrics within Argo Workflows, and to pass the results on to subsequent steps. Hopefully this article provides you with some inspiration for how you can make use of Keptn within Argo Workflows. If so, we would love to hear about it, and we always welcome feedback. If you have questions or run into any kind of issues, feel free to reach out on the #keptn CNCF Slack channel or by raising issues in our GitHub repository.