The trend of declaring infrastructure as code has been picking up steam over the last few years, offering a way for DevOps teams to transparently manage and scale cloud infrastructure. Why should the way we manage monitoring be any different? In this article, we address this point and illustrate it with a practical example of Monitoring-as-Code on Checkly.


Historically, IT infrastructure has been provisioned manually, both on premise and in the cloud. This presented several challenges, including fragmented workflows, lack of transparency and scalability issues. In response to these problems, the last few years have seen a shift to the Infrastructure-as-Code (IaC) paradigm, in which large-scale systems are declared in configuration files as code.

A new generation of tools has emerged to serve this use case, the most notable example of which is HashiCorp Terraform. Terraform provides a CLI workflow allowing users to specify what the final infrastructure setup should look like, and then takes care of all the intermediate steps needed to get there.

Terraform can be used to provision infrastructure on many different cloud vendors thanks to its provider ecosystem: each provider maps to the vendor’s API, exposing different resources in a domain-specific language known as HCL.

Monitoring the IaC way

Setting up monitoring can present some of the same issues as provisioning infrastructure. This becomes apparent when we move past the initial rollout or proof-of-concept and onboard multiple products and/or teams, and see our monitoring setup rapidly grow in scope — along with its maintenance needs, that is.

Monitoring-as-Code learns from IaC and brings your monitoring config closer to your application and your development workflows. How? By having it also declared as code, much like you would do with any kind of IT infrastructure.

Why Monitoring-as-Code

What does one gain when moving from a manual to a Monitoring-as-Code approach? The main advantages are:

  1. Better scalability through faster provisioning and easier maintenance.
  2. Better history and documentation: config files can be checked into source control.
  3. Shared monitoring setup visibility (and easier shared ownership) in DevOps teams.

Monitoring-as-Code with Checkly

Users who have just started out will be familiar with creating checks, groups, alert channels and other resources through the Checkly UI. The official Terraform provider enables them to instead declare exactly what their active monitoring setup should look like, and have it provisioned by Terraform in just a few seconds — regardless of whether that means creating tens, hundreds or thousands of resources.

You can find the Checkly Terraform provider on the official Terraform registry.

Monitoring an e-commerce website — as code

How does this all look like in practice? Let’s find out by creating a small monitoring setup for our demo e-commerce website.

Setting up our Terraform project

For our example we will be creating browser checks using Playwright scripts we have previously written as part of our Playwright guides.

First will be our login scenario:

…then our search scenario…

…and finally our checkout scenario.

Let’s start off by creating a brand new folder:

mkdir checkly-terraform-example && cd $_

To keep things easy, we create a subdirectory…

mkdir scripts

…and copy all our scripts from above into separate files, for example login.js.

Next up, we want to create our file and include the basic configuration as follows:

We are ready to initialise our project and have the Checkly Terraform provider set up for us. That is achieved by running:

terraform init

After a few seconds, you should see a similar message to the following:

Creating our first browser checks

In the same file, right below our initial instructions, we can now add resources one after the other. They will be browser checks based on the Playwright scripts we previously stored in the scripts directory. Here is what each resource could look like:

Login resource:

Search resource:

Checkout resource:

Now that our Terraform project has been initialised and we have added some resources, we can generate a Terraform plan by running terraform plan.

Terraform will determine all the needed changes to be performed to replicate our monitoring configuration on Checkly. In doing so, we will be asked for our Checkly API key, which we can find under our account settings as shown below. Not on Checkly yet? Register a free account and enjoy your free monthly checks!

We can expose this as an environment variable in order to avoid having to copy-paste it all the time: export TF_VAR_checkly_api_key=<YOUR_API_KEY>.

We can now finally apply our changes with terraform apply. We might be asked for one final confirmation in the command prompt, after which we will be greeted by the following confirmation message:

Logging in to our Checkly account, we will see the dashboard has been populated with our three checks, which will soon start executing on their set schedules.

Monitoring API correctness and performance

Browser checks are now there to keep us informed on the status of our key website flows. What about our APIs, though? Whether they make up the foundation of our service or they are consumed directly by the customer, we need to ensure our endpoints are working as expected. This is easily achieved by setting up API check resources:

We can now once more run terraform plan, followed by terraform apply to see the new check on Checkly:


Now that we have our checks in place, we want to set up alerting to ensure we are informed as soon as a failure takes place. Alert channels can be declared as resources, just like the checks. Let’s add the following to our file:

We are setting up things so that we are alerted when our check starts failing, as well as when it recovers. But we still need to decide which checks will subscribe to this channel, and therefore be able to trigger the alerts. This is done by adding the following inside the resource declaration of each check, e.g.:

Going through the usual terraform plan and terraform apply sequence will apply the changes on our Checkly account:

We are now fully up and running with our monitoring-as-code setup. Our checks will run on a schedule, informing us promptly if anything were to go wrong. Rapidly getting to know about failures in our API and key website flows will allow us to react fast and mitigate impact on our users, ensuring a better experience with our product.

You can find the complete setup described in this guide on our dedicated repository.

Expanding our setup

As our setup expands, we might want to deploy additional tools to make our lives easier. We could:

  1. Iterate over existing Playwright scripts and create multiple checks while declaring only one resource.
  2. Group checks together to better handle them in large numbers.
  3. Use code snippets to avoid code duplication and reduce maintenance.
  4. Move your workflow to Terraform Cloud to easily collaborate with your team when managing your Monitoring-as-Code configuration.

banner image: “Electric Grid” by Duncan Rawlinson — is licensed with CC BY-NC 2.0.

Head of Developer Relations @ Checkly, core contributor.