Op Pack Tutorial

A step-by-step example of using Shoreline's Terraform Provider to create a simple Op Pack.

Prerequisites

Install Terraform

Terraform is available as a binary package for most operating systems. Follow these steps to install Terraform on your local machine.

  1. Visit the official downloads page

  2. Download the appropriate package for your operating system and architecture

  3. Unzip the downloaded file

    unzip terraform_1.0.2_linux_amd64.zip
    
  4. Move the terraform binary to /usr/local/bin/ for Mac/Linux systems:

    mv ~/Downloads/terraform /usr/local/bin/
    

    On Windows, move the terraform binary to a directory defined in your PATH system variable.

  5. Verify the installation

    $ terraform --version
    Terraform v1.0.2
    on linux_amd64
    

Install the Op CLI

You'll need a local Op CLI installation to authenticate with your Shoreline cluster.

Authenticating

You'll need to authenticate with the desired cluster via the Op CLI before the Shoreline Provider can function.

The Op auth command handles authentication from within the CLI.

Start by opening the Op CLI:

oplang_cli

To authenticate with a Shoreline cluster, use the auth <url> command, where <url> is the location of your Shoreline cluster API:

op>
auth https://acme.us.api.shoreline-acme.io

This command opens a browser window and authenticates with the identity provider configured with your Shoreline environment, e.g., Okta.

After you authenticate, the CLI prompt indicates which Shoreline cluster you've connected to:

op:acme-test-shoreline>

The .ops_auth.yaml file locally stores all Op auth tokens and their respective Shoreline API endpoints.

By default, it is located in /home/<username>/.ops_auth.yaml.

auth:
  - Token: <token>
    Url: https://acme.us.api.shoreline-acme.io
  - Token: <token>
    Url: https://acme2.us.api.shoreline-acme.io
token: <token>
url: https://acme.us.api.shoreline-acme.io

Alternatively, you can manually define the Shoreline cluster URL and auth token by setting SHORELINE_URL and SHORELINE_TOKEN environment variables.

$ export SHORELINE_URL=https://acme.us.api.shoreline-acme.io
$ export SHORELINE_TOKEN=<token>
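
Before running Terraform, it can help to confirm that both variables are actually set in the current shell. The following is a minimal sketch, not part of the Shoreline tooling; the missing_env helper name is hypothetical.

```shell
# Hypothetical helper: print the name of every listed environment
# variable that is unset or empty, one per line.
missing_env() {
  for name in "$@"; do
    # Portable indirect lookup of the variable called $name (POSIX sh).
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}
```

Calling missing_env SHORELINE_URL SHORELINE_TOKEN prints nothing when both variables are set, so an empty result means the environment is ready.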

Create a Configuration File

Terraform configurations can be complex and span multiple files, but the simplest configuration is a single .tf file, typically named main.tf.

  1. Create a new main.tf file

  2. Add the following content to the main.tf file:

    terraform {
      required_providers {
        shoreline = {
          source  = "shorelinesoftware/shoreline"
          version = ">= 1.0.7"
        }
      }
    }
    
    provider "shoreline" {
      # Set to the Shoreline cluster API URL, e.g. https://acme.us.api.shoreline-acme.io
      url = "<CLUSTER_API_ENDPOINT>"
    }
    
  3. Replace <CLUSTER_API_ENDPOINT> with your Shoreline cluster URL

    This is the starting point of your Shoreline Terraform configuration. You'll add to it in each step to build up a collection of Shoreline remediation objects.

Define a Variable

Terraform variable objects allow you to easily customize aspects of your Terraform configuration. Many Op Packs make heavy use of Terraform variables.

For now, add a new variable object to your main.tf file:

variable "prefix" {
  description = "Prefix used to make object names unique."
  default     = "foo"
  type        = string
}

We'll use this value to prefix some of the generated objects so they're easily distinguishable throughout Shoreline.
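
The default of foo can be overridden without editing main.tf. Terraform automatically loads a terraform.tfvars file from the working directory, so one option is to generate that file. The sketch below assumes a hypothetical write_prefix_tfvars helper and an example value of bar.

```shell
# Hypothetical helper: write a terraform.tfvars file that overrides "prefix".
# $1 = prefix value
# $2 = directory containing main.tf (defaults to the current directory)
write_prefix_tfvars() {
  dir="${2:-.}"
  printf 'prefix = "%s"\n' "$1" > "$dir/terraform.tfvars"
}
```

After running write_prefix_tfvars bar, generated objects would be named bar_... instead of foo_.... Setting the TF_VAR_prefix environment variable or passing -var "prefix=bar" on the command line are equivalent alternatives.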

Create a Metric

Making complex metric queries in Op is fairly simple, but it can be a pain to repeat the same piped statements over and over. This is where creating new named Metrics comes into play.

For example, consider the following Op statement:

op>
host | cpu_usage | window(60s) | mean(60)

 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 1  | HOST | i-0023d0079f68f135c | us-west-2 | us-west-2b | 2021/06/21 13:09:54 |     10.60
 2  | HOST | i-09f02e8f8c61b7ed1 | us-west-2 | us-west-2c | 2021/06/21 13:09:54 |     11.89
 3  | HOST | i-0e24bba6b126996a8 | us-west-2 | us-west-2c | 2021/06/21 13:09:54 |     11.94
 4  | HOST | i-0d3387a577ce97721 | us-west-2 | us-west-2b | 2021/06/21 13:09:54 |     10.92
 5  | HOST | i-05931c4a764484de3 | us-west-2 | us-west-2b | 2021/06/21 13:09:54 |     13.80
 6  | HOST | i-056c982a4438c424b | us-west-2 | us-west-2a | 2021/06/21 13:09:54 |     11.07
 7  | HOST | i-06aa45e1e35228685 | us-west-2 | us-west-2c | 2021/06/21 13:09:54 |     14.48
 8  | HOST | i-04044936c39eda37d | us-west-2 | us-west-2a | 2021/06/21 13:09:54 |     10.99
 9  | HOST | i-0a9da380d1e52cd8f | us-west-2 | us-west-2a | 2021/06/21 13:09:54 |     10.92
 10 | HOST | i-0e337ff0fc470d7fd | us-west-2 | us-west-2a | 2021/06/21 13:09:54 |     12.65

This returns the average CPU usage over the previous minute for all host Resources.

To create a named Metric that achieves the same thing, you can use the metric <name> = <value> syntax:

op>
metric avg_cpu_usage_prev_min = cpu_usage | window(60s) | mean(60)

However, an even easier solution is to create the Metric using the Shoreline Terraform Provider.

To create this same Metric in Terraform, define a shoreline_metric Terraform resource in the main.tf file:

resource "shoreline_metric" "avg_cpu_usage_prev_min" {
  name = "avg_cpu_usage_prev_min"
  value = "cpu_usage | window(60s) | mean(60)"
  description = "Average CPU usage within previous minute."
}

You can now directly reference this Metric within your Terraform configuration by its string-interpolated name, i.e. ${shoreline_metric.avg_cpu_usage_prev_min.name}.

Create an Alarm

Alarms trigger remediation Actions when a metric or custom status check fails. The condition that fires an Alarm is defined in its fire_query property.

op>
alarm <alarm_name> = <fire_query>

You can use built-in or custom Metrics within your fire_query and clear_query statements. For example, this is how you might create an Alarm in Op that uses the avg_cpu_usage_prev_min Metric that you created above:

op>
alarm avg_cpu_usage_prev_min = avg_cpu_usage_prev_min >= 12.5

To create this same Alarm in Terraform, define a shoreline_alarm Terraform resource in the main.tf file:

resource "shoreline_alarm" "high_cpu_alarm" {
  name = "${var.prefix}_high_cpu_alarm"
  fire_query = "${shoreline_metric.avg_cpu_usage_prev_min.name} >= 12.5"
  clear_query = "${shoreline_metric.avg_cpu_usage_prev_min.name} < 12.5"
  description = "Watch CPU usage."
  resource_query = "host"
  enabled = true
  resolve_short_template = "high_cpu_alarm resolved"
}

The fire_query property should match the metric query used in Op. Similarly, the clear_query defines the condition that resolves the Alarm. In this case, the Alarm clears once the averaged CPU usage drops below 12.5%.
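
To make the relationship between the two queries concrete, here is a rough shell sketch of the threshold logic only. It does not reflect how Shoreline evaluates queries internally; the alarm_state helper is hypothetical, and the sample values come from the CPU_USAGE column shown earlier.

```shell
# Hypothetical helper: classify one averaged CPU reading against a threshold.
# $1 = reading, $2 = threshold
# Prints "fire" when reading >= threshold (the fire_query condition),
# otherwise "clear" (the clear_query condition).
alarm_state() {
  awk -v v="$1" -v t="$2" 'BEGIN {
    # Adding 0 forces a numeric rather than string comparison.
    if (v + 0 >= t + 0) print "fire"; else print "clear"
  }'
}
```

With a threshold of 12.5, alarm_state 13.80 12.5 reports fire and alarm_state 10.60 12.5 reports clear. Because the two conditions are exact complements, every reading falls into one state or the other.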

Create an Action

The next step is to create an Action that executes when the Alarm is fired.

For this tutorial we'll avoid any potentially harmful code, but the same principles apply no matter what sort of command you're executing. For now, add the following shoreline_action Terraform resource definition to your main.tf file:

resource "shoreline_action" "ls_action" {
  name = "${var.prefix}_ls_action"
  command = "`ls /tmp`"
  description = "List /tmp files"
  resource_query = "host"
  start_title_template    = "`ls /tmp` action started"
  complete_title_template = "`ls /tmp` action completed"
  error_title_template    = "`ls /tmp` action failed"
  enabled = true
}

Here, we're merely listing the contents of the /tmp directory, but the command property can be any valid shell command.

Create a Bot

The next Shoreline object to create is a Bot, which ensures that the associated Action executes when the Alarm is fired.

Once again, add a new shoreline_bot Terraform resource to your main.tf file:

resource "shoreline_bot" "cpu_bot" {
  name = "${var.prefix}_cpu_bot"
  command = "if ${shoreline_alarm.high_cpu_alarm.name} then ${shoreline_action.ls_action.name} fi"
  description = "Act on CPU usage."
  enabled = true
}

The command property specifies the Alarm and Action that are connected by this Bot. It uses Terraform's built-in string interpolation to evaluate the name of both the Alarm and Action.

In this case, the shoreline_alarm Terraform resource called high_cpu_alarm determines when the shoreline_action Terraform resource called ls_action is executed.
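
As an illustration of what the interpolation produces, the following shell sketch builds the same command string for a given prefix. The bot_command helper is hypothetical and only mimics the string substitution; Terraform performs the real interpolation.

```shell
# Hypothetical helper: render the Bot command for a given prefix,
# mirroring "if <alarm name> then <action name> fi".
# $1 = value of var.prefix
bot_command() {
  prefix="$1"
  printf 'if %s_high_cpu_alarm then %s_ls_action fi\n' "$prefix" "$prefix"
}
```

For the default prefix, bot_command foo prints if foo_high_cpu_alarm then foo_ls_action fi.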

Create a File Transfer

As part of this Op Pack we want to transfer the local books.json file to /tmp/books.json. This is accomplished using the shoreline_file Terraform resource:

resource "shoreline_file" "transfer_books_file" {
  name = "${var.prefix}_transfer_books_file"
  input_file = "${path.module}/books.json"
  destination_path = "/tmp/books.json"
  resource_query = "host"
  description = "Transfer `books.json` file."
  enabled = true
}

Create a Resource

Next, create a named Resource, similar to the named Metric generated in Create a Metric. Use the shoreline_resource Terraform resource block to define this named Resource.

resource "shoreline_resource" "books" {
  name = "${var.prefix}_books"
  description = "Pods with books app."
  value = "host | pod | app='bookstore'"
}

The value property defines the Op resource query: in this case, all pod Resources whose app label is bookstore. Instead of explicitly using host | pod | app='bookstore', you can now reference those Resources by the new foo_books name.

Create a Copyable File

For this bookstore Op Pack example, we're copying the local books.json file to the /tmp directory of all host Resources.

Here is the sample content of our books.json file:

[
  {
    "title": "The Name of the Wind",
    "author": "Patrick Rothfuss",
    "pages": 722
  },
  { "title": "Fahrenheit 451", "author": "Ray Bradbury", "pages": 158 },
  { "title": "The Stand", "author": "Stephen King", "pages": 823 },
  { "title": "Moby Dick", "author": "Herman Melville", "pages": 378 }
]
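
Because the shoreline_file resource reads books.json from the module directory, it can be worth checking that the file is in place before planning. This is a minimal preflight sketch; the check_input_file helper is hypothetical and not part of Terraform or Shoreline.

```shell
# Hypothetical helper: confirm that an input file exists and is non-empty.
# $1 = path to the file
# Prints the path and its size in bytes on success; returns 1 otherwise.
check_input_file() {
  if [ -s "$1" ]; then
    printf 'ok: %s (%s bytes)\n' "$1" "$(wc -c < "$1" | tr -d ' ')"
  else
    printf 'missing or empty: %s\n' "$1" >&2
    return 1
  fi
}
```

Running check_input_file ./books.json from the directory that contains main.tf gives a quick signal before Terraform tries to read the file.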

That's it! You've now defined a complete remediation loop using the Shoreline Terraform Provider.

Initialize Terraform

With your Terraform configuration complete, the next step is to initialize Terraform to download the Shoreline Terraform Provider.

$ terraform init

Initializing the backend...

Initializing provider plugins...
- Finding shoreline.io/terraform/shoreline versions matching ">= 1.0.0"...
- Installing shoreline.io/terraform/shoreline v1.0.0...
- Installed shoreline.io/terraform/shoreline v1.0.0

Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

Create a Plan

Next, create a plan and review the Shoreline objects your Terraform configuration is creating.

$ terraform plan -out shoreline.tfplan

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated
with the following symbols:
  + create

Terraform will perform the following actions:

  # shoreline_action.ls_action will be created
  + resource "shoreline_action" "ls_action" {
      + command                 = "`ls /tmp`"
      + complete_title_template = "`ls /tmp` action completed"
      + description             = "List /tmp files"
      + enabled                 = true
      + error_title_template    = "`ls /tmp` action failed"
      + id                      = (known after apply)
      + name                    = "foo_ls_action"
      + resource_query          = "host"
      + start_title_template    = "`ls /tmp` action started"
      + timeout                 = 60
      + type                    = (known after apply)
    }

  # shoreline_alarm.high_cpu_alarm will be created
  + resource "shoreline_alarm" "high_cpu_alarm" {
      + clear_query            = "avg_cpu_usage_prev_min < 12.5"
      + description            = "Watch CPU usage."
      + enabled                = true
      + family                 = "custom"
      + fire_query             = "avg_cpu_usage_prev_min >= 12.5"
      + id                     = (known after apply)
      + name                   = "foo_high_cpu_alarm"
      + resolve_short_template = "high_cpu_alarm resolved"
      + resource_query         = "host"
      + type                   = (known after apply)
    }

  # shoreline_bot.cpu_bot will be created
  + resource "shoreline_bot" "cpu_bot" {
      + command     = "if foo_high_cpu_alarm then foo_ls_action fi"
      + description = "Act on CPU usage."
      + enabled     = true
      + family      = "custom"
      + id          = (known after apply)
      + name        = "foo_cpu_bot"
      + type        = (known after apply)
    }

  # shoreline_file.transfer_books_file will be created
  + resource "shoreline_file" "transfer_books_file" {
      + checksum         = (known after apply)
      + description      = "Transfer `books.json` file."
      + destination_path = "/tmp/books.json"
      + enabled          = true
      + file_data        = (known after apply)
      + file_length      = (known after apply)
      + id               = (known after apply)
      + input_file       = "./books.json"
      + name             = "foo_transfer_books_file"
      + resource_query   = "host"
      + type             = (known after apply)
    }

  # shoreline_metric.avg_cpu_usage_prev_min will be created
  + resource "shoreline_metric" "avg_cpu_usage_prev_min" {
      + description = "Average CPU usage within previous minute."
      + id          = (known after apply)
      + name        = "avg_cpu_usage_prev_min"
      + type        = (known after apply)
      + value       = "cpu_usage | window(60s) | mean(60)"
    }

  # shoreline_resource.books will be created
  + resource "shoreline_resource" "books" {
      + description = "Pods with books app."
      + id          = (known after apply)
      + name        = "foo_books"
      + type        = (known after apply)
      + value       = "host | pod | app='bookstore'"
    }

Plan: 6 to add, 0 to change, 0 to destroy.

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Saved the plan to: shoreline.tfplan

To perform exactly these actions, run the following command to apply:
    terraform apply "shoreline.tfplan"

We've saved the plan to a local shoreline.tfplan file so that the next step can apply exactly the changes reviewed here.

Execute the Plan

$ terraform apply shoreline.tfplan

shoreline_action.ls_action: Creating...
shoreline_metric.avg_cpu_usage_prev_min: Creating...
shoreline_file.transfer_books_file: Creating...
shoreline_resource.books: Creating...
shoreline_resource.books: Creation complete after 2s [id=foo_books]
shoreline_metric.avg_cpu_usage_prev_min: Creation complete after 2s [id=avg_cpu_usage_prev_min]
shoreline_alarm.high_cpu_alarm: Creating...
shoreline_file.transfer_books_file: Creation complete after 4s [id=foo_transfer_books_file]
shoreline_action.ls_action: Creation complete after 4s [id=foo_ls_action]
shoreline_alarm.high_cpu_alarm: Creation complete after 3s [id=foo_high_cpu_alarm]
shoreline_bot.cpu_bot: Creating...
shoreline_bot.cpu_bot: Creation complete after 3s [id=foo_cpu_bot]

Apply complete! Resources: 6 added, 0 changed, 0 destroyed.

The Shoreline objects are now created and live!

Confirm Object Creation (Optional)

You can quickly verify that the Shoreline objects were created and are available in your cluster.

  1. Open an Op terminal

  2. Find Alarms named foo_high_cpu_alarm

    op>
    list alarms | name = "foo_high_cpu_alarm"
    TYPE  | NAME               | ENABLED | RESOURCE_QUERY | FIRE_QUERY
    ALARM | foo_high_cpu_alarm | 1       | host           | avg_cpu_usage_prev_min >= 12.5
    
  3. Find Actions named foo_ls_action

    op>
    list actions | name = "foo_ls_action"
    TYPE   | NAME          | ENABLED | RESOURCE_QUERY | PARAMS | COMMAND   | TIMEOUT
    ACTION | foo_ls_action | 1       | host           | []     | `ls /tmp` | 60
    
  4. Find Bots named foo_cpu_bot

    op>
    list bots | name = "foo_cpu_bot"
    TYPE | NAME        | ENABLED | ACTION_STATEMENT | ALARM_STATEMENT
    BOT  | foo_cpu_bot | 1       | foo_ls_action    | foo_high_cpu_alarm
    
  5. (Optional) Check if the foo_high_cpu_alarm Alarm fired using Events

    op>
    events | type="alarm" | name = "foo_high_cpu_alarm"
    STEP_TYPE    | NAME               | TIMESTAMP           | TITLE
    ALARM_FIRE   | foo_high_cpu_alarm | 20:50:44 2021/06/21 | fired foo_high_cpu_alarm
    BOT_DISPATCH | foo_cpu_bot        | 20:50:45 2021/06/21 | dispatched foo_cpu_bot
    ACTION_START | foo_ls_action      | 20:50:45 2021/06/21 | ls /tmp action started
    BOT_END      | foo_cpu_bot        | 20:50:45 2021/06/21 | completed foo_cpu_bot

    In this case, the foo_high_cpu_alarm fired, the foo_cpu_bot was dispatched, the foo_ls_action was executed, and the foo_cpu_bot halted.

Destroy Shoreline Objects

The final step is to remove the created Shoreline objects using Terraform's destroy command.

$ terraform destroy

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  - destroy

Terraform will perform the following actions:

  # shoreline_action.ls_action will be destroyed
  - resource "shoreline_action" "ls_action" {
      - command                 = "`ls /tmp`" -> null
      - complete_title_template = "`ls /tmp` action completed" -> null
      - description             = "List /tmp files" -> null
      - enabled                 = true -> null
      - error_title_template    = "`ls /tmp` action failed" -> null
      - id                      = "foo_ls_action" -> null
      - name                    = "foo_ls_action" -> null
      - params                  = [] -> null
      - resource_query          = "host" -> null
      - start_title_template    = "`ls /tmp` action started" -> null
      - timeout                 = 60 -> null
      - type                    = "ACTION" -> null
    }

  # shoreline_alarm.high_cpu_alarm will be destroyed
  - resource "shoreline_alarm" "high_cpu_alarm" {
      - clear_query            = "avg_cpu_usage_prev_min < 12.5" -> null
      - description            = "Watch CPU usage." -> null
      - enabled                = true -> null
      - family                 = "custom" -> null
      - fire_query             = "avg_cpu_usage_prev_min >= 12.5" -> null
      - fire_title_template    = "fired foo_high_cpu_alarm" -> null
      - id                     = "foo_high_cpu_alarm" -> null
      - name                   = "foo_high_cpu_alarm" -> null
      - resolve_short_template = "high_cpu_alarm resolved" -> null
      - resolve_title_template = "cleared foo_high_cpu_alarm" -> null
      - resource_query         = "host" -> null
      - type                   = "ALARM" -> null
    }

  # shoreline_bot.cpu_bot will be destroyed
  - resource "shoreline_bot" "cpu_bot" {
      - command     = "if foo_high_cpu_alarm then foo_ls_action fi" -> null
      - description = "Act on CPU usage." -> null
      - enabled     = true -> null
      - family      = "custom" -> null
      - id          = "foo_cpu_bot" -> null
      - name        = "foo_cpu_bot" -> null
      - type        = "BOT" -> null
    }

  # shoreline_file.transfer_books_file will be destroyed
  - resource "shoreline_file" "transfer_books_file" {
      - checksum         = "e3302eff9a2d4d0bddef67f36d5095aa" -> null
      - description      = "Transfer `books.json` file." -> null
      - destination_path = "/tmp/books.json" -> null
      - enabled          = true -> null
      - file_data        = "KLUv/UQAQgDdBQAiyyUjQIk4AciultK4ggo3SEP9cBMLUK9zXFnMbRt3rpg0R8BwgQmLKSwW+eMw61xbcBv9qSMuQUPRBF4lYMQsD+O1gr/lx/ZBetqHLyOeRkmyilRC0hjwSg1JKg3Auy2/jawDkMi/2LI4h3Jahj6miVpwQ8MbihZbD/JQRshNaZLSawhf7i/g16EGX0bkoVzXrfP/F4P8iyUEDAAQw1UgS5A44CgSsmwiuVBsShZ8yWCXV3dpVBlMmdIDB3W6+Q==" -> null
      - file_length      = 322 -> null
      - id               = "foo_transfer_books_file" -> null
      - input_file       = "./books.json" -> null
      - name             = "foo_transfer_books_file" -> null
      - resource_query   = "host" -> null
      - type             = "FILE" -> null
    }

  # shoreline_metric.avg_cpu_usage_prev_min will be destroyed
  - resource "shoreline_metric" "avg_cpu_usage_prev_min" {
      - description = "Average CPU usage within previous minute." -> null
      - id          = "avg_cpu_usage_prev_min" -> null
      - name        = "avg_cpu_usage_prev_min" -> null
      - type        = "METRIC" -> null
      - value       = "cpu_usage | window ( 60s ) | mean ( 60 )" -> null
    }

  # shoreline_resource.books will be destroyed
  - resource "shoreline_resource" "books" {
      - description = "Pods with books app." -> null
      - id          = "foo_books" -> null
      - name        = "foo_books" -> null
      - type        = "RESOURCE" -> null
      - value       = "host | pod | app = 'bookstore'" -> null
    }

Plan: 0 to add, 0 to change, 6 to destroy.

Do you really want to destroy all resources?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes

shoreline_file.transfer_books_file: Destroying... [id=foo_transfer_books_file]
shoreline_bot.cpu_bot: Destroying... [id=foo_cpu_bot]
shoreline_resource.books: Destroying... [id=foo_books]
shoreline_resource.books: Destruction complete after 0s
shoreline_file.transfer_books_file: Destruction complete after 0s
shoreline_bot.cpu_bot: Destruction complete after 0s
shoreline_action.ls_action: Destroying... [id=foo_ls_action]
shoreline_alarm.high_cpu_alarm: Destroying... [id=foo_high_cpu_alarm]
shoreline_alarm.high_cpu_alarm: Destruction complete after 1s
shoreline_metric.avg_cpu_usage_prev_min: Destroying... [id=avg_cpu_usage_prev_min]
shoreline_action.ls_action: Destruction complete after 1s
shoreline_metric.avg_cpu_usage_prev_min: Destruction complete after 0s

Destroy complete! Resources: 6 destroyed.

With that, all Terraform-generated Shoreline objects are removed and your environment is back to square one.