Transform Functions

When monitoring and troubleshooting, we typically work with data focused on a specific period of time. We look for indicators and patterns that give us clues about what led up to and caused an incident. The Op pipe operator (similar to the pipe operator in the Bash shell) lets us filter, aggregate, and transform the results of resource and metric queries.

For example:

op>
host | metric_query(...) | sum(30)
Returns the sum of the values over an aggregation window of 30 data points.

op>
cpu_usage | real_time=true | window(30s)
Gets CPU usage from the last 30 seconds.

When parameters are piped to metric and resource queries, they are viewed as functions; hence, we refer to them as transform functions.  Op has a very robust set of transform functions described in the following sections.

When you apply transform functions to a metric query or defined metric that is interleaved with arithmetic or comparison operators, you must use parentheses to make clear what the operands of the expression are and what each transform function applies to.

For example, instead of:

  • metric_query(...) | sum(1) | window(2s) + metric_query(...) | sum(3) | window(4s)

Group the operands of the + operator in parentheses so that it is explicit what each transform function applies to:

  • (metric_query(...) | sum(1) | window(2s)) + (metric_query(...) | sum(3) | window(4s))

So, if you wanted to then apply a transform function, like mean(5) to the result of the entire arithmetic expression, you would wrap the entire arithmetic expression in parentheses and apply the mean(5) like this:

  • ((metric_query(...) | sum(1) | window(2s)) + (metric_query(...) | sum(3) | window(4s))) | mean(5)

If you want to break the expression into smaller, bite-sized chunks with fewer parentheses, you can create defined metrics for the incremental components of the computation and glue them together, i.e., define the following in the order shown:

  • metric m1 = metric_query(...) | sum(1) | window(2s)
  • metric m2 = metric_query(...) | sum(3) | window(4s)
  • metric m3 = m1 + m2
  • metric m4 = m3 | mean(5)

Time Aggregates

The time aggregate transform functions take 3 parameters:

  • aggregation_window_size
  • window_mode (optional)
  • drop_incomplete (optional)

Aggregation Window Size: Each of these functions takes an aggregation window size (number of data points) as an argument. For example, if timestamps are [0, 1000, 2000, 3000, 4000, 5000] and values are [0, 1, 2, 3, 4, 5] then the sum aggregate with aggregation window size 3 gives timestamps [2000, 5000] and values [3, 12] ([0 + 1 + 2, 3 + 4 + 5]). A user would invoke this function by writing host | metric_query(...) | sum(3) to aggregate with aggregation function sum and aggregation window size 3.
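
The grouping in the worked example above can be sketched in Python. This is only an illustration of the arithmetic, not Op's implementation: the function name `fixed_window_sum` is hypothetical, and the sketch assumes the non-overlapping grouping that the example's output implies, with each result stamped with the last timestamp of its block.

```python
def fixed_window_sum(timestamps, values, window_size):
    """Sum values over consecutive, non-overlapping blocks of window_size points.

    Each output point carries the last timestamp of its block, which is
    what the worked example in the text shows. Only complete blocks are
    considered here.
    """
    out_timestamps, out_values = [], []
    for start in range(0, len(values) - window_size + 1, window_size):
        end = start + window_size
        out_timestamps.append(timestamps[end - 1])
        out_values.append(sum(values[start:end]))
    return out_timestamps, out_values


timestamps = [0, 1000, 2000, 3000, 4000, 5000]
values = [0, 1, 2, 3, 4, 5]
print(fixed_window_sum(timestamps, values, 3))  # ([2000, 5000], [3, 12])
```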

Window Mode: Either "SLIDING" or "FIXED". The default is "SLIDING" if not specified. Setting window_mode="FIXED" results in the aggregation function being computed over non-overlapping blocks of data, each with aggregation_window_size data points. For example, sum(5, "FIXED") computes the sum over data points 1-5, 6-10, 11-15, and so on. If window_mode="SLIDING", the aggregation function is computed over overlapping blocks of data, each with aggregation_window_size data points. For example, sum(5, "SLIDING") computes the sum over data points 1-5, 2-6, 3-7, and so on.
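
To make the difference between the two modes concrete, the following Python sketch lists which data points fall into each window. The helper `window_blocks` is a hypothetical name used only for illustration; it is not part of Op, and it considers only complete windows.

```python
def window_blocks(n_points, window_size, window_mode="SLIDING"):
    """Return 1-based (first, last) data point positions for each full window."""
    if window_mode == "FIXED":
        step = window_size  # windows do not overlap
    elif window_mode == "SLIDING":
        step = 1            # each window starts one point after the previous one
    else:
        raise ValueError(f"unknown window_mode: {window_mode!r}")
    return [
        (start + 1, start + window_size)
        for start in range(0, n_points - window_size + 1, step)
    ]


print(window_blocks(15, 5, "FIXED"))    # [(1, 5), (6, 10), (11, 15)]
print(window_blocks(7, 5, "SLIDING"))   # [(1, 5), (2, 6), (3, 7)]
```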

Drop Incomplete: Either true or false. The default is false if not specified. Consider what happens if there are 10 data points in a series, and a sum time aggregate is applied to this series with an aggregation window size of 3 and a "FIXED" window mode. The last aggregation bucket will only contain a single data point, because 10/3 yields a remainder of 1. If drop_incomplete=true, then this last partially filled aggregation bucket will be dropped, and the aggregated series will only contain 3 data points (the sums over the first three buckets). If drop_incomplete=false, the aggregated series will contain 4 data points, because the sum will be computed over the first 3 full buckets along with the last partially filled bucket.
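
The 10-point scenario described above can be reproduced with a short Python sketch. The function `fixed_sum` is a hypothetical stand-in for a "FIXED" sum aggregate and only mirrors the bucket arithmetic described in the text.

```python
def fixed_sum(values, window_size, drop_incomplete=False):
    """Sum non-overlapping buckets; optionally drop a trailing partial bucket."""
    sums = []
    for start in range(0, len(values), window_size):
        bucket = values[start:start + window_size]
        if len(bucket) < window_size and drop_incomplete:
            continue  # skip the partially filled last bucket
        sums.append(sum(bucket))
    return sums


series = list(range(1, 11))  # 10 data points: 1..10
print(fixed_sum(series, 3, drop_incomplete=True))   # [6, 15, 24]
print(fixed_sum(series, 3, drop_incomplete=False))  # [6, 15, 24, 10]
```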

Here is example syntax for a sum time aggregate with all 3 parameters specified by name:

  • sum(aggregation_window_size=5, window_mode="FIXED", drop_incomplete=true)

You can specify the parameters without their names, but they must be in the following order: [aggregation_window_size, window_mode, drop_incomplete]. Example syntax with all 3 parameters specified without names:

  • sum(5, "FIXED", true)

count

count computes the number of values across the given aggregation window.

op>
host | count
 RESOURCE_COUNT
 3

irate

irate computes the per-second rate of change between the values at the beginning and end of the aggregation window.

op>
host | cpu_usage | irate(5)
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 11:43:09 |      0.94
    |      |                     |           |            | 2021/07/12 11:43:10 |     -0.38
 92 | HOST | i-0d0926592163c0318 | us-west-2 | us-west-2a | 2021/07/12 11:43:09 |     -0.63
    |      |                     |           |            | 2021/07/12 11:43:10 |      0.19
 93 | HOST | i-0bab1fbd533955203 | us-west-2 | us-west-2c | 2021/07/12 11:43:09 |     -1.38
    |      |                     |           |            | 2021/07/12 11:43:10 |      0.87
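
As a rough illustration of the calculation, the Python sketch below divides the change between the first and last value of each window by the elapsed time. It assumes millisecond timestamps and a window that slides by one point; both are assumptions for the sake of the example, and the function is not Op's implementation.

```python
def irate(timestamps_ms, values, window_size):
    """Per-second rate of change between the first and last point of each window.

    Assumes timestamps are in milliseconds and that windows slide by one point.
    """
    if window_size < 2:
        raise ValueError("window_size must be at least 2 to compute a rate")
    rates = []
    for end in range(window_size - 1, len(values)):
        start = end - window_size + 1
        elapsed_s = (timestamps_ms[end] - timestamps_ms[start]) / 1000.0
        rates.append((values[end] - values[start]) / elapsed_s)
    return rates


ts = [0, 1000, 2000, 3000, 4000, 5000]
vals = [10.0, 12.0, 11.0, 15.0, 14.0, 18.0]
print(irate(ts, vals, 5))  # [1.0, 1.5]
```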

max

max computes the maximum values across the given aggregation window.

op>
host | cpu_usage | max(600)
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 11:48:00 |     20.50
 92 | HOST | i-0d0926592163c0318 | us-west-2 | us-west-2a | 2021/07/12 11:48:00 |     11.50
 93 | HOST | i-0bab1fbd533955203 | us-west-2 | us-west-2c | 2021/07/12 11:48:00 |     94.25

mean

mean computes the mean (average) values across the given aggregation window.

op>
host | cpu_usage | mean(600)
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 11:48:12 |      7.06
 92 | HOST | i-0d0926592163c0318 | us-west-2 | us-west-2a | 2021/07/12 11:48:12 |      7.38
 93 | HOST | i-0bab1fbd533955203 | us-west-2 | us-west-2c | 2021/07/12 11:48:12 |     85.81

min

min computes the minimum values across the given aggregation window.

op>
host | cpu_usage | min(600)
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 11:48:26 |      5.75
 92 | HOST | i-0d0926592163c0318 | us-west-2 | us-west-2a | 2021/07/12 11:48:26 |      6.50
 93 | HOST | i-0bab1fbd533955203 | us-west-2 | us-west-2c | 2021/07/12 11:48:26 |     81.75

p1

p1 computes the 1st percentile of values across the given aggregation window.

op>
host | cpu_usage | p1(600)
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 11:48:36 |      6.81
 92 | HOST | i-0d0926592163c0318 | us-west-2 | us-west-2a | 2021/07/12 11:48:36 |      6.32
 93 | HOST | i-0bab1fbd533955203 | us-west-2 | us-west-2c | 2021/07/12 11:48:36 |     83.26

p10

p10 computes the 10th percentile of values across the given aggregation window.

op>
host | cpu_usage | p10(600)
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 11:48:46 |      8.65
 92 | HOST | i-0d0926592163c0318 | us-west-2 | us-west-2a | 2021/07/12 11:48:46 |      6.62
 93 | HOST | i-0bab1fbd533955203 | us-west-2 | us-west-2c | 2021/07/12 11:48:46 |     81.65

p90

p90 computes the 90th percentile of values across the given aggregation window.

op>
host | cpu_usage | p90(600)
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 11:48:56 |     11.45
 92 | HOST | i-0d0926592163c0318 | us-west-2 | us-west-2a | 2021/07/12 11:48:56 |      8.50
 93 | HOST | i-0bab1fbd533955203 | us-west-2 | us-west-2c | 2021/07/12 11:48:56 |     85.87

p99

p99 computes the 99th percentile of values across the given aggregation window.

op>
host | cpu_usage | p99(600)
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 11:49:05 |     11.19
 92 | HOST | i-0d0926592163c0318 | us-west-2 | us-west-2a | 2021/07/12 11:49:05 |     10.49
 93 | HOST | i-0bab1fbd533955203 | us-west-2 | us-west-2c | 2021/07/12 11:49:05 |     87.75
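
For readers unfamiliar with percentiles, here is a minimal Python sketch using the nearest-rank definition. The text does not say which percentile method Op uses (for example, whether it interpolates between points), so treat this as one possible definition rather than a description of p1, p10, p90, or p99 themselves. The sample window is made up.

```python
import math


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct percent of the points are less than or equal to it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical window of 10 CPU usage readings.
window = [7.0, 5.5, 9.0, 6.0, 8.5, 11.0, 6.5, 10.0, 7.5, 8.0]
print(nearest_rank_percentile(window, 10))  # 5.5
print(nearest_rank_percentile(window, 90))  # 10.0
print(nearest_rank_percentile(window, 99))  # 11.0
```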

stddev

stddev computes the standard deviation of values across the given aggregation window.

op>
host | cpu_usage | stddev(600)
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 11:49:14 |      1.68
 92 | HOST | i-0d0926592163c0318 | us-west-2 | us-west-2a | 2021/07/12 11:49:14 |      2.79
 93 | HOST | i-0bab1fbd533955203 | us-west-2 | us-west-2c | 2021/07/12 11:49:14 |      1.93

sum

sum computes the sum of values across the given aggregation window.

op>
host | cpu_usage | sum(600)
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 11:49:26 |     51.50
 92 | HOST | i-0d0926592163c0318 | us-west-2 | us-west-2a | 2021/07/12 11:49:26 |     29.75
 93 | HOST | i-0bab1fbd533955203 | us-west-2 | us-west-2c | 2021/07/12 11:49:26 |    344.25

Transforms

These functions change the data returned by a Metric query.

ceil

ceil computes the ceiling of each value in the time series.

op>
host | cpu_usage | ceil
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 12:18:20 |     10.00
    |      |                     |           |            | 2021/07/12 12:18:21 |     10.00
    |      |                     |           |            | 2021/07/12 12:18:22 |      6.00
    |      |                     |           |            | 2021/07/12 12:18:23 |     12.00
 92 | HOST | i-0d0926592163c0318 | us-west-2 | us-west-2a | 2021/07/12 12:18:20 |     10.00
    |      |                     |           |            | 2021/07/12 12:18:21 |      7.00
    |      |                     |           |            | 2021/07/12 12:18:22 |     14.00
    |      |                     |           |            | 2021/07/12 12:18:23 |     11.00
 93 | HOST | i-0bab1fbd533955203 | us-west-2 | us-west-2c | 2021/07/12 12:18:20 |     88.00
    |      |                     |           |            | 2021/07/12 12:18:21 |     84.00
    |      |                     |           |            | 2021/07/12 12:18:22 |     87.00
    |      |                     |           |            | 2021/07/12 12:18:23 |     90.00

floor

floor computes the floor of each value in the time series.

op>
host | cpu_usage | floor
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 12:18:33 |      8.00
    |      |                     |           |            | 2021/07/12 12:18:34 |     11.00
    |      |                     |           |            | 2021/07/12 12:18:35 |      7.00
    |      |                     |           |            | 2021/07/12 12:18:36 |      7.00
 92 | HOST | i-0d0926592163c0318 | us-west-2 | us-west-2a | 2021/07/12 12:18:33 |      9.00
    |      |                     |           |            | 2021/07/12 12:18:34 |     11.00
    |      |                     |           |            | 2021/07/12 12:18:35 |      9.00
    |      |                     |           |            | 2021/07/12 12:18:36 |      7.00
 93 | HOST | i-0bab1fbd533955203 | us-west-2 | us-west-2c | 2021/07/12 12:18:33 |     84.00
    |      |                     |           |            | 2021/07/12 12:18:34 |     85.00
    |      |                     |           |            | 2021/07/12 12:18:35 |     88.00
    |      |                     |           |            | 2021/07/12 12:18:36 |     88.00

limit

limit takes a number as an argument and restricts each time series in the result to at most that many data points.

op>
host | cpu_usage | limit(1)
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 12:18:53 |     12.00
 92 | HOST | i-0d0926592163c0318 | us-west-2 | us-west-2a | 2021/07/12 12:18:53 |     11.50
 93 | HOST | i-0bab1fbd533955203 | us-west-2 | us-west-2c | 2021/07/12 12:18:53 |     86.00

lower_bound

lower_bound computes the maximum of the specified argument and each value in the time series, so no value in the result is lower than the argument.

op>
host | cpu_usage | lower_bound(90.0)
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 12:19:02 |     90.00
    |      |                     |           |            | 2021/07/12 12:19:03 |     90.00
    |      |                     |           |            | 2021/07/12 12:19:04 |     90.00
    |      |                     |           |            | 2021/07/12 12:19:05 |     90.00
 92 | HOST | i-0d0926592163c0318 | us-west-2 | us-west-2a | 2021/07/12 12:19:02 |     90.00
    |      |                     |           |            | 2021/07/12 12:19:03 |     90.00
    |      |                     |           |            | 2021/07/12 12:19:04 |     90.00
    |      |                     |           |            | 2021/07/12 12:19:05 |     90.00
 93 | HOST | i-0bab1fbd533955203 | us-west-2 | us-west-2c | 2021/07/12 12:19:02 |     90.00
    |      |                     |           |            | 2021/07/12 12:19:03 |     90.00
    |      |                     |           |            | 2021/07/12 12:19:04 |     90.00
    |      |                     |           |            | 2021/07/12 12:19:05 |     90.00
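
In the sample output above, every value is raised to 90.00, which suggests that lower_bound enforces a floor on the series. A minimal Python sketch of that behavior, inferred from the output rather than taken from Op's implementation:

```python
def lower_bound(values, bound):
    """Raise every value below `bound` up to `bound`; larger values pass through."""
    return [max(bound, v) for v in values]


# Hypothetical readings: two below the bound, one above it.
print(lower_bound([9.5, 86.0, 94.25], 90.0))  # [90.0, 90.0, 94.25]
```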

shift

shift shifts the returned time series by the argument amount.

For example, consider the following series of four CPU usage values of a host Resource.

op>
host | limit=1 | cpu_usage
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 12:22:16 |      5.00
    |      |                     |           |            | 2021/07/12 12:22:17 |     12.50
    |      |                     |           |            | 2021/07/12 12:22:18 |     10.00
    |      |                     |           |            | 2021/07/12 12:22:19 |      7.50

These data points are represented by a timestamp array and each value is mapped to that timestamp, e.g.:

[
  {
    "timestamp": "2021/07/12 12:22:16",
    "values": [5.0]
  },
  {
    "timestamp": "2021/07/12 12:22:17",
    "values": [12.5]
  },
  {
    "timestamp": "2021/07/12 12:22:18",
    "values": [10.0]
  },
  {
    "timestamp": "2021/07/12 12:22:19",
    "values": [7.5]
  }
]

Applying a shift of -2 removes the first two timestamped elements of the array, leaving the latter two elements in the result set.

op>
host | limit=1 | cpu_usage | shift(-2)
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 12:22:18 |     10.00
    |      |                     |           |            | 2021/07/12 12:22:19 |      7.50

Conversely, a shift value of 2 removes the last two elements of the array, leaving the first two elements in the result set.

op>
host | limit=1 | cpu_usage | shift(2)
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 12:22:16 |      5.00
    |      |                     |           |            | 2021/07/12 12:22:17 |     12.50

To better illustrate, consider the following Metric query.

op>
host | limit=1 | cpu_usage | shift(-2) | cpu_usage
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE | CPU_USAGE_2
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 12:22:16 |     10.00 |        5.00
    |      |                     |           |            | 2021/07/12 12:22:17 |      7.25 |       12.50
    |      |                     |           |            | 2021/07/12 12:22:18 |       --- |       10.00
    |      |                     |           |            | 2021/07/12 12:22:19 |       --- |        7.25

Here, we get four CPU_USAGE results, shift that series by -2, and then follow it with another, non-shifted cpu_usage Metric query. The result shows how the last two values were shifted to the first two timestamps, leaving the last two timestamps of the shifted column empty.
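
The side-by-side output above can be modeled in Python as moving values along a fixed timestamp axis. The negative case follows the shift(-2) output shown; the positive case is assumed by symmetry and is not confirmed by the text. `None` stands in for the `---` placeholder.

```python
def shift(values, amount):
    """Shift values along a fixed timestamp axis, padding vacated slots with None.

    A negative amount moves values toward earlier timestamps (as in the
    shift(-2) output above); a positive amount is assumed to move them
    toward later timestamps.
    """
    n = len(values)
    if amount < 0:
        k = min(-amount, n)
        return values[k:] + [None] * k
    k = min(amount, n)
    return [None] * k + values[:n - k]


cpu_usage = [5.0, 12.5, 10.0, 7.5]
print(shift(cpu_usage, -2))  # [10.0, 7.5, None, None]
print(shift(cpu_usage, 2))   # [None, None, 5.0, 12.5]
```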

upper_bound

upper_bound computes the minimum of the specified argument and each value in the time series, so no value in the result is higher than the argument.

op>
host | cpu_usage | upper_bound(10.0)
 ID | TYPE | NAME                | REGION    | AZ         | TIMESTAMPS          | CPU_USAGE
 91 | HOST | i-09a8368311b6eba52 | us-west-2 | us-west-2b | 2021/07/12 12:19:13 |      9.50
    |      |                     |           |            | 2021/07/12 12:19:14 |      6.00
    |      |                     |           |            | 2021/07/12 12:19:15 |      8.00
    |      |                     |           |            | 2021/07/12 12:19:16 |      7.50
 92 | HOST | i-0d0926592163c0318 | us-west-2 | us-west-2a | 2021/07/12 12:19:13 |      9.75
    |      |                     |           |            | 2021/07/12 12:19:14 |      6.00
    |      |                     |           |            | 2021/07/12 12:19:15 |      7.75
    |      |                     |           |            | 2021/07/12 12:19:16 |      9.00
 93 | HOST | i-0bab1fbd533955203 | us-west-2 | us-west-2c | 2021/07/12 12:19:13 |     10.00
    |      |                     |           |            | 2021/07/12 12:19:14 |     10.00
    |      |                     |           |            | 2021/07/12 12:19:15 |     10.00
    |      |                     |           |            | 2021/07/12 12:19:16 |     10.00
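
In the sample output above, values under 10.0 pass through unchanged while the busier host is capped at 10.00, which suggests that upper_bound enforces a ceiling on the series. A minimal Python sketch of that behavior, inferred from the output rather than taken from Op's implementation:

```python
def upper_bound(values, bound):
    """Cap every value above `bound` at `bound`; smaller values pass through."""
    return [min(bound, v) for v in values]


# Hypothetical readings: two below the bound, one above it.
print(upper_bound([9.5, 6.0, 86.0], 10.0))  # [9.5, 6.0, 10.0]
```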

Parameters

Each entry lists the metric query parameter name, its function, an example, and a description of the example.

  • window(left end parameter[, right end parameter]): Window function specifies the range of time (left end to right end) for the Op statement. If the optional right range parameter is not specified, the right end of the time range defaults to now. Example: cpu_usage | window(30s) (gets cpu_usage from 30 seconds ago to now)
  • from, to: limits the metric query from time from to time to. Example: cpu_usage | from=1000 | to=3000 (gets cpu_usage from time 1000 to time 3000)
  • base, offset: If offset is positive, limit the query to cover the interval (base, base + offset) and if offset is negative, limit the query to cover the interval (base + offset, base). Example: cpu_usage | base=3000 | offset=2000 (gets cpu_usage from time 3000 to time 3000 + 2000 = 5000)
  • resolution: sets the resolution of the data queried for; allowed resolutions are 1 sec, 10 seconds, 1 min and 1 hour. Example: cpu_usage | resolution=60 (gets data at one minute resolution)
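
The base and offset rule can be written out as a small Python sketch. The function name `query_interval` is hypothetical, and treating an offset of zero as an empty interval at base is an assumption not covered by the text.

```python
def query_interval(base, offset):
    """Return the (start, end) interval covered by a base/offset pair."""
    if offset >= 0:
        return (base, base + offset)
    return (base + offset, base)


print(query_interval(3000, 2000))   # (3000, 5000)
print(query_interval(3000, -2000))  # (1000, 3000)
```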

Resource Aggregates

These functions aggregate metric query results across resources. For example, to get the sum of cpu_usage across all hosts, you would write host | cpu_usage | r_sum.

Each entry lists the metric query aggregate name, its function, and an example.

  • r_sum: aggregates the metric query across resources by taking the sum of the values across each resource for each timestamp. Example: cpu_usage | r_sum
  • r_count: aggregates the metric query across resources by counting the number of values across each resource for each timestamp. Example: cpu_usage | r_count
  • r_mean: aggregates the metric query across resources by taking the mean of the value for each resource for each timestamp. Example: cpu_usage | r_mean
  • r_max: aggregates the metric query across resources by taking the max of the value for each resource for each timestamp. Example: cpu_usage | r_max
  • r_min: aggregates the metric query across resources by taking the min of the value for each resource for each timestamp. Example: cpu_usage | r_min
  • r_stddev: aggregates the metric query across resources by taking the standard deviation of the value for each resource for each timestamp. Example: cpu_usage | r_stddev
  • r_p1: aggregates the metric query across resources by taking the 1st percentile of the value for each resource for each timestamp. Example: cpu_usage | r_p1
  • r_p10: aggregates the metric query across resources by taking the 10th percentile of the value for each resource for each timestamp. Example: cpu_usage | r_p10
  • r_p90: aggregates the metric query across resources by taking the 90th percentile of the value for each resource for each timestamp. Example: cpu_usage | r_p90
  • r_p99: aggregates the metric query across resources by taking the 99th percentile of the value for each resource for each timestamp. Example: cpu_usage | r_p99
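
Unlike time aggregates, which collapse points along the time axis of a single series, resource aggregates combine the values of several resources at each timestamp. The Python sketch below illustrates that idea with made-up host names and readings; `resource_aggregate` is a hypothetical helper, not an Op function.

```python
from statistics import mean


def resource_aggregate(series_by_resource, aggregate):
    """Combine per-resource series into one series with one value per timestamp."""
    timestamps = sorted(
        {ts for series in series_by_resource.values() for ts in series}
    )
    return {
        ts: aggregate(
            [series[ts] for series in series_by_resource.values() if ts in series]
        )
        for ts in timestamps
    }


# Hypothetical cpu_usage readings keyed by resource, then by timestamp.
cpu_usage = {
    "host-a": {1000: 10.0, 2000: 12.0},
    "host-b": {1000: 20.0, 2000: 8.0},
    "host-c": {1000: 90.0, 2000: 85.0},
}
print(resource_aggregate(cpu_usage, sum))   # like r_sum:   {1000: 120.0, 2000: 105.0}
print(resource_aggregate(cpu_usage, len))   # like r_count: {1000: 3, 2000: 3}
print(resource_aggregate(cpu_usage, mean))  # like r_mean:  {1000: 40.0, 2000: 35.0}
print(resource_aggregate(cpu_usage, max))   # like r_max:   {1000: 90.0, 2000: 85.0}
```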