OPAL Verb of the Month: Filter

By Andrea Longo, May 2, 2022

This month we’re excited to release our first post in the “OPAL Verb of the Month” series, where we take users on an in-depth journey into some of the most useful verbs and functions in the OPAL language.

For our first post, let’s talk about one of the most fundamental OPAL verbs, filter. The filter verb is found in virtually every OPAL script and is often the first verb users master, because it is essential for sorting through volumes of unneeded data when troubleshooting or modeling data.

Note: For those not familiar with OPAL, this post is a great place to start.

What does filter do?

Simply put, filter returns the set of results that match a condition. It is often the first thing an OPAL script does, because narrowing down to the data of interest upfront is more efficient than waiting until the end. Usually, you use its results (a temporary dataset, really) as the input dataset for additional operations.

A filter operation is always accelerable, meaning this verb behaves the same no matter the size of the query time window it’s applied to. (You may also see this referred to as “streamable.”) As long as the other statements in your OPAL script are also accelerable, datasets created from its output can be accelerated, making them faster to query.
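One way to picture why a filter behaves the same for any time window is that its condition looks at one row at a time. The sketch below is Python, not OPAL, and the rows and field names are made up for illustration. It shows that filtering a whole window gives the same result as filtering two smaller windows and concatenating them.

```python
# Hypothetical weather observations; field names are illustrative only.
observations = [
    {"hour": 0, "city": "Toronto", "temp_c": 4.0},
    {"hour": 1, "city": "Clearwater", "temp_c": 24.5},
    {"hour": 2, "city": "Toronto", "temp_c": 5.5},
    {"hour": 3, "city": "Toronto", "temp_c": 6.0},
]


def is_toronto(row):
    """Condition that inspects a single row and keeps no cross-row state."""
    return row["city"] == "Toronto"


def filter_rows(rows, predicate):
    """Keep only the rows for which the predicate is true."""
    return [row for row in rows if predicate(row)]


# Filter the whole window in one pass.
whole_window = filter_rows(observations, is_toronto)

# Filter two smaller windows separately, then concatenate the results.
piecewise = (
    filter_rows(observations[:2], is_toronto)
    + filter_rows(observations[2:], is_toronto)
)

print(whole_window == piecewise)  # True
```

This is only a mental model of the row-at-a-time property; how Observe actually accelerates datasets is not described in this post.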

How do I use it?

The filter verb accepts only one argument: an expression defining what to match. That expression may be one of these types:

  • Boolean
  • Space-delimited string
  • Regular expression

Boolean filter expressions

For the first case, give filter any kind of expression that evaluates to True if it matches the desired field value, or False if it does not.

Example: match all weather observations where the city (a field in the data) equals the string Toronto. Assuming the field is named city, the statement looks like this:

    filter city = "Toronto"

Comparisons with a string literal, a number, or the result of an OPAL function are all fair game.
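To make the three kinds of comparison concrete, here is a rough Python model (not OPAL syntax). Each condition is a function that returns True or False for a row. The sample rows, the field names, and the use of startswith as the "function" are assumptions chosen for illustration.

```python
# Hypothetical weather observations; field names are illustrative only.
observations = [
    {"city": "Toronto", "temp_c": 4.0, "description": "clear sky"},
    {"city": "Clearwater", "temp_c": 24.5, "description": "light rain"},
    {"city": "Tokyo", "temp_c": 18.0, "description": "scattered clouds"},
]


def filter_rows(rows, predicate):
    """Keep only the rows for which the boolean predicate is true."""
    return [row for row in rows if predicate(row)]


# Comparison with a string literal.
in_toronto = filter_rows(observations, lambda r: r["city"] == "Toronto")

# Comparison with a number.
warm = filter_rows(observations, lambda r: r["temp_c"] > 15)

# Comparison using the result of a function call.
starts_with_to = filter_rows(observations, lambda r: r["city"].startswith("To"))

print([r["city"] for r in in_toronto])      # ['Toronto']
print([r["city"] for r in warm])            # ['Clearwater', 'Tokyo']
print([r["city"] for r in starts_with_to])  # ['Toronto', 'Tokyo']
```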

We don’t have space to cover OPAL functions in this post, but see the list of boolean functions in the OPAL docs for details. One useful example, match_regex(), comes up in the regular expressions section below.

Space-delimited strings: full-text search and “searchable text”

Use the <> operator to do case-insensitive full-text matching. This matches text in any “searchable text,” which means all fields of type string. (JSON is special; we’ll get to that in a moment.)

Full-text search terms may be single words or quoted phrases. By default, multiple terms are combined as and conditions: a result matches only if it contains all of them. Use or to match one (or more) of them instead.
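The matching rules described here (case-insensitive, every string field, all terms by default, any term with or) can be modeled in a few lines of Python. This is a sketch of the behavior as described in this post, not OPAL syntax and not Observe's implementation. The helper names and sample rows are hypothetical, and plain substring matching is an assumption.

```python
# Hypothetical weather observations; field names are illustrative only.
observations = [
    {"city": "Toronto", "description": "clear sky", "temp_c": 4.0},
    {"city": "Clearwater", "description": "light rain", "temp_c": 24.5},
    {"city": "Tokyo", "description": "Clear Sky", "temp_c": 18.0},
]


def contains(row, term):
    """Case-insensitive substring match against every string-typed field."""
    term = term.lower()
    return any(
        term in value.lower()
        for value in row.values()
        if isinstance(value, str)  # numbers and other types are not searched
    )


def matches_all(row, terms):
    """Default behavior: every term must appear somewhere in the row."""
    return all(contains(row, term) for term in terms)


def matches_any(row, terms):
    """'or' behavior: at least one term must appear."""
    return any(contains(row, term) for term in terms)


both_words = [r["city"] for r in observations if matches_all(r, ["clear", "sky"])]
phrase = [r["city"] for r in observations if matches_all(r, ["clear sky"])]
either = [r["city"] for r in observations if matches_any(r, ["rain", "toronto"])]

print(both_words)  # ['Toronto', 'Tokyo']
print(phrase)      # ['Toronto', 'Tokyo']
print(either)      # ['Toronto', 'Clearwater']
```

Note that Clearwater contains "clear" in its city name but fails the first search because nothing in that row contains "sky".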

If you know which field contains the value of interest, use the ~ operator to limit full-text search to only that field.

Regular expressions

You can use the OPAL function match_regex() to craft a boolean expression, as described above. But as a bonus, the ~ syntax also supports regular expressions.
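As a rough illustration of matching one field against a regular expression, here is a Python sketch using the standard re module. It is not OPAL syntax, the sample data is made up, and OPAL's regular expression dialect may differ from Python's in its details.

```python
import re

# Hypothetical weather observations; field names are illustrative only.
observations = [
    {"city": "Toronto", "description": "clear sky"},
    {"city": "Clearwater", "description": "light rain"},
    {"city": "Lisbon", "description": "sunny intervals"},
]

# Descriptions that start with "clear" or "sun".
pattern = re.compile(r"^(clear|sun)")


def field_matches(row, field, regex):
    """True when the regex is found in the named field of this row."""
    return regex.search(row[field]) is not None


matching_cities = [
    row["city"]
    for row in observations
    if field_matches(row, "description", pattern)
]

print(matching_cities)  # ['Toronto', 'Lisbon']
```

Because the pattern is applied only to the description field, the city name Clearwater plays no part in the result.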

Fields containing JSON

The contents of JSON fields aren’t “searchable text” for the <> operator. To provide JSON-aware options in the UI, Observe recognizes JSON data as distinct from plain strings, so matching values inside a JSON object is a little different and needs a JSON-aware expression rather than full-text search.
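The difference can be pictured with a small Python model (not OPAL syntax). A full-text search that only looks at string fields never sees inside a JSON object, so the nested value has to be pulled out and compared explicitly. The payload field, its structure, and the helper names are all hypothetical.

```python
import json

# Hypothetical rows: "payload" holds a parsed JSON object, not a plain string.
observations = [
    {"city": "Toronto",
     "payload": json.loads('{"weather": {"description": "clear sky"}}')},
    {"city": "Lisbon",
     "payload": json.loads('{"weather": {"description": "light rain"}}')},
]


def full_text(row, term):
    """Search string-typed fields only; JSON objects are skipped."""
    term = term.lower()
    return any(
        term in value.lower()
        for value in row.values()
        if isinstance(value, str)
    )


def nested(obj, *keys):
    """Walk into a JSON object by key; return None if any key is missing."""
    for key in keys:
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj


# The broad search finds nothing, because "clear" only appears inside the JSON.
broad = [r["city"] for r in observations if full_text(r, "clear")]

# Extracting the nested value and comparing it finds the row.
targeted = [
    r["city"]
    for r in observations
    if nested(r["payload"], "weather", "description") == "clear sky"
]

print(broad)     # []
print(targeted)  # ['Toronto']
```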

And many more…

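As with many languages, there are a variety of ways to get where you want to be. The common match options covered in this post are boolean comparisons, full-text search with <>, field-limited search with ~, regular expressions, and JSON-aware matching.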

For best results, pay attention to how broad your search is: filter <clear> matches the description “clear sky”, but also the city name “Clearwater”. If you are only interested in descriptions, use a method that lets you specify a field name.
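The “clear sky” versus “Clearwater” pitfall is easy to reproduce in a small Python model (not OPAL syntax). The rows and helper names are hypothetical, and case-insensitive substring matching is an assumption based on the behavior described above.

```python
# Hypothetical weather observations; field names are illustrative only.
observations = [
    {"city": "Toronto", "description": "clear sky"},
    {"city": "Clearwater", "description": "light rain"},
]


def search_all_fields(row, term):
    """Broad search: look for the term in every string-typed field."""
    term = term.lower()
    return any(
        term in value.lower()
        for value in row.values()
        if isinstance(value, str)
    )


def search_one_field(row, field, term):
    """Narrow search: look for the term in the named field only."""
    return term.lower() in row[field].lower()


broad = [r["city"] for r in observations if search_all_fields(r, "clear")]
narrow = [
    r["city"] for r in observations
    if search_one_field(r, "description", "clear")
]

print(broad)   # ['Toronto', 'Clearwater']
print(narrow)  # ['Toronto']
```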

Learn more

These examples are meant to be simply a starting place. For many more, see the docs for filter as well as the OPAL Examples page. Also, remember that for every action you take in the UI, OPAL is generated and is visible in the OPAL console — which can be an incredibly useful tool for learning OPAL that matters to you.