Working with JSON in OPAL

By Andrea Longo, November 21, 2022


JSON is a human-readable data interchange format. It has its origins in the JavaScript programming language, as JavaScript Object Notation, but is now one of the most common ways to pass data between services. Whether it contains structured logging generated by an application or an API request payload in a standard text logline, this is valuable information for observability.

If you are accustomed to JSON in plain text files, it appears as a collection of attributes and their respective values serialized as one gigantic string. You could use the extract_regex() verb, but parsing complex text with regular expressions is complicated. Observe knows about JSON, as in the UI you can simply Extract from JSON in a worksheet. But how does this work in OPAL?

Extracting individual values

In Observe, JSON data typically starts as an object (a collection of key/value pairs) in a FIELDS column. It may have been ingested via HTTP POST with a content type of application/json, or another method that uses the Observe HTTP endpoint under the hood.

Raw JSON data in Observe

Let’s extract individual values from that object using dot notation:

These values appear to be entirely normal strings and numbers, right? But if you inspect the columns, they are actually of type any. What is that?

Data column type any in Observe

Type any: an OPAL bag of holding

In Observe, any corresponds to the Snowflake type VARIANT. Columns of type any may contain different types of values; not only strings or numbers but also arrays or other objects. As you extract individual values, you can then cast them to the type needed for your purpose.

This Schrödinger-like data type arrangement means you don’t have to worry about the data type until you need it. In the context of JSON in OPAL, it also means you don’t have to deal with everything as a string parsed with regular expressions. 

The following example creates a metric by converting the value to type float64:

JSON mixed with other data

Sometimes the JSON you want is embedded in other text, such as a request payload logged to a file. If what you need from it is small and well-defined, one of the OPAL string functions may do the job. But for anything more complex, it’s better to extract and parse the JSON.

To get at it, first, pull out the JSON portion with extract_regex() and then convert the resulting string with parse_json(). The regular expression needed is fairly simple, and afterward, you can get to the desired value with dot notation:

This version parses for text containing data as a key/value pair:

Wrapping Up

In summation, here are a few things to keep in mind when working with JSON in Observe.

  • JSON objects, no matter the number of attributes they may contain, are ingested as type object in Observe.
  • A value extracted from a JSON object is initially the type any. After the fact, you can cast it to the needed type with the appropriate OPAL function.
  • Avoid working with JSON as a string, except as an intermediate step when converting values to more readable forms.

Learn More

Hopefully, this guide has bolstered your confidence when it comes to working with JSON in Observe, but as always we can’t cover everything in a single post. For everything else, head on over to our docs to discover more verbs, and dozens of examples of how to parse, extract, and model JSON data.

If you’d like to know more about OPAL in general, or are stuck and need a bit of help, visit our Slack support channel (#opal) and one of our engineers will be happy to help!