How Observe Uses Snowflake to Deliver the Observability Cloud [Part 2]
Part 2: Data Modeling and Transformation
The Observability Cloud takes input from various data sources via open endpoints and agents and stores it in the Data Lake to unify telemetry. From there it builds an interactive, extensible, and connected map of your data called the Data Graph that lets you explore and monitor the digital connections of your entire organization.
This series will look at how Snowflake is key to The Observability Cloud’s unique architecture and allows it to deliver an order-of-magnitude improvement in both the speed of troubleshooting and economics of deployment.
This 3-part blog series includes:
- In Part 1, we reviewed how Observe accepts and processes customer data
- In this post, we look at how we shape that data into a useful form and “link” it to other data
- In Part 3, we will review how Observe manages resources with a focus on resilience, quality, and cost
Data Modeling and Transformation with Observe and Snowflake
Once data is in Observe, it can be turned into a Data Graph, which consists of Datasets curated from the Data Lake to make it easier to navigate and faster to query. Datasets represent “things” that users want to ask questions about. They can be business-related, such as customers and shopping carts, or infrastructure-related, such as pods, containers, and S3 buckets. This is covered by the components on the left side of the diagram below (Front End, API Server, and Transformer):
Data Transformation with OPAL Dataflow Language
To model data sent to Observe, we created a language called “Observe Programming and Analytics Language”, or OPAL for short. OPAL is a declarative dataflow programming language inspired by Bash and Splunk DSLs and is designed to be much easier to write than SQL.
An OPAL query is a pipeline of verbs such as filter (used to filter Datasets), make_col (used to create a new column), and align (used to aggregate nearby metrics into a time grid). The data flows from one verb to the next — much as you’d find in a typical scripting language like Bash.
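To give a feel for the dataflow style, here is a conceptual sketch in Python, not OPAL itself; the verb names, row data, and column names are hypothetical analogues:

```python
# Conceptual sketch of a verb pipeline: each "verb" takes a list of
# row dicts and returns a new list, so transformations chain together
# much like commands piped in Bash.

def filter_rows(rows, pred):
    """Analogue of a filtering verb: keep rows matching a predicate."""
    return [r for r in rows if pred(r)]

def make_col(rows, name, fn):
    """Analogue of a column-creating verb: add a derived column."""
    return [{**r, name: fn(r)} for r in rows]

events = [
    {"pod": "web-1", "restarts": 3},
    {"pod": "web-2", "restarts": 0},
]

# Data flows from one verb to the next.
flagged = make_col(
    filter_rows(events, lambda r: r["restarts"] > 0),
    "unstable", lambda r: r["restarts"] >= 3,
)
# flagged == [{"pod": "web-1", "restarts": 3, "unstable": True}]
```

Each stage consumes the previous stage's output, which is the same shape of composition an OPAL pipeline expresses declaratively.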
Here is an example of an OPAL snippet at the bottom of the Worksheet:
Snowflake speaks SQL, not OPAL, so Observe compiles OPAL into SQL.
Each OPAL verb has precise temporal-relational semantics. That pipeline of verbs is translated to plain SQL through a series of intermediate representations, where each step is subjected to a variety of algebraic rewrites and optimizations. The result is a highly optimized and compacted SQL query that uses standard SQL statements and features: Common Table Expressions; DML statements like DELETE; aggregate functions like LISTAGG; and window functions.
Modeling Observations into Curated Temporal Datasets
To better explain how modeling works in Observe, let’s consider a user’s session on an e-commerce website. It has a lifecycle, with distinct start and end times, as well as events that occur during the session, such as reviewing the available products, adding them to a shopping cart, and checking out. Each of those events within the user’s session has a timestamp, and many of them modify the state of the session. To observe (pun intended) the state of a user’s session, it is important to know both the lifetime of the session and the state of the session over time.
Observe treats time as a first-class citizen. Combining concepts familiar from classic relational databases (tables, primary and foreign keys, relationships) with the temporal aspect of incoming data (timestamps), Observe’s modeling process converts raw observations into a temporal relational data model that everything else is built from.
The modeling process begins by looking at the data and figuring out the information that needs to be extracted from it to make useful business decisions. Observe makes it easy to both explore incoming raw observations for patterns and to model the resulting “Datasets” containing useful information.
Datasets are much like tables in a database as they consist of rows and columns of specific types. They can be linked, joined, and aggregated to derive insights. Datasets automatically grow as Observe collects new data.
Datasets can also be used as input to other Datasets. In the example below, the raw OpenTelemetry observations are sent from a Kubernetes-hosted application. The Observe OpenTelemetry App provides the definition and maintenance of the Span Dataset, which is then used as a source to further define the Operation and Service Datasets.
The result of the modeling exercise is a graph of Datasets, which we call the “Data Graph”, through which data flows and is successively transformed or, simply put, a data pipeline. The updates flowing through the data pipeline are governed by dynamic task scheduling based on data volume, access patterns, and cost constraints.
Observe provides over a dozen Data Apps that provide pre-built Datasets and data pipelines for well-known sources of data (such as AWS, GCP, Kubernetes, GitHub, MySQL, and so on). Those Datasets can then be linked easily to Datasets from other applications, services, and infrastructure to help you easily monitor and troubleshoot your entire environment.
Types and Purposes of Datasets
Observe defines four different types of Datasets:
- Event Datasets: A time-stamped occurrence of something at a distinct point in time (e.g. a user login event at 09:18, or a user adding a product to their shopping cart at 17:03)
- Resource Datasets: An entity with a tracked temporal state (e.g. servers, user sessions, and even more abstract concepts like customers)
- Interval Datasets: A time span of something with clear begin and end times (e.g. a user session between 09:00 and 17:00)
- Table Datasets: Non-temporal data (e.g. list of states or a list of car manufacturers)
Event Datasets represent an occurrence of something at a particular point in time. All events in an Event Dataset share a common schema — including a list of strongly typed columns. Columns can have semi-structured data types (like JSON) that provide native support for object and array types.
Here is an example of an Event Dataset for Kubernetes Events (Event Dataset icons are color-coded purple in the UI):
Resource Datasets represent an entity or a thing (e.g. a Kubernetes pod) with tracked changes of all properties of that thing.
Each Resource has a primary key: a set of one or more fields (e.g. cluster_uid, namespace, and pod_name) that uniquely identifies each Resource instance and does not change over the lifetime of the Resource.
Just like with any respectable database, changing the primary key means it becomes a different Resource. If your Kubernetes cluster kills a pod and starts a new one, there will be a new Resource created for that pod in addition to the old one — now marked as dead.
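As a rough illustration of identity-by-key, here is a small Python sketch; the key fields mirror the ones mentioned above, and the event values are made up:

```python
# Sketch: grouping events by a primary key of (cluster_uid, namespace,
# pod_name). A replacement pod carries a different pod_name, so it gets
# a new key and therefore a new Resource instance.

def resource_key(event):
    return (event["cluster_uid"], event["namespace"], event["pod_name"])

events = [
    {"cluster_uid": "c1", "namespace": "prod", "pod_name": "web-abc", "status": "Running"},
    {"cluster_uid": "c1", "namespace": "prod", "pod_name": "web-abc", "status": "Terminated"},
    {"cluster_uid": "c1", "namespace": "prod", "pod_name": "web-xyz", "status": "Running"},
]

resources = {}
for ev in events:
    resources.setdefault(resource_key(ev), []).append(ev["status"])

# Two distinct keys -> two Resources: the old pod (now dead)
# and its replacement.
```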
Each Resource instance lives over a span of time, and some columns have values that may vary across that time — like pod status (e.g. Initializing, Running, etc.). Observe tracks those field changes and keeps a full history of them.
Resource Datasets are “temporal tables” that allow users to audit the past state of the system, perform temporal joins, and drill down to the original events that gave rise to the state changes being investigated.
Here is an example of a Resource Dataset for Kubernetes Pod, together with metrics associated with that resource (Resource Dataset icons are color-coded blue in the UI):
Interval Datasets are similar to Event Datasets, except that each row is valid for an interval of time instead of a point in time: each row has a valid_from column (start time) and a valid_to column (end time).
Interval datasets often arise from aggregation queries (for example, give me the number of packets dropped per one-minute bucket).
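A minimal Python sketch of that kind of aggregation, using hypothetical packet-drop events (the function and column names here are illustrative, not Observe's):

```python
# Sketch: deriving an Interval Dataset by bucketing point-in-time
# events into one-minute windows.

def to_minute_intervals(events):
    # events: list of (epoch_seconds, dropped_count) tuples
    buckets = {}
    for ts, dropped in events:
        start = ts - ts % 60              # align to the minute boundary
        buckets[start] = buckets.get(start, 0) + dropped
    # Each output row carries valid_from / valid_to for its interval.
    return [
        {"valid_from": s, "valid_to": s + 60, "dropped": n}
        for s, n in sorted(buckets.items())
    ]

# Three drop events collapse into two one-minute interval rows.
rows = to_minute_intervals([(0, 1), (30, 2), (65, 5)])
```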
Here is an example of an Interval Dataset for OpenTelemetry Span:
Table Datasets are simply tables without a time dimension. They behave exactly like Interval Datasets in which all intervals are infinite. They are useful for creating maps of relatively static data that can then be referenced from other Datasets.
Implementation of Datasets in Snowflake Tables
So how exactly are Observe Datasets implemented in Snowflake?
Event Datasets – Point Tables
The tables containing Event Datasets are “point” tables where each row is a fact (event or measurement) at some point in time.
In data warehousing concepts, point tables are often “fact” tables.
Resource and Interval Datasets – Interval Tables
The tables containing Resource and Interval Datasets are “interval” tables, where a row is a fact that was true for some interval of time.
Here is an example of a point table on the left containing an Event Dataset (timestamp marks an event, pod_id, pod_name, and pod_status are fields). This table is used to produce an interval table on the right containing the Resource Dataset (where pod_id acts as a key and valid_from and valid_to define the validity of this temporal record).
The interval tables are the key mechanism that Observe uses to turn event streams into a history of states of things. They are often much smaller than the original point tables because input events may have no aggregate effect on the final interval table. For example, if your Kubernetes pod’s status hasn’t changed from when it first posted a “running” status until now, you won’t see any updates.
The interval table can efficiently answer “What is the state of resource R at time X?” queries that are part of any troubleshooting or discovery session.
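The two ideas, collapsing a point table into an interval table and then answering point-in-time questions against it, can be sketched in Python. This is a conceptual model under simplifying assumptions (in-memory rows, a single resource, events already in timestamp order), not Observe's implementation:

```python
# Sketch: turn a point table of status events into an interval table,
# then answer "what was the state at time X?".

END_OF_TIME = float("inf")   # stands in for "still valid"

def to_intervals(events):
    # events: list of (timestamp, status), sorted by timestamp
    intervals = []
    for ts, status in events:
        if intervals and intervals[-1]["status"] == status:
            continue                         # no state change: no new row
        if intervals:
            intervals[-1]["valid_to"] = ts   # close the previous interval
        intervals.append({"valid_from": ts, "valid_to": END_OF_TIME,
                          "status": status})
    return intervals

def state_at(intervals, x):
    """Answer 'what is the state of this resource at time x?'"""
    for row in intervals:
        if row["valid_from"] <= x < row["valid_to"]:
            return row["status"]
    return None

# Four point-table events, but the repeated "Running" status adds
# no new row, so the interval table has only three rows.
pod_events = [(100, "Initializing"), (160, "Running"),
              (220, "Running"), (400, "Terminated")]
history = to_intervals(pod_events)
```

Note how the repeated "Running" event is absorbed: this is why interval tables are often much smaller than the point tables they are derived from.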
Using data warehousing concepts, interval tables for Resource Datasets can be thought of as “Slowly Changing Dimension Type-2 (i.e. add a new row)” tables.
Making Datasets Faster to Query with Acceleration
All Datasets, whether created by Observe Apps or customers themselves, can be “accelerated.” This simply means that the Datasets are materialized from the raw data and then managed as tables in Snowflake for the duration of their existence.
Observe does not accelerate the entirety of a table; it strives to accelerate only the data that is likely to be queried. Recent data is accelerated because it is much more likely to be queried than old data.
By default, a newly created Dataset is accelerated for the previous 8 days (if data is available), but that can be adjusted to an organization’s specific needs. After a Dataset is accelerated, all incoming new data is also accelerated, with the accelerated window defaulting to 2 months.
It’s also possible to accelerate a window in the more distant past, like a 12-hour-long incident that occurred 6 months ago, as accelerated periods don’t have to be contiguous.
Cost-Effective Acceleration Scheduling
Accelerating data too aggressively, or accelerating data that is never queried, is wasteful. To avoid this, Observe adapts the materialization schedule to access patterns, data volume, freshness goals, and cost constraints. For example, if an accelerated Dataset isn’t used, its acceleration is stopped, to be resumed should it be accessed again later.
As a result, Observe’s compute and resource usage is generally much lower, and freshness is better, than if users had to define cron jobs manually and chain together complex schedules for various tables.
Maintenance of Datasets with Snowflake
Resource Datasets stored in interval tables can be difficult to materialize and accelerate incrementally. New input rows from the Event Dataset point tables can arrive out of order, causing updates to old interval rows, or even requiring that a single record be broken up into multiple records.
To counteract this, Observe periodically takes the next batch of input data and processes it via a Snowflake MERGE statement to incrementally update the database table(s) that represent the Dataset. The MERGE statement conditionally inserts new rows or updates existing rows representing Resource records. These statements execute as atomic database transactions, so concurrent queries always see Datasets in a consistent state.
Here is an example illustrating an update of a Resource Dataset table with new Event data:
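The actual update runs through Snowflake's MERGE, but the core interval-splitting idea can be sketched in Python. The timestamps, statuses, and function name below are hypothetical, and this simplified version assumes the late event's status holds until the end of the interval it lands in:

```python
# Sketch of the incremental-update idea (not Snowflake's MERGE itself):
# a late-arriving status event splits the interval row it falls inside
# into two rows, one per state.

def apply_event(intervals, ts, status):
    out = []
    for row in intervals:
        if row["valid_from"] <= ts < row["valid_to"] and row["status"] != status:
            # Split: old state up to ts, new state from ts onward.
            out.append({**row, "valid_to": ts})
            out.append({"valid_from": ts, "valid_to": row["valid_to"],
                        "status": status})
        else:
            out.append(row)           # row untouched by this event
    return out

history = [{"valid_from": 100, "valid_to": 400, "status": "Running"}]
# A late event reports the pod went "NotReady" at t=250: the single
# interval row becomes two.
history = apply_event(history, 250, "NotReady")
```

In the real system such rewrites must happen incrementally and transactionally, which is exactly what makes out-of-order input the hard case called out above.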
Snowflake stores table rows in micro-partitions, where each micro-partition contains between 50 MB and 500 MB of uncompressed data in Snowflake’s proprietary columnar format. Each micro-partition is optimized to be approximately 16MB in its compressed state. Micro-partitions are stored in Amazon S3 as immutable files, so an update of even a single row in a micro-partition requires a rewrite of the entire micro-partition file with those updates.
If many rows are updated during a MERGE operation and they span many micro-partitions of the underlying interval tables, the process of rewriting the micro-partitions takes longer, consumes more compute resources, and is, therefore, more costly.
To maintain consistent query performance and optimize query costs, all Dataset tables managed by Observe are clustered — i.e. sorted and ordered along dimensions useful for subsequent queries. Most tables are clustered by time, but we use more sophisticated clustering methods for metrics Datasets and for auxiliary tables used for full-text search. Observe uses both Snowflake table clustering (CLUSTER BY) and manual clustering (ORDER BY clauses) to account for micro-partitions “churned” by ongoing MERGE operations.
Storing Raw and Accelerated Data Efficiently and Inexpensively
Observe benefits from Snowflake’s efficient, compressed columnar data storage. It is not uncommon to see 10x compression levels from the original incoming data size once it comes to rest in Observe. Accelerated Datasets that define metrics using the interface verb leverage internal structures and clustering methods that realize even higher compression ratios.
Because of the architecture and economics of storing data in Snowflake, Observe places no limit on how much data can be stored and has settled on an unprecedented default of 13-month raw data retention regardless of incoming data size. Customers can also adjust their retention period on a per-Worksheet and per-Dataset basis, ranging from unlimited down to a short period, depending on their business needs.
In Part 3, we will review how Observe manages resources with a focus on resilience, quality, and cost.