What Is Multicloud? (And Why We Need Multicloud Observability)

What is multicloud observability? Is it an exciting new trend developing in the tech world, or just the latest buzzword to be beaten to death by the marketing departments of observability and monitoring vendors? As you likely guessed, the answer lies somewhere in the middle. Like many popular buzzwords in tech, it originates from an actual need felt by organizations, but it can be used to mean different things in various contexts. So if an organization uses multiple clouds, does it need multicloud observability? How is it achieved, and does it differ greatly from “regular” observability?

In a nutshell, as public clouds have become mainstream, the use of services from multiple cloud providers in any given organization has become quite common. Thus the term “multicloud” was born. Access to innovation from leading cloud providers is a good thing; however, multicloud environments are a double-edged sword. There are many reasons why organizations end up with multiple clouds in their environments, and there are also many challenges to operating them simultaneously. Observability is one method that can help smooth out some of those operational challenges and make day-to-day life a lot less chaotic. Enter the term “multicloud observability”.

If we were to explain multicloud observability as succinctly as possible, we’d say it’s this: making services that span multiple public clouds observable to reduce the time needed to detect and resolve issues with them. However, the topic is a bit more complicated than that, and knowing what multicloud observability is and how to achieve it are two separate things. In this ebook we’ll cover:

  • What multicloud is, and how it occurs
  • Why multicloud needs observability, and the specific considerations involved
  • How you can benefit from multicloud observability

What Is Multicloud?

Before we can talk about multicloud observability, we first have to talk about multicloud. Here we’ll define multicloud, look at the catalysts that lead to its “adoption”, and see how it can create challenges for organizations that want to achieve observability.

In the Beginning… There Was Hybrid Cloud

Before there was multicloud, there was hybrid cloud. Hybrid cloud commonly referred to the use of infrastructure residing both on-premises in data centers and in a public cloud. There was a time when hybrid cloud was somewhat controversial, as companies were not yet accustomed to surrendering significant ownership of infrastructure to a third party. It has since become the norm at many large enterprises: some applications are migrated to the cloud or new apps are built there, while legacy or security-sensitive ones remain on-premises. For smaller organizations, it is not unusual to embrace public cloud resources from the outset and have a minimal (if any) on-premises footprint.

On-Premises vs Hybrid vs Multicloud

Naturally, things have continued to evolve as cloud use has become more accepted, and there are many flavors of cloud to choose from: on-premises and colocated private clouds, public cloud providers, and even SaaS services running in public clouds. Using some combination of services running in these locations effectively constitutes multicloud adoption. Generally speaking, “multicloud” refers to the use of multiple public clouds within one organization. However, much like a square is also a rectangle, a hybrid cloud might be considered multicloud if what’s on-premises is “cloud-like”, which has become even more viable with the advent of on-premises and hybrid services from public clouds such as AWS Outposts or GCP Anthos. Suffice it to say that multicloud is an expansive term whose meaning is not always singular, but it is typically associated with the use of multiple public clouds, and that’s how we’ll use it for the duration of this ebook.

Reality Check

So how real is multicloud? According to a blog published by analyst firm ESG, as of 2022 their research indicated that 86% of current public cloud infrastructure users were implementing a multicloud strategy, with two to four public clouds being the average. Regardless of the precise number, it’s safe to say that using multiple public clouds is quite common and multicloud is a very real thing.

When we talk about “the cloud”, users (in North America at least) tend to think of the three major public cloud providers: Amazon Web Services, Microsoft Azure, and Google Cloud Platform. However, there are many others, such as Alibaba, IBM, Oracle, DigitalOcean, and Rackspace.

Secondary and tertiary service providers (anything that’s not the primary cloud provider) can and do make up a significant portion of some companies’ cloud spend. In Stack Overflow’s 2022 Developer Survey, 55% of professional developers said they used AWS extensively in the past year, followed by 30% for Microsoft Azure and 26% for Google Cloud Platform. Of the 22,357 professional developer respondents who cited AWS usage, 26% said they wanted to work with GCP in the coming year and 20% said they wanted to work with Azure. It’s easy to see how secondary providers can establish a foothold in an organization even if another public cloud is already the dominant supplier. However, this does not always happen as the result of a conscious decision to implement a multicloud or hybrid cloud strategy.

How Multicloud Happens

We say multicloud “happens” because it is not always an agreed-upon strategic or intentional choice, and often organizations find themselves using multiple clouds for various reasons. Here are a few:

  • Reliability and redundancy – If you’re concerned with reliability, you might keep data in multiple regions of the same provider. The same line of thinking can lead organizations to keep applications or data in multiple clouds, further insulating themselves from the effects of an outage they have no control over.
  • Access to proprietary services – Different cloud providers offer a swath of similar services; for example, you’ll find cheap object storage from any Infrastructure-as-a-Service vendor. However, providers also differentiate with unique technologies (one example being TPUs, which are only available in GCP). The need to access a specific service can drive the adoption of a new platform.
  • Shadow IT – The term shadow IT has negative connotations, but users signing up for services of their own accord, without approval, is often just part of doing due diligence on the options available to meet a particular need. Whatever the reason, not everyone always knows when a new cloud has been brought online within the organization.
  • Provider-specific skill sets – Core services may have fundamental similarities between clouds, but each provider’s tooling can require its own skill set. Personnel may be more experienced with a certain provider, and that experience can influence the decision to adopt, or switch to, a new cloud provider.
  • Latency and data locality – As much as we try, we can’t cheat the laws of physics. If a service is latency-sensitive and your current provider’s closest region isn’t close enough, you might have to look at alternatives. For example, if your business is in Perth, Australia, the closest AWS region is on the other side of the country in Sydney. That might seem like an extreme example, but it demonstrates the real-world impact distance can have on operations. Location also matters for data locality regulations: when data must be kept in a certain physical region, that can sometimes require using other service providers.

It’s easy to see how one can end up with a single service in use from another cloud provider, and how that can quickly scale in scope. Usage can spread from a single user to the rest of their team and beyond, and one service can quickly grow into multiple services. As described above, some of the reasons multicloud happens are positive ones: the organization derives real benefit from it. However, this kind of dynamic can result in complex operational, security, and compliance challenges that can be perilous if left unaddressed.

Why We Need Multicloud Observability

So, for better or worse, your organization has implemented multicloud. How does that tie into observability? (If you’re new to observability, you can read a quick primer here.) Observability can help your organization address numerous operational challenges and minimize downtime. However, many organizations are still in the early stages of their observability journey, relying on legacy tools that provide an incomplete picture of their environment. To make matters worse, multicloud presents new challenges that can make maintaining visibility and troubleshooting even harder. That makes achieving observability all the more important.

More Clouds, More Data, More Problems?

If your IT team is using multiple clouds, then you probably want observability that spans all of them to provide actionable insights, not just highlight performance issues in individual clouds. This is easier said than done, and many organizations likely think they have observability when they really only have monitoring. Our 2022 State of Observability report shows that the complexity of the environment is the number one hurdle to achieving observability. As environments grow more complex, the need for observability becomes more acute, but so does the difficulty in attaining it.

[Figure: Complexity as the primary observability challenge]

Observability is all about the data. When it’s time to analyze it, the number of sources or where the data came from shouldn’t matter; ideally, more data is a good thing for troubleshooting. However, getting things set up requires more consideration if you have a highly distributed environment. Thanks to the growing adoption of microservices, Kubernetes, serverless, and other cloud-native technologies, complex distributed applications are on the rise. The more distributed and complex your environment becomes, the harder it can be to maintain visibility into all parts of it. In our report, manual instrumentation of code for observability was the fastest-growing challenge, almost doubling from 17% to 30% year over year.
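To make that instrumentation burden concrete, here is a minimal sketch of manual tracing with the OpenTelemetry Python SDK. It is an illustration under assumptions, not a prescription: the service name, attributes, and span names are placeholders, and the console exporter stands in for a real backend.

    # Minimal manual instrumentation sketch (pip install opentelemetry-sdk).
    from opentelemetry import trace
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

    # Tag telemetry with service and cloud metadata so it remains
    # distinguishable once data from several clouds is aggregated.
    provider = TracerProvider(
        resource=Resource.create({"service.name": "checkout", "cloud.provider": "aws"})
    )
    provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(provider)

    tracer = trace.get_tracer("checkout.instrumentation")

    def charge_card(order_id: str) -> None:
        # Every unit of work you want to observe needs a span like this.
        # Multiplied across hundreds of services, this is the manual
        # effort survey respondents flagged as a growing challenge.
        with tracer.start_as_current_span("charge-card") as span:
            span.set_attribute("order.id", order_id)
            # ... business logic goes here ...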

With great complexity, you also run the risk of operational silos, wherein few people in your organization understand the entire environment and most are focused only on the parts relevant to their day-to-day work. This inevitably causes problems when it’s time to troubleshoot. Observability helps you head off these problems.

Don’t Let Your Clouds Be Silos

There are services unique to each public cloud, but many services are more similar than they are different. You may be using Amazon S3 and Azure Blob Storage; although they are distinct, proprietary services, at the end of the day they are both object storage. You may be looking at similar telemetry data, but because those services live in different public clouds, separate from one another, they are silos.

It is likely you are already collecting and aggregating data with cloud-native tooling, for example CloudWatch in AWS, Azure Monitor, or Google Cloud’s Operations Suite (formerly Stackdriver). But if your workflow involves logging into the native monitoring suite of each cloud individually every time something breaks, it is time-consuming and delivers diminished value, and it requires you to maintain working knowledge of multiple tools.
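As a rough sketch of what that juggling looks like from a script, consider pulling recent errors from two clouds’ native tools. This assumes the boto3, azure-identity, and azure-monitor-query Python packages; the log group name, workspace ID, and queries are all placeholders.

    # Each cloud brings its own client, auth model, and query language.
    from datetime import timedelta

    import boto3
    from azure.identity import DefaultAzureCredential
    from azure.monitor.query import LogsQueryClient

    # AWS CloudWatch Logs: filter-pattern syntax via a boto3 client.
    cloudwatch = boto3.client("logs", region_name="us-east-1")
    aws_events = cloudwatch.filter_log_events(
        logGroupName="/app/checkout",  # placeholder log group
        filterPattern="ERROR",
    )["events"]

    # Azure Monitor: KQL syntax, a different client and credential chain.
    azure_client = LogsQueryClient(DefaultAzureCredential())
    azure_result = azure_client.query_workspace(
        workspace_id="<workspace-id>",  # placeholder workspace
        query="AppTraces | where SeverityLevel >= 3",
        timespan=timedelta(hours=1),
    )
    # A third cloud would add yet another client and another query idiom.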

If a cloud-specific monitoring service is the final destination for that data and the place you do your analysis, you’re also losing out on the value of putting that data into context with the data from your other clouds. To complicate matters further, even if you are forwarding data to a third-party observability product, it might not be effectively correlated with the rest of your data. If your organization’s current tooling forces you to spend time tagging and indexing data before you can get much use out of it, that’s time taken away from more valuable tasks; it’s time your SREs, DevOps engineers, and developers don’t have to waste. It would be bad enough if time were the only factor, but if your organization is ever audited for compliance or security reasons, you don’t want to be searching through multiple tools for answers and sifting through data without context.

Most vendors label their products a “single pane of glass”, and yet those same products tend to have inherent silos in their architecture. Your data may be split into different data stores based on type, complicating the correlation process. Or you may find that your “observability” tools are largely just monitoring tools, and you can’t use the dashboards they provide to dig into the data coming from your different cloud services. The bottom line: if you lack visibility across your environment, collecting data is a challenge; if you have visibility but the data is siloed and without proper context, using that data is a challenge. A cloud-provider-agnostic strategy will help you maintain visibility across your environment, and open-source data collectors such as Fluent Bit and OpenTelemetry have made it even easier to take a vendor-agnostic approach to getting data from the sources you need to the observability service of your choice.
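To sketch what that vendor-agnostic approach can look like, the earlier OpenTelemetry example only needs its exporter swapped to send data over OTLP to whichever observability service you choose. This assumes the opentelemetry-exporter-otlp package; the endpoint below is a placeholder for your chosen backend.

    # Swap the console exporter for an OTLP exporter; the instrumentation
    # code itself stays untouched.
    from opentelemetry import trace
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor
    from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

    provider = TracerProvider()
    provider.add_span_processor(
        BatchSpanProcessor(
            # Change the endpoint, not the instrumentation, to change backends.
            OTLPSpanExporter(endpoint="https://collector.example.com:4317")
        )
    )
    trace.set_tracer_provider(provider)

Fluent Bit plays a similar role at the log-shipping layer: the collectors stay the same while the output destination changes.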

Reaping the Benefits of Multicloud Observability

Multicloud Data in Context

Observe’s approach is The Observability Cloud: all your data in one place, without a mishmash of backends and silos for different data types, and without paying for multiple tools to accommodate various use cases. The Observability Cloud comprises the Data Lake, Data Graph, and Data Apps.

Observe Apps bundle relevant integrations for all three major cloud providers – AWS, GCP, and Azure. For example, the AWS App streamlines the process of collecting telemetry from various AWS services while making them more discoverable. With this app, you can easily monitor and troubleshoot many of your favorite AWS services in Observe, which means less time troubleshooting and more time building applications and services for your customers. Similarly, we have Azure and GCP apps, and the range of services covered by those apps is constantly expanding.

Once that telemetry is on its way, we unify all data within the Data Lake in our Observability Cloud. Data is kept in one place, and context is understood and mapped via the Data Graph. The Data Graph shows you all your connected Datasets, the things you care about, ranging from S3 buckets to Kubernetes clusters on EKS. The combination of having all the data in the Data Lake and searchable via the Data Graph allows us to provide schema-on-demand: when it comes to searching and filtering your vast amount of data, your single pane of glass can give you the exact viewpoint you need, when you need it.

[Figure: The Data Graph for multicloud]

To quote one of our customers: “It’s the only place that aggregates logs across a lot of data sources, so when I don’t know where to look, I can start here.” Another customer, Linedata, has gotten value from being able to see across their many user accounts within AWS to eliminate blind spots. The problem solved in that scenario only grows once you introduce more clouds (and more accounts within those clouds) to an environment. Without complete multicloud observability, you can end up with security blind spots or hit dead ends as you try to troubleshoot.

Economics That Work at Multicloud Scale

As more apps and their dependencies spread across more clouds, it all adds up to a lot of telemetry data. Even if you are able to make it all observable and correlate that data so it is easy to navigate and understand in times of crisis, there is still the matter of scale, and ultimately cost. A legacy solution would typically charge based on the volume of telemetry data ingested and indexed, per user or seat, or on the number of hosts monitored (or some combination of those models). In that situation, more data equals more cost, regardless of the value you get out of that data. That’s an expensive proposition in the context of multicloud growth. Some solutions offer band-aids that limit data ingest volume or archive data to long-term storage, but these simply push more complexity onto users and, by leading them to discard potentially relevant data, make it harder to observe their applications and infrastructure.

Observe has taken a different approach, building on a modern, cloud-native architecture that enables a usage-based pricing model. All data is ingested into an Amazon S3-based Data Lake, compressed 10x, and stored for 13 months. Observe then accelerates frequently accessed data into the Data Graph. Queries, whether to accelerate data or to explore the Data Graph, are executed efficiently via Observe’s multi-tenant implementation on top of the Snowflake Data Cloud; all queries across all customers, in all companies, share the same Snowflake infrastructure. This allows us to separate compute and storage and bill customers based on the compute they consume to run queries and accelerate data.

Concerns regarding runaway usage or rogue users are alleviated by both passive and active cost controls. Users are prompted to confirm expensive queries, and admins can set credit limits to manage usage against an annual budget. Because our usage-based pricing has cost controls, customers can ingest the data they need without worrying in advance whether it’s the “right data”, or worrying that data volume alone will blow their budget. That’s economics at multicloud scale.

If you like the idea of being able to unify your multicloud data to expedite your queries, without having to compromise on data volume to meet your budget, then click here to request access.