Scaling Metrics Maturity in a Cloud-Native World

What?

Technology-focused companies across all industries rely on cloud infrastructure and microservices to deliver value to customers and, by extension, profits for the business.

The benefits of a performant infrastructure must be readily apparent, and so must its degradations. Measuring and attributing this performance in a complex software environment is called Observability (o11y).

As a result, Observability has become a first-class engineering citizen in such organizations.

But...

As you climb the ladder of reliability, the number of metrics increases and, correspondingly, so does the breadth of questions asked of them. Modern time series systems do not grow along a single axis of Cardinality, Coverage, or Retention alone.

Instead, the rate of ingestion and exploration warrants an expansion on all three axes.

  1. Coverage: More metrics are to be observed
  2. Retention: Save metrics for a longer duration
  3. Cardinality: Same metrics for more entities
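
To make the compounding effect of the three axes concrete, here is a rough back-of-the-envelope sketch. It assumes one time series per metric per entity and a fixed 60-second scrape interval; the function names and all the numbers are illustrative assumptions, not figures from any real deployment.

```python
SECONDS_PER_DAY = 86_400


def active_series(metrics: int, entities: int) -> int:
    """Coverage x cardinality: assume one series per metric per entity."""
    return metrics * entities


def stored_samples(metrics: int, entities: int, retention_days: int,
                   scrape_interval_s: int = 60) -> int:
    """Total samples held at steady state across all three axes."""
    samples_per_series = retention_days * SECONDS_PER_DAY // scrape_interval_s
    return active_series(metrics, entities) * samples_per_series


# Hypothetical baseline, then the same system after doubling every axis.
baseline = stored_samples(metrics=100, entities=50, retention_days=15)
grown = stored_samples(metrics=200, entities=100, retention_days=30)

print(f"baseline samples: {baseline:,}")
print(f"grown samples:    {grown:,}")
print(f"growth factor:    {grown // baseline}x")
```

Under these assumptions, doubling coverage, cardinality, and retention together multiplies stored samples by eight rather than two, which is why growth on all three axes at once is so much harder to manage than growth on one.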

The outcomes people want to achieve with Observability differ by company stage. We've broadly classified organizations into three stages.

Stage 1: 5 Engineers • 10 Customers • 5 Services
The Initial Stages

In this initial stage, the scale of data is easily manageable. Small teams, low complexity, and low business maturity mean that essential incident detection is the focus, rather than deeper intelligence over extended periods. Data retention needs are also at their lowest. There's no pressure to centralize better or to formally democratize access to time series data.

Observability complexity at a Stage 1 company.
Observability complexity at a Stage 2 company.

Stage 2: 30 Engineers • 50 Customers • 20 Services
Operational Readiness

Tolerance for failures diminishes, and there is more focus on feature velocity. This is typically where you see an explosion of infra and monitoring bills. To deliver quality customer experiences, teams aspire to find problems before their customers do. Teams must implement alerting and dashboarding to detect and diagnose incidents. Friction between business, engineering, and infrastructure teams starts growing. Formalized DevOps processes are considered at this stage.

Stage 3: 100+ Engineers • 1000+ Customers • 100+ Services
Complete Organizational Intelligence

At this scale, teams across product, finance, support, and customer success all need metric data to run the business properly. Data growth explodes across all dimensions. Enterprises must dedicate full-time engineering resources to managing their time series database. Greater business maturity moves the focus of engineering teams from code quality to customer experience quality.

Observability complexity at a Stage 3 company.

Introducing Levitate: a time series warehouse built to manage scale
Tiering

Inspired by the principles of Data Warehousing, where Data Tiering is a common practice, Levitate introduces tiers for Time Series storage.

Policies & Governance

Additionally, Levitate offers powerful features to identify time series your team isn't using and trim data according to Data Policies you create. Access Policies enable you to control how tiers are engaged.
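
As a rough illustration of what "identifying time series your team isn't using" could look like, the sketch below flags series that have not been queried within a configurable idle window. This is not Levitate's API: the function name, the `last_queried` mapping, the metric names, and the 30-day window are all hypothetical, chosen only to show the idea.

```python
from datetime import datetime, timedelta
from typing import Optional


def unused_series(last_queried: dict[str, Optional[datetime]],
                  now: datetime, idle_days: int = 30) -> list[str]:
    """Return series never queried, or not queried within `idle_days`."""
    cutoff = now - timedelta(days=idle_days)
    return sorted(
        name for name, ts in last_queried.items()
        if ts is None or ts < cutoff
    )


# Hypothetical usage records: series name -> time of the last query, if any.
now = datetime(2024, 1, 31)
usage = {
    "http_requests_total": datetime(2024, 1, 30),  # queried yesterday
    "legacy_queue_depth": datetime(2023, 11, 1),   # idle for months
    "debug_gc_pauses": None,                       # never queried
}

candidates = unused_series(usage, now)
print(candidates)
```

A data policy could then decide what happens to the flagged series, for example moving them to a cheaper tier or dropping them after a review period.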

Total Cost of Ownership (TCO) Reduction

Our existing customers have more than halved their storage costs with Levitate, not counting the additional savings from reduced engineering and management overhead.

The Four Pillars of Levitate

Policy Engine: The capability to express and evaluate data storage and access rules.

Query Routing Engine: Based on tokens and source, the query routing engine routes queries to their rightful tiers. The overhead of evaluation is minimal.

Sync Engine: Separates the write and read channels, guaranteeing ingestion no matter how heavy the read load is.

Consumption Engine: Keeps track of the metrics being consumed and makes that information available to all other engines.
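
To picture the routing step described under the Query Routing Engine, here is a minimal sketch that maps a (token, source) pair to a tier with a single dictionary lookup. It is an assumption-laden illustration, not Levitate's implementation: the `Query` type, the token and source values, and the "hot" and "cold" tier names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    token: str   # hypothetical: credential identifying the caller
    source: str  # hypothetical: where the query comes from, e.g. "alerting"
    expr: str    # the query expression itself


# Hypothetical routing table: (token, source) -> tier name.
ROUTES = {
    ("alerts-token", "alerting"): "hot",
    ("reports-token", "dashboard"): "cold",
}
DEFAULT_TIER = "cold"  # assumption: unknown callers go to the cheaper tier


def route(query: Query) -> str:
    """Pick a tier from the query's token and source in one hash lookup."""
    return ROUTES.get((query.token, query.source), DEFAULT_TIER)


print(route(Query("alerts-token", "alerting", "up == 0")))
print(route(Query("unknown-token", "notebook", "rate(errors[5m])")))
```

Because the decision is a constant-time lookup on two fields, a scheme like this adds very little latency per query, which is one plausible way to keep evaluation overhead small.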

Want to know more? Get the Whitepaper. It covers how the scale of metrics and the breadth of questions grow as companies mature, the challenges that arise as your business climbs the ladder of reliability, and how Levitate's managed time series offering scales with a growing software business.