Time-series databases have become the backbone of modern observability, financial analytics, and IoT systems. But there's a common challenge that can bring even the most robust systems to their knees: high cardinality.
When your database starts tracking millions of unique values across various dimensions, performance doesn't just dip—it can collapse entirely.
Let's understand the technical details of what happens when cardinality spikes and how you can architect your systems to handle it.
High Cardinality: Beyond Just "Many Unique Values"
High cardinality isn't merely about having many unique values in a field—it's about the combinatorial explosion that happens when those values interact with time series data.
In a typical time-series database, you're storing:
- Metrics (measurements like CPU usage or request latency)
- Timestamps (when those measurements occurred)
- Dimensions/Tags (contextual information like host, region, customer ID)
High cardinality occurs when these dimensional values create an enormous number of unique time series. For example, if you track:
- 1,000 servers
- 50 metrics per server
- 10 different service names
- 100 customer IDs
You could potentially end up with 1,000 × 50 × 10 × 100 = 50 million unique time series. Each one requires its own tracking, indexing, and storage considerations.
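You can sanity-check this before it bites you. Here's a minimal sketch in PostgreSQL-style SQL, assuming a metrics table with host, metric_name, service, and customer_id columns (the names are illustrative); it compares the worst-case product of distinct tag values against the cardinality that actually occurs:
-- Worst case: the product of distinct values per tag
-- Actual: distinct tag combinations observed (usually much lower)
SELECT
  (SELECT COUNT(DISTINCT host) FROM metrics)
  * (SELECT COUNT(DISTINCT metric_name) FROM metrics)
  * (SELECT COUNT(DISTINCT service) FROM metrics)
  * (SELECT COUNT(DISTINCT customer_id) FROM metrics) AS worst_case_series,
  (SELECT COUNT(DISTINCT (host, metric_name, service, customer_id)) FROM metrics) AS actual_series;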
When RAM Becomes Your Primary Bottleneck
Time-series databases typically maintain in-memory data structures to facilitate rapid queries. As cardinality increases, these structures grow in proportion to the number of unique series, and their memory footprint quickly becomes the dominant constraint.
Memory Allocation Patterns Under High Cardinality
When cardinality grows beyond certain thresholds, several memory-intensive components start to struggle:
- Index Structures: B-trees or hash tables mapping tag combinations to time series grow with O(n) space complexity relative to the number of unique series.
- Metadata Overhead: Each unique time series requires metadata (series name, tag set, creation time, etc.) that consumes 200-500 bytes per series before even storing actual data points.
- Caching Layers: Query caches become increasingly ineffective as the working set size exceeds available memory.
Let's look at a memory profile from a production InfluxDB instance:
| Component | Memory Usage (10K series) | Memory Usage (1M series) | Memory Usage (100M series) |
|---|---|---|---|
| TSM Index | 400MB | 6GB | 80GB+ |
| Series Metadata | 5MB | 500MB | 50GB |
| WAL Cache | 200MB | 1.5GB | 15GB+ |
| Query Cache | 100MB | 500MB | Effectively Disabled |
| Total | ~700MB | ~8.5GB | 145GB+ |
At 100 million series, you're looking at a system that requires specialized hardware just to load its indices.
The Query Performance Complexity Challenge
High cardinality doesn't just eat memory—it fundamentally changes query execution patterns and complexity.
How Query Plans Become Pathological
When a time-series database executes a query against high-cardinality data, several execution phases become problematic:
- Series Selection Phase: The database must first identify which series match the query's tag predicates. With millions of series, this becomes a massive filtering operation.
- Index Traversal: Even with optimized indices, traversing millions of unique keys takes time. The difference between traversing 10,000 vs. 10 million index entries isn't linear—it's often super-linear due to cache misses and disk I/O.
- Merging Time Series: Aggregating across high-cardinality dimensions requires merging enormous numbers of time series, creating CPU-bound bottlenecks.
Here's a query execution profile showing where time is spent:
EXPLAIN ANALYZE SELECT mean("value") FROM "cpu"
WHERE ("region" = 'us-west' OR "region" = 'us-east')
AND time > now() - 1h
GROUP BY "host", "service", time(5m)
Query execution breakdown:
- Series selection: 2.453s (was 0.021s at low cardinality)
- Index scan: 1.876s (was 0.015s at low cardinality)
- Series read: 0.312s
- Merge and aggregate: 3.208s (was 0.124s at low cardinality)
- Total execution: 7.849s (was 0.167s at low cardinality)
A 47x performance degradation for the same logical query is common when cardinality jumps by 2-3 orders of magnitude.
When Indexing Bottlenecks Write Throughput
While read performance suffers under high cardinality, write performance often degrades even more dramatically.
The Hidden Costs of Each Write Operation
When writing a data point to a high-cardinality time-series database:
- Series Lookup: The database must determine if this is a new series or an existing one (sketched in SQL after this list).
- Index Updates: If new, indices must be updated—a potentially expensive operation requiring locks.
- Compaction Impact: Background compaction processes, which merge and optimize data files, struggle with fragmented data across millions of series.
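To make the per-write series lookup concrete, here's a simplified sketch in plain SQL of what the engine effectively does internally; the series_dict table and tag_set encoding are illustrative, not any vendor's actual storage format:
-- A dictionary mapping each unique tag set to a series id
CREATE TABLE IF NOT EXISTS series_dict (
  id      BIGSERIAL PRIMARY KEY,
  tag_set TEXT NOT NULL UNIQUE   -- e.g. 'host=server123,service=auth'
);
-- Every incoming point pays for this lookup; a brand-new tag set
-- also pays for inserting into the unique index
INSERT INTO series_dict (tag_set)
VALUES ('host=server123,service=auth')
ON CONFLICT (tag_set) DO UPDATE SET tag_set = EXCLUDED.tag_set
RETURNING id;
With 100 million rows in that dictionary, the unique index behind the ON CONFLICT check is exactly the structure that stops fitting in memory.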
This creates a cascading effect on write throughput:
| Cardinality Level | Write Throughput | Write Latency | Compaction Backlog |
|---|---|---|---|
| 10,000 series | 500,000 points/s | 0.5ms | Negligible |
| 1 million series | 100,000 points/s | 2.5ms | Occasional delays |
| 100 million series | 10,000 points/s | 25ms+ | Chronic, growing |
At extreme cardinalities, you might face a situation where compaction can never catch up with incoming writes—a death spiral for database performance.
Which Time-Series Database Handles Cardinality Best?
Different time-series databases employ fundamentally different architectures to manage cardinality challenges.
InfluxDB's TSI: Moving From Memory to Disk
InfluxDB's evolution shows how critical cardinality management has become:
- InfluxDB 1.x with the in-memory index: Keeps the entire series index in RAM, which works brilliantly until the ~1 million series threshold.
- InfluxDB 1.x with TSI: Introduced a disk-based Time Series Index that can handle 10-100x more series by moving index structures to disk.
- InfluxDB IOx: The newer engine, completely redesigned around a columnar storage model built on Apache Arrow and DataFusion to better handle high-cardinality workloads.
The TSI architecture uses a multi-level index approach:
- Level 1: In-memory mutable index
- Level 2: Memory-mapped immutable index files
- Level 3: Larger compacted index files
This tiered approach allows it to handle much higher cardinality before performance degrades, though disk I/O becomes the new bottleneck.
TimescaleDB's Hypertable Architecture: Using PostgreSQL's Mature Indexing
TimescaleDB takes a different approach by extending PostgreSQL:
- Chunk-based partitioning: Data is automatically partitioned into chunks based on time, making queries on recent data much faster.
- PostgreSQL indexes: Leverages B-tree, hash, and GiST indexes for efficient tag queries.
- Constraint exclusion: Automatically skips chunks that can't match query conditions.
This architecture handles high cardinality differently:
-- TimescaleDB's approach to high-cardinality queries
EXPLAIN (ANALYZE, BUFFERS)
SELECT time_bucket('5 minutes', time) AS five_min,
hostname,
AVG(cpu_usage)
FROM metrics
WHERE time > NOW() - INTERVAL '1 day'
GROUP BY five_min, hostname;
-- Shows chunk exclusion in action:
"Planning Time: 2.051 ms"
"Execution Time: 851.283 ms"
"Chunks excluded during planning: 165"
"Chunks scanned during execution: 12"
By scanning only relevant chunks, TimescaleDB reduces the impact of high cardinality, though it still struggles as cardinality grows beyond millions of series.
Prometheus and VictoriaMetrics: Label-Based vs. Inverted Index
Prometheus and its high-performance alternative VictoriaMetrics take very different approaches:
- Prometheus: Uses a custom TSDB with an in-memory label index. Simple but vulnerable to high cardinality.
- VictoriaMetrics: Uses an advanced inverted index specifically optimized for high-cardinality time-series data.
The performance difference is stark:
| Query Type | Prometheus (1M series) | VictoriaMetrics (1M series) | Prometheus (10M series) | VictoriaMetrics (10M series) |
|---|---|---|---|---|
| Instant Query | 200ms | 50ms | 2s+ | 120ms |
| Range Query | 1.5s | 400ms | 15s+ | 800ms |
| High-Cardinality Aggregation | 5s+ | 800ms | Failed | 3s |
VictoriaMetrics demonstrates that with the right architecture, high cardinality doesn't have to be catastrophic.
How to Prevent Cardinality Problems with Advanced Data Modeling
The most effective solutions to cardinality problems happen at the data modeling stage, long before data hits your database.
Hierarchical Tag Structures: Aggregating Without Losing Information
Instead of recording every possible dimension as a separate tag, build hierarchical structures:
# Poor design (high cardinality)
measurement,region=us-west-2,az=us-west-2a,rack=r42,host=server123,service=auth cpu_usage=72.5
# Better design (lower cardinality)
measurement,host_path=/us-west-2/us-west-2a/r42/server123,service=auth cpu_usage=72.5
This allows you to query at any level of the hierarchy while keeping the total number of tags manageable.
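For instance, assuming the hierarchical path lands in a host_path column in a TimescaleDB-style store (table and column names are illustrative), a prefix match recovers any level of the hierarchy:
-- Average CPU across every host in availability zone us-west-2a,
-- regardless of rack or individual server
SELECT
  time_bucket('5 minutes', time) AS five_min,
  AVG(cpu_usage) AS avg_cpu
FROM metrics
WHERE host_path LIKE '/us-west-2/us-west-2a/%'
  AND time > NOW() - INTERVAL '1 hour'
GROUP BY five_min;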
Controlled Cardinality Through Domain-Specific Normalization
Normalize high-cardinality fields based on domain knowledge:
- IP Addresses: Store subnet rather than individual IPs
- User IDs: Store cohorts, segments, or hash buckets
- URLs: Store route patterns rather than full URLs with parameters
- Session IDs: Create time-bucketed session groups
Implementation example:
-- Instead of storing full URLs
INSERT INTO http_metrics (url, status_code, response_time)
VALUES ('/products/12345?user=789&source=email', 200, 127);
-- Store normalized route patterns
INSERT INTO http_metrics (url_pattern, status_code, response_time)
VALUES ('/products/:id', 200, 127);
-- Still allows powerful aggregation
SELECT url_pattern, AVG(response_time), COUNT(*)
FROM http_metrics
GROUP BY url_pattern;
This reduces cardinality by orders of magnitude while preserving analytical power.
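The normalization itself can live in your ingestion pipeline or in the database. Here's a rough sketch using PostgreSQL's regexp_replace; the patterns are examples only and would need tuning for your own URL scheme:
-- Strip query strings and collapse numeric path segments before insert
CREATE OR REPLACE FUNCTION normalize_url(raw TEXT) RETURNS TEXT AS $$
  SELECT regexp_replace(
           regexp_replace(raw, '\?.*$', ''),  -- drop query parameters
           '/[0-9]+', '/:id', 'g')            -- replace numeric ids with :id
$$ LANGUAGE sql IMMUTABLE;

SELECT normalize_url('/products/12345?user=789&source=email');
-- returns '/products/:id'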
Strategic Downsampling: Reducing Cardinality Over Time
Implement tiered storage with different cardinality levels based on age:
Raw tier (1-7 days): Full cardinality
Intermediate tier (8-30 days): Reduced cardinality (drop some dimensions)
Long-term tier (31+ days): Minimal cardinality (highly aggregated)
Implementation example:
-- Continuous aggregate for downsampling with reduced cardinality
CREATE MATERIALIZED VIEW metrics_hourly
WITH (timescaledb.continuous) AS
SELECT
time_bucket('1 hour', time) AS hourly,
service,
region,
-- Drop host-level cardinality in this view
AVG(cpu_usage) AS avg_cpu,
MAX(cpu_usage) AS max_cpu,
MIN(cpu_usage) AS min_cpu,
COUNT(*) AS sample_count
FROM metrics
GROUP BY hourly, service, region;
-- Set retention policy for high-cardinality raw data
SELECT add_retention_policy('metrics', INTERVAL '7 days');
This approach gives you full detail for recent data while automatically reducing both storage and cardinality for historical data.
Performance Optimization Techniques When Reducing Cardinality Isn't an Option
Sometimes you simply can't reduce cardinality. Here are advanced techniques to optimize performance anyway:
How to Optimize Resources with Partial Indexes
Rather than indexing every possible tag combination, create partial indices for the most common query patterns:
-- TimescaleDB/PostgreSQL partial index example
CREATE INDEX ON metrics(host, time DESC)
WHERE service = 'api-gateway';
-- This dramatically speeds up queries for a specific service
-- without wasting resources on indexing everything
Partial indices can reduce index size by 90%+ while still accelerating the most important queries.
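A query shaped like that index then stays fast regardless of how many other services exist (the host value here is just an illustration):
-- Hits the partial index: service predicate matches, (host, time DESC) drives the scan
SELECT time, cpu_usage
FROM metrics
WHERE service = 'api-gateway'
  AND host = 'web-042'
ORDER BY time DESC
LIMIT 100;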
Precompute Common Aggregations for Faster Queries
For frequently accessed aggregations across high-cardinality dimensions, create materialized query tables:
-- Create a materialized view that pre-aggregates data
CREATE MATERIALIZED VIEW daily_service_metrics AS
SELECT
DATE_TRUNC('day', time) AS day,
service,
region,
AVG(latency) AS avg_latency,
percentile_cont(0.95) WITHIN GROUP (ORDER BY latency) AS p95_latency,
COUNT(*) AS request_count
FROM request_logs
GROUP BY day, service, region;
-- Create an index on the materialized view
CREATE INDEX ON daily_service_metrics(service, day);
-- Refresh on a schedule
REFRESH MATERIALIZED VIEW daily_service_metrics;
This trades storage space and some write overhead for dramatic query performance improvements on common access patterns.
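If the pg_cron extension is available, the refresh can be scheduled from inside the database; here's a sketch (job name and schedule are illustrative):
-- Rebuild the pre-aggregated view 15 minutes past every hour
SELECT cron.schedule(
  'refresh-daily-service-metrics',
  '15 * * * *',
  'REFRESH MATERIALIZED VIEW daily_service_metrics');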
Rewriting Queries for Cardinality Awareness
Rewrite queries to limit the impact of high cardinality:
-- Inefficient query (scans all series)
SELECT mean("cpu_usage") FROM "system"
WHERE time > now() - 1h
GROUP BY "host";
-- More efficient query (uses tag value pre-filtering)
SELECT mean("cpu_usage") FROM "system"
WHERE time > now() - 1h
AND "datacenter" = 'us-west'
GROUP BY "host";
Always filter on lower-cardinality tags first to reduce the working set before dealing with high-cardinality dimensions.
Advanced Tooling for Cardinality Management
Modern time-series databases come with specialized tools for managing cardinality issues.
Deep Cardinality Inspection Tools
These tools help you understand exactly where your cardinality is coming from:
# InfluxDB cardinality exploration
influx_inspect cardinality -db mydb -limit 20
# Output:
Tag key Approx. unique values
--------- -------------------
container_id 24,532,850
request_id 18,945,327
pod_name 5,624,103
path 2,345,678
host 124,532
service 86
environment 12
region 8
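If your metrics live in a SQL store instead, a rough equivalent is to count distinct values per tag column; here's a sketch assuming the tag columns shown exist on a metrics table:
-- Which tags contribute the most unique values over the last day?
SELECT
  COUNT(DISTINCT container_id) AS container_ids,
  COUNT(DISTINCT host)         AS hosts,
  COUNT(DISTINCT service)      AS services,
  COUNT(DISTINCT (container_id, host, service)) AS combined_series
FROM metrics
WHERE time > NOW() - INTERVAL '1 day';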
Continuous Cardinality Monitoring
Set up monitoring for cardinality itself:
-- Query to track cardinality growth in TimescaleDB
-- (counts distinct tag combinations per day; adjust the tag columns to your schema)
SELECT
  time_bucket('1 day', time) AS day,
  COUNT(DISTINCT (host, service, region)) AS active_series
FROM metrics
WHERE time > NOW() - INTERVAL '30 days'
GROUP BY day
ORDER BY day;
Implement alerts when cardinality grows beyond expected rates to catch problems early.
Automated Cardinality Management
Advanced systems implement automated cardinality controls:
// Example of a cardinality limiter in application code
function addMetric(metric, tags, value) {
// Hash the tags to get a consistent series identifier
const seriesHash = hashTags(tags);
// Check if this hash is in our allow list
if (cardinalityManager.isAllowed(seriesHash)) {
// Send the metric normally
client.writePoint(metric, tags, value);
} else if (cardinalityManager.shouldSample(seriesHash)) {
// This series isn't in our allow list, but we'll sample it
// at a lower rate to track its importance
client.writePoint(metric, tags, value);
// Record that we've seen this series
cardinalityManager.recordSeries(seriesHash, tags);
}
// Otherwise silently drop the point
}
// Periodically adjust the allow list based on observed frequency
setInterval(() => {
cardinalityManager.promoteFrequentSeries();
cardinalityManager.demoteRareSeries();
}, 3600 * 1000); // Every hour
This approach allows systems to automatically maintain cardinality within acceptable limits.
How Time-Series Databases Are Evolving for High Cardinality
Database vendors recognize that high cardinality is a growing challenge and are making architectural changes to address it.
Column-Oriented Storage Models
Newer time-series databases are moving toward columnar storage:
- InfluxDB IOx: Based on Apache Arrow and DataFusion
- TimescaleDB Hyperscale: Adding columnar capabilities to TimescaleDB
- QuestDB: Using column-oriented SIMD-optimized storage
Columnar storage has significant advantages for high-cardinality time-series data:
- Better compression ratios (10-100x for many workloads)
- Improved query performance through vectorized execution
- More efficient filtering across high-cardinality dimensions
Approximate Query Processing
Many use cases don't require exact answers. Approximate query processing can provide dramatic performance improvements:
-- Exact (slow on high cardinality)
SELECT COUNT(DISTINCT user_id) FROM events;
-- Approximate (much faster, slight accuracy tradeoff)
SELECT approx_count_distinct(user_id) FROM events;
Modern time-series databases are adding more approximate algorithms like HyperLogLog for cardinality estimation, T-Digest for percentiles, and Count-Min Sketch for frequency estimation.
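The same trade-off applies to percentiles. With the TimescaleDB Toolkit installed, for example, a sketch-based approximate percentile avoids holding every value in memory (the request_logs table and latency column are carried over from the earlier examples):
-- Approximate p95 latency per service using a sketch-based aggregate
SELECT
  service,
  approx_percentile(0.95, percentile_agg(latency)) AS p95_latency
FROM request_logs
WHERE time > NOW() - INTERVAL '1 hour'
GROUP BY service;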
Hardware Acceleration
Specialized hardware is becoming more important for high-cardinality workloads:
- GPU acceleration: For parallel processing of filtering operations
- FPGA-based indexing: For ultra-fast series lookup
- Persistent memory: Bridging the gap between RAM and SSD
These advancements may eventually change the economics of high-cardinality time series.
When to Consider Alternative Database Technologies
At extreme cardinality levels, traditional time-series databases may not be the right solution. Consider these alternatives:
Specialized High-Cardinality Time-Series Databases
- VictoriaMetrics: Built from the ground up for high cardinality
- ClickHouse: Column-oriented DBMS with excellent time-series capabilities
- Apache Druid: Designed for sub-second OLAP queries on high-cardinality data
Event Log Oriented Solutions
For some use cases, switching from metrics to logs may be more appropriate:
- Elasticsearch: For high-cardinality text-heavy time-series data
- Loki: Log aggregation system with label-based indexing
- Hadoop/Spark: For batch processing of extreme-cardinality historical data
The decision factors should include:
- Query patterns (real-time vs. batch)
- Retention requirements
- Aggregation needs
- Budget constraints
Conclusion
High cardinality doesn’t have to slow down your time-series database. At Last9, we help teams handle high-cardinality observability at scale—trusted by Disney+ Hotstar, CleverTap, and Replit.
We’ve monitored 11 of the 20 largest live-streaming events in history, integrating with OpenTelemetry and Prometheus to unify metrics, logs, and traces. The result? Better performance, lower costs, and real-time insights for smarter monitoring and alerting.
Book some time with us if you want to chat more about it!