Standard Deviation Alerting

Standard deviation alerting automatically detects unusual behavior in your services by comparing current metrics against historical patterns. Instead of setting fixed thresholds that may not account for normal traffic variations, these alerts adapt to your service’s baseline behavior and trigger when metrics deviate significantly from the norm.

Access Alerting in your Last9 dashboard to create alerts using the standard deviation macro.

How It Works

The adaptive_std_cmp macro is a built-in Last9 function that returns a boolean value (0 or 1) indicating whether your metric is behaving anomalously. Use this macro as your query when creating a static threshold alert in Alerting.

Implementation Pattern

adaptive_std_cmp(query, std_factor, duration)

Parameters:

query: Your base PromQL metric query
std_factor: Number of standard deviations from the mean that counts as anomalous (typically 2-3)
duration: Time window for the statistical calculation, written without quotes (for example, 10m)

Output: Boolean value where 1 = anomaly detected, 0 = normal behavior
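
To make the comparison concrete, here is a minimal Python sketch of the kind of check the parameters describe. It is an illustration only, not Last9's implementation: the function name std_cmp is hypothetical, and it assumes the latest sample is compared against the mean and population standard deviation of the earlier samples in the window.

```python
from statistics import mean, pstdev

def std_cmp(samples, std_factor):
    """Return 1 if the latest sample looks anomalous, else 0.

    Assumption (not confirmed by Last9): the baseline is the mean and
    population standard deviation of every sample before the latest one.
    """
    *history, latest = samples
    if len(history) < 2:
        return 0  # not enough history to form a baseline
    baseline = mean(history)
    spread = pstdev(history)
    return 1 if abs(latest - baseline) > std_factor * spread else 0

# Hypothetical response times (ms) covering one window.
steady = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101]
spiky = steady[:-1] + [140]

print(std_cmp(steady, 2))  # 0: latest value sits inside the band
print(std_cmp(spiky, 2))   # 1: latest value is far outside the band
```

A larger std_factor widens the band around the baseline, so fewer samples are flagged.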

Setting Up Standard Deviation Alerts

Navigate to Alerting and click Create Alert
Select Static Threshold as your alert type
Enter your adaptive_std_cmp query in the query field
Set your threshold to 0.5 (alert when output goes above 0.5, meaning anomaly detected)
Configure sensitivity as the number of bad minutes out of total minutes (for example, 2 out of 5)

Recommended Configuration

Threshold Setting

Recommended threshold: 0.5
Why: Since the macro outputs 0 (normal) or 1 (anomaly), setting threshold at 0.5 triggers the alert when an anomaly is detected

Sensitivity (Bad out of Total Minutes)

For critical services: 1 out of 3 minutes (high sensitivity)
For general monitoring: 2 out of 5 minutes (balanced)
For noisy metrics: 3 out of 10 minutes (low sensitivity, reduces false positives)

Common Use Cases

Response Time Anomalies

adaptive_std_cmp(trace_service_response_time{service_name="prod-api-service"}, 2, 10m)

Throughput Anomalies

adaptive_std_cmp(sum(trace_endpoint_count{service_name="prod-api-service",span_kind="SPAN_KIND_SERVER"}), 3, 15m)

External Service Performance

adaptive_std_cmp(trace_client_duration{service_name="prod-api-service",net_peer_name="external_host"}, 2.5, 5m)
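
If you generate these queries for many services, a small helper can keep the argument order and the unquoted duration consistent. The sketch below is a hypothetical convenience function, not part of Last9; it assumes durations use Prometheus-style units such as 5m or 1h, and its defaults mirror the starting values suggested under Best Practices.

```python
import re

# Assumption: durations use Prometheus-style units, e.g. 5m, 15m, 1h.
_DURATION = re.compile(r"^\d+[smhdw]$")

def build_std_alert_query(query, std_factor=2.5, duration="15m"):
    """Format an adaptive_std_cmp call (hypothetical helper, not a Last9 API)."""
    if std_factor <= 0:
        raise ValueError("std_factor must be positive")
    if not _DURATION.match(duration):
        raise ValueError(f"unexpected duration: {duration!r}")
    # The duration is interpolated without quotes, as the macro expects.
    return f"adaptive_std_cmp({query}, {std_factor:g}, {duration})"

print(build_std_alert_query(
    'trace_service_response_time{service_name="prod-api-service"}', 2, "10m"
))
```

The printed string matches the Response Time Anomalies query above and can be pasted into the alert's query field.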

Understanding Your Alerts

When the adaptive_std_cmp query returns 1, it means your metric has deviated beyond the specified number of standard deviations from its historical average over the defined duration. The alert will fire based on your threshold (0.5) and sensitivity settings.

Example Alert Behavior:

Query returns 1 (anomaly detected)
Threshold 0.5 is exceeded
If sensitivity is “2 out of 5 minutes”, the alert fires when 2 minutes within any 5-minute window show anomalous behavior
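
The "bad out of total minutes" rule can be sketched as a sliding-window count over the macro's per-minute output. This is an illustrative model, not Last9's evaluation engine: the function firing_minutes and the sample data are hypothetical, and it assumes one data point per minute and that a minute counts as bad when its value is above the threshold.

```python
def firing_minutes(flags, bad, total, threshold=0.5):
    """Return the minute indexes at which the alert would be firing.

    flags: per-minute macro output (0 or 1), oldest first.
    Assumption: the alert fires when at least `bad` of the most recent
    `total` minutes have a value above `threshold`.
    """
    firing = []
    for i in range(len(flags)):
        window = flags[max(0, i - total + 1): i + 1]
        bad_minutes = sum(1 for value in window if value > threshold)
        if bad_minutes >= bad:
            firing.append(i)
    return firing

# Hypothetical macro output for ten consecutive minutes.
flags = [0, 1, 0, 0, 0, 0, 1, 0, 1, 0]

print(firing_minutes(flags, bad=2, total=5))  # [8, 9]
print(firing_minutes(flags, bad=1, total=3))  # [1, 2, 3, 6, 7, 8, 9]
```

With "2 out of 5", the isolated anomaly at minute 1 never fires the alert, while "1 out of 3" fires on every single anomalous minute. That is the trade-off between sensitivity and false positives.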

Standard Deviation Factor Guidelines

std_factor = 2: Treats ~95% of values as normal, assuming a roughly normal distribution (more sensitive)
std_factor = 2.5: Balanced approach for most services
std_factor = 3: Treats ~99.7% of values as normal (less sensitive, critical alerts only)
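
These percentages come from the normal distribution: the share of values within k standard deviations of the mean is erf(k / sqrt(2)). The short Python sketch below computes that share for each factor. It assumes your metric is roughly normally distributed, which real latency and throughput data often is not, so treat the numbers as a guide rather than a guarantee.

```python
import math

def coverage_within(std_factor):
    """Fraction of a normal distribution within +/- std_factor standard deviations."""
    return math.erf(std_factor / math.sqrt(2))

for k in (2, 2.5, 3):
    inside = coverage_within(k)
    print(f"std_factor={k}: {inside:.2%} inside the band, {1 - inside:.2%} flagged")
```

Under that assumption, roughly 4.55% of samples fall outside the band at a factor of 2, about 1.24% at 2.5, and about 0.27% at 3.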

Duration Window Guidelines

5-10 minutes: Fast-changing services, real-time monitoring
15-30 minutes: Standard web services
1+ hours: Batch jobs, daily patterns

Best Practices

Start with std_factor=2.5 and duration=15m for most services
Use shorter durations for latency-sensitive applications
Use longer durations for services with natural daily/weekly patterns
Monitor alert frequency during initial setup and adjust sensitivity accordingly
Combine with traditional threshold alerts for comprehensive coverage

Configuring an Alert - Complete guide to setting up alert rules
Anomalous Pattern Detection Guide - Advanced algorithms for specific anomaly patterns

Standard deviation alerting provides a middle ground between static thresholds and advanced pattern detection algorithms. Use it when you need adaptive behavior but don’t require the specialized pattern matching of high/low spikes, level changes, or trend deviation algorithms.

Troubleshooting

If you have any questions, contact us on Discord or by email.