
An Easy and Comprehensive Guide to Prometheus API

Unlock the full potential of Prometheus API with this easy yet comprehensive guide—learn how to query, integrate, and automate monitoring.


Monitoring is the backbone of any reliable DevOps setup. And if you’re working with monitoring, you’ve likely used Prometheus. This open-source powerhouse has redefined how we track system performance, but are you making the most of its API?

Prometheus is the go-to solution for monitoring container-based environments, particularly in Kubernetes. Its pull-based model and flexible query language provide deep visibility into your systems.

But its real strength lies in the HTTP API—a tool that enables programmatic monitoring, automation, and seamless integration into your workflows. If you're not using it yet, you might be leaving a lot on the table.

What Makes the Prometheus API Worth Your Time?

The Prometheus API isn't just another tool in your tech stack – it's the secret weapon that unlocks next-level monitoring capabilities. With it, you can:

  • Pull metrics data programmatically from any service Prometheus scrapes
  • Create custom dashboards that actually make sense for your specific use cases
  • Automate your alerting workflows based on complex conditions
  • Integrate with your existing tools like Slack, PagerDuty, or custom webhooks
  • Build automation that reacts to metrics in real-time
  • Extend Prometheus capabilities beyond what's available in the UI
  • Implement custom reporting for stakeholders

The API gives you direct access to everything Prometheus collects, letting you work with that data however you want. That's power.

Unlike some monitoring solutions that lock you into their visualization tools, Prometheus follows the Unix philosophy – it does one thing (collecting and storing metrics) extremely well, then exposes everything through an API that lets you build exactly what you need on top.

💡
For practical use cases, check out this guide on Prometheus query examples to sharpen your queries.

Practical API Use Cases

Before jumping into the technical details, let's look at how teams use the Prometheus API:

  • Auto-scaling systems – Triggering infrastructure scaling based on custom metrics
  • Anomaly detection – Feeding metrics into ML systems to catch unusual patterns
  • Business intelligence – Correlating technical metrics with business KPIs
  • Capacity planning – Analyzing long-term trends to forecast resource needs
  • Custom SLO dashboards – Building service level objective tracking specific to your reliability targets

Getting Started with the Prometheus API

Setting up your first connection is straightforward. The Prometheus API runs on HTTP, making it accessible from practically anywhere.

Base URL Structure

Your Prometheus server exposes its API at:

http://<your-prometheus-server>:<port>/api/v1/

For local testing, this might look like:

http://localhost:9090/api/v1/

The API follows RESTful principles with clearly defined endpoints. All responses come in a consistent JSON format with this general structure:

{
  "status": "success",
  "data": {
    // The actual response data varies by endpoint
  }
}

For error cases, you'll get:

{
  "status": "error",
  "errorType": "bad_data",
  "error": "The specific error message"
}

This consistency makes parsing responses straightforward across all API interactions.

Response Format Details

Let's break down what you'll get from different query types:

Range query responses have this structure:

{
  "resultType": "matrix",
  "result": [
    {
      "metric": { "label1": "value1", ... },
      "values": [ 
        [ timestamp1, "string_value1" ],
        [ timestamp2, "string_value2" ],
        ...
      ]
    },
    ...
  ]
}

Instant query responses contain:

{
  "resultType": "vector",
  "result": [
    {
      "metric": { "label1": "value1", ... },
      "value": [ timestamp, "string_value" ]
    },
    ...
  ]
}

Understanding these structures is crucial for correctly parsing the data in your applications.
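
If it helps, here's a minimal Python sketch that unwraps the standard envelope and handles both result types; the server URL and the up query are placeholder assumptions:

import requests

PROMETHEUS_URL = "http://localhost:9090"  # placeholder; point at your server

def run_query(endpoint, params):
    """Call a Prometheus API endpoint and unwrap the standard response envelope."""
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/{endpoint}", params=params, timeout=10)
    body = resp.json()
    if body["status"] != "success":
        raise RuntimeError(body.get("error", "unknown error"))
    return body["data"]

def print_result(data):
    """Handle both instant (vector) and range (matrix) result types."""
    if data["resultType"] == "vector":
        for series in data["result"]:
            timestamp, value = series["value"]
            print(series["metric"], "=", value)
    elif data["resultType"] == "matrix":
        for series in data["result"]:
            print(series["metric"], "->", len(series["values"]), "samples")

print_result(run_query("query", {"query": "up"}))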

Authentication Options

Prometheus keeps things simple with these authentication methods:

| Method | Best For | Setup Complexity | Implementation Approach |
|---|---|---|---|
| No Auth | Testing, isolated networks | None | Default configuration |
| Basic Auth | Standard protection | Low | Reverse proxy (Nginx, Apache) |
| OAuth | Enterprise environments | Medium | OAuth2 Proxy sidecar |
| TLS Client Certs | High-security needs | High | mTLS with cert management |
| API Keys | Microservice architectures | Medium | Custom proxy layer |

Most teams start with Basic Auth and move to OAuth as they scale.

Prometheus itself doesn't include built-in authentication. Instead, you'll typically deploy it behind a reverse proxy that handles auth. Here's how to set up Basic Auth with Nginx:

server {
    listen 443 ssl;
    server_name prometheus.example.com;

    ssl_certificate /etc/nginx/certs/prometheus.crt;
    ssl_certificate_key /etc/nginx/certs/prometheus.key;

    location / {
        auth_basic "Prometheus";
        auth_basic_user_file /etc/nginx/htpasswd/.htpasswd;
        
        proxy_pass http://localhost:9090;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

For OAuth2, many teams use the oauth2-proxy project as a sidecar:

# docker-compose example
services:
  oauth2-proxy:
    image: quay.io/oauth2-proxy/oauth2-proxy
    command:
      - --provider=github
      - --email-domain=*
      - --upstream=http://prometheus:9090
      - --cookie-secret=your-secret
      - --client-id=your-github-app-id
      - --client-secret=your-github-app-secret
    ports:
      - "4180:4180"

This setup works well for teams already using GitHub or Google for authentication.
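
On the client side, nothing special is needed beyond passing credentials with each request. Here's a quick sketch using Python's requests library, where the hostname, username, and password are placeholders for your own setup:

import requests

resp = requests.get(
    "https://prometheus.example.com/api/v1/query",   # proxied endpoint from the Nginx config above
    params={"query": "up"},
    auth=("monitoring-user", "s3cret"),              # Basic Auth credentials checked by the proxy
    timeout=10,
)
resp.raise_for_status()
print(resp.json()["data"]["result"])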

💡
Need to configure Prometheus ports correctly? Check out this guide on Prometheus port configuration for a clear breakdown.

Key Prometheus Endpoints You'll Use

The Prometheus API has several endpoints, but these five will handle 90% of your needs:

1. Query Instant Data

GET /api/v1/query

This endpoint gives you a snapshot of metrics right now. Perfect for current status checks.

Parameters:

  • query (required): The PromQL expression to evaluate
  • time: Evaluation timestamp (RFC3339 or Unix timestamp), defaults to current time
  • timeout: Evaluation timeout (e.g., 30s, 1m), defaults to global timeout

Example:

curl 'http://localhost:9090/api/v1/query?query=up'

Response:

{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "up",
          "instance": "localhost:9090",
          "job": "prometheus"
        },
        "value": [1675956970.123, "1"]
      },
      {
        "metric": {
          "__name__": "up",
          "instance": "localhost:8080",
          "job": "api-server"
        },
        "value": [1675956970.123, "0"]
      }
    ]
  }
}
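
The same endpoint accepts the optional time and timeout parameters when called programmatically. A small sketch (the timestamp is just an example value):

import requests

resp = requests.get(
    "http://localhost:9090/api/v1/query",
    params={
        "query": "up",
        "time": "2023-01-01T20:10:30.781Z",  # optional; defaults to the current time
        "timeout": "30s",                    # optional; defaults to the global query timeout
    },
    timeout=10,
)
print(resp.json()["data"]["result"])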

2. Query Range Data

GET /api/v1/query_range

When you need metrics over time (like for graphs), this is your go-to.

Parameters:

  • query (required): The PromQL expression to evaluate
  • start (required): Start timestamp (RFC3339 or Unix timestamp)
  • end (required): End timestamp
  • step (required): Query resolution step width in duration format or float seconds
  • timeout: Evaluation timeout, defaults to global timeout

Example:

curl 'http://localhost:9090/api/v1/query_range?query=rate(http_requests_total[5m])&start=2023-01-01T20:10:30.781Z&end=2023-01-01T20:11:00.781Z&step=15s'

Response:

{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [
      {
        "metric": {
          "__name__": "http_requests_total",
          "code": "200",
          "handler": "query",
          "instance": "localhost:9090",
          "job": "prometheus"
        },
        "values": [
          [1672602630.781, "3.4"],
          [1672602645.781, "5.6"],
          [1672602660.781, "4.2"]
        ]
      }
    ]
  }
}

The step parameter deserves special attention – it defines the resolution of your data. Too small, and you'll hit performance issues; too large, and you'll miss important details.
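
One simple way to keep the step sensible is to derive it from the window you're querying and a target sample count. A sketch, where the 250-point target is an arbitrary assumption you'd tune:

def choose_step(start_ts, end_ts, target_points=250):
    """Pick a query_range step so the response stays near target_points samples per series."""
    span_seconds = end_ts - start_ts
    step_seconds = max(int(span_seconds / target_points), 1)
    return f"{step_seconds}s"

# A 6-hour window works out to roughly 86-second steps
print(choose_step(1672600000, 1672621600))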

3. Series Metadata

GET /api/v1/series

This lets you discover what metrics are available and their labels.

Parameters:

  • match[] (required): Repeated series selector parameters
  • start: Start timestamp
  • end: End timestamp

Example:

curl 'http://localhost:9090/api/v1/series?match[]=up&match[]=process_cpu_seconds_total'

Response:

{
  "status": "success",
  "data": [
    {
      "__name__": "up",
      "instance": "localhost:9090",
      "job": "prometheus"
    },
    {
      "__name__": "process_cpu_seconds_total",
      "instance": "localhost:9090",
      "job": "prometheus"
    }
  ]
}
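
Since match[] is a repeated parameter, most HTTP clients need a list rather than a single value; with Python's requests, passing a list encodes it as multiple match[] entries:

import requests

resp = requests.get(
    "http://localhost:9090/api/v1/series",
    params={"match[]": ["up", "process_cpu_seconds_total"]},  # encoded as repeated match[] params
    timeout=10,
)
for series in resp.json()["data"]:
    print(series["__name__"], series.get("instance"))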

4. Label Values

GET /api/v1/label/<label_name>/values

Need to know all possible values for a label? This endpoint has you covered.

Parameters:

  • start: Start timestamp
  • end: End timestamp
  • match[]: Series selector to filter by

Example:

curl 'http://localhost:9090/api/v1/label/job/values'

Response:

{
  "status": "success",
  "data": [
    "prometheus",
    "node-exporter",
    "api-gateway",
    "database"
  ]
}

5. Targets

GET /api/v1/targets

This shows all targets Prometheus is scraping, with their health status.

Parameters:

  • state: Filter by target state (active, dropped, or any)

Example:

curl 'http://localhost:9090/api/v1/targets?state=active'

Response:

{
  "status": "success",
  "data": {
    "activeTargets": [
      {
        "discoveredLabels": {
          "__address__": "localhost:9090",
          "__metrics_path__": "/metrics",
          "__scheme__": "http",
          "job": "prometheus"
        },
        "labels": {
          "instance": "localhost:9090",
          "job": "prometheus"
        },
        "scrapePool": "prometheus",
        "scrapeUrl": "http://localhost:9090/metrics",
        "lastError": "",
        "lastScrape": "2023-02-09T12:30:00.123456789Z",
        "lastScrapeDuration": 0.012345,
        "health": "up"
      }
    ]
  }
}
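
A typical use is a quick health sweep: list the active targets and flag anything that isn't reporting up. A minimal sketch (server URL assumed):

import requests

resp = requests.get(
    "http://localhost:9090/api/v1/targets",
    params={"state": "active"},
    timeout=10,
)
for target in resp.json()["data"]["activeTargets"]:
    if target["health"] != "up":
        print(f"{target['scrapeUrl']} is {target['health']}: {target['lastError']}")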

Additional Useful Endpoints

While the five endpoints above cover most use cases, these can be handy too:

6. Alerts

GET /api/v1/alerts

Lists all active alerts.

7. Rules

GET /api/v1/rules

Lists all recording and alerting rules.

8. Status Config

GET /api/v1/status/config

Dumps the current Prometheus configuration.

9. Metadata

GET /api/v1/metadata

Returns metadata about metrics (helpful for understanding units and semantics).
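
You can narrow the response with the metric parameter if you only care about one metric; a quick sketch (metric name and server URL are placeholders):

import requests

resp = requests.get(
    "http://localhost:9090/api/v1/metadata",
    params={"metric": "http_requests_total"},
    timeout=10,
)
# Each entry carries the metric's type, help text, and unit
print(resp.json()["data"].get("http_requests_total"))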

💡
Explore how to make the most of PromQL with this guide on Prometheus functions for better queries and insights.

Working with PromQL Through the API

The real magic happens when you combine the API with PromQL queries.

Here's a comprehensive chart of essential query patterns that every DevOps engineer should know:

| Query Type | Example | Use Case | Notes |
|---|---|---|---|
| Simple | http_requests_total | Basic metric retrieval | Returns all time series with this name |
| Counter Rate | rate(http_requests_total[5m]) | Traffic patterns | Per-second rate calculated over 5m |
| Counter Increase | increase(http_requests_total[1h]) | Hourly totals | Total increase over the time period |
| Gauge Current | node_memory_MemFree_bytes | Current state | Point-in-time value |
| Gauge Aggregation | avg_over_time(node_memory_MemFree_bytes[1h]) | Stable representation | Smooths fluctuations |
| Sum | sum(node_cpu_seconds_total) | Resource utilization | Total across all instances |
| By | sum by (instance) (up) | Grouped metrics | Aggregation with dimensions |
| Without | sum without (job) (up) | Remove dimensions | Simplify output |
| Offset | rate(http_requests_total[5m] offset 1h) | Comparison with past | Historical data points |
| Delta | delta(cpu_temp_celsius[2h]) | Change detection | For gauges (vs rate for counters) |
| Topk | topk(3, cpu_usage) | Hotspot identification | Find highest values |
| Bottomk | bottomk(3, up) | Problem detection | Find lowest values |
| Quantile | histogram_quantile(0.95, http_request_duration_seconds_bucket) | SLO tracking | Calculate percentiles |
| Prediction | predict_linear(node_filesystem_free_bytes[6h], 24 * 3600) | Capacity planning | Predict future values |
| Resets | resets(counter[5m]) | Service restarts | Detect counter resets |
| Time Functions | http_requests_total offset 1d | Day-over-day comparison | Compare to the same time yesterday |
| Label Matching | http_requests_total{status=~"5..", method!="POST"} | Filtering | Multiple conditions with regex |
| Binary Operators | node_memory_MemTotal_bytes - node_memory_MemFree_bytes | Derived metrics | Arithmetic between metrics |
| Boolean | node_filesystem_free_bytes / node_filesystem_size_bytes < 0.10 | Threshold checks | Filters to matching series (add bool for 0/1) |

Practical PromQL Examples

Let me break down some practical examples you'll use:

1. Error Rate Calculation

sum(rate(http_requests_total{status_code=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))

This query calculates your error rate – the percentage of requests returning 5xx errors. Super useful for SLOs.

2. Container Memory Usage by Pod

sum by (pod) (container_memory_working_set_bytes{namespace="production"})

Shows memory consumption grouped by pod name in your production namespace.

3. CPU Throttling Detection

sum by (pod) (rate(container_cpu_cfs_throttled_seconds_total[5m])) / sum by (pod) (rate(container_cpu_cfs_periods_total[5m])) > 0.1

Identifies pods experiencing more than 10% CPU throttling, indicating they need more resources.

4. Disk Space Prediction

predict_linear(node_filesystem_free_bytes{mountpoint="/"}[6h], 24 * 3600 * 7)

Predicts free disk space in 7 days based on the trend over the last 6 hours.

5. Apdex Score (Application Performance)

(sum(rate(http_request_duration_seconds_bucket{le="0.3"}[5m])) + sum(rate(http_request_duration_seconds_bucket{le="1.2"}[5m]))) / 2 / sum(rate(http_request_duration_seconds_count[5m]))

Calculates an Apdex score where requests under 0.3s are "satisfied" and under 1.2s are "tolerating".

💡
For a deeper understanding of PromQL, check out this guide to Prometheus Query Language and level up your queries.

5 Common PromQL Mistakes

When crafting these queries, watch out for these common pitfalls:

  1. Missing rate() for counters - Counters always increase; you almost always want the rate
  2. Incorrect time windows - Too small windows make noisy data, too large miss important spikes
  3. Missing label context - Aggregating without considering cardinality explosion
  4. Forgetting by() in division - Division between vectors needs matching labels
  5. Unescaped regex characters - Remember to escape special characters in label matches

Advanced PromQL Tips You Need to Know

For more complex monitoring needs:

  1. Create recording rules for complex queries: Recording rules pre-compute expensive expressions, making dashboards faster.

  2. Use absent() to detect missing metrics:

absent(up{job="critical-service"})

Returns 1 if the metric doesn't exist (service is down).

  3. Use subqueries for moving averages:

avg_over_time(rate(http_requests_total[5m])[1h:5m])

This gives you a smoothed rate calculated every 5 minutes over a sliding 1-hour window.

Common API Integration Patterns Worth Knowing

Grafana Integration

Grafana already works with Prometheus out of the box, but you can extend this with custom API calls through Grafana's data source plugins or visualization panels:

// Example fetch in a Grafana panel
async function queryPrometheus(query) {
  const response = await fetch(`http://prometheus:9090/api/v1/query?query=${encodeURIComponent(query)}`);
  const data = await response.json();
  
  if (data.status !== 'success') {
    throw new Error(`Query failed: ${data.error || 'Unknown error'}`);
  }
  
  return data.data.result;
}

// Example usage in a Grafana panel
const metricData = await queryPrometheus('sum(rate(http_requests_total[5m]))');
// Custom visualization logic using D3.js or other libraries

You can also use Grafana's Prometheus data source with variables for dynamic dashboards:

sum by (service) (rate(http_requests_total{environment="$env", datacenter="$dc"}[5m]))

Where $env and $dc are Grafana template variables that users can change.

CI/CD Pipeline Integration

Want to verify your deployment didn't break things? Check it with an API call in your deployment pipeline:

#!/bin/bash
# progressive_deployment.sh

# Deploy the new version to a canary environment
kubectl apply -f canary-deployment.yaml

# Wait for the deployment to stabilize
sleep 60

# Check error rate for the canary version
ERROR_RATE=$(curl -s -H "Authorization: Bearer $PROM_TOKEN" \
  'http://prometheus:9090/api/v1/query?query=sum(rate(http_requests_total{version="canary",status_code=~"5.."}[5m]))/sum(rate(http_requests_total{version="canary"}[5m]))*100' \
  | jq '.data.result[0].value[1] // "0"' \
  | tr -d '"')

# Check latency for the canary version
P95_LATENCY=$(curl -s -H "Authorization: Bearer $PROM_TOKEN" \
  'http://prometheus:9090/api/v1/query?query=histogram_quantile(0.95,sum(rate(http_request_duration_seconds_bucket{version="canary"}[5m]))by(le))' \
  | jq '.data.result[0].value[1] // "0"' \
  | tr -d '"')

# Evaluate if the deployment meets SLOs
if (( $(echo "$ERROR_RATE > 1.0" | bc -l) )) || (( $(echo "$P95_LATENCY > 0.3" | bc -l) )); then
  echo "Canary deployment failed SLO checks!"
  echo "Error rate: $ERROR_RATE% (threshold: 1.0%)"
  echo "P95 latency: ${P95_LATENCY}s (threshold: 0.3s)"
  
  # Rollback the canary deployment
  kubectl delete -f canary-deployment.yaml
  exit 1
else
  echo "Canary deployment passed SLO checks!"
  echo "Error rate: $ERROR_RATE% (threshold: 1.0%)"
  echo "P95 latency: ${P95_LATENCY}s (threshold: 0.3s)"
  
  # Promote canary to production
  kubectl apply -f production-deployment.yaml
fi

This script promotes a canary deployment only if error rates and latency meet your SLOs.

Custom Alerting Logic

Sometimes you need alerts based on complex conditions that aren't easily expressed in standard alerting rules:

#!/usr/bin/env python3
# advanced_alerting.py

import requests
import time
import smtplib
from email.message import EmailMessage
import logging
import os
from datetime import datetime, timedelta

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger('prometheus_alerts')

# Configuration
PROMETHEUS_URL = os.environ.get('PROMETHEUS_URL', 'http://prometheus:9090')
CHECK_INTERVAL = int(os.environ.get('CHECK_INTERVAL', 60))  # seconds
ALERT_COOLDOWN = int(os.environ.get('ALERT_COOLDOWN', 3600))  # seconds
RECIPIENTS = os.environ.get('ALERT_RECIPIENTS', '').split(',')
SMTP_SERVER = os.environ.get('SMTP_SERVER', 'smtp.example.com')

# Alert state management
last_alerts = {}

def query_prometheus(query):
    """Execute a PromQL query against the Prometheus API."""
    try:
        response = requests.get(
            f"{PROMETHEUS_URL}/api/v1/query",
            params={'query': query},
            timeout=10
        )
        response.raise_for_status()
        result = response.json()
        
        if result['status'] != 'success':
            logger.error(f"Query failed: {result.get('error', 'Unknown error')}")
            return None
            
        return result['data']['result']
    except Exception as e:
        logger.exception(f"Error querying Prometheus: {e}")
        return None

def check_business_hours():
    """Only alert during business hours."""
    now = datetime.now()
    # Monday-Friday, 9 AM to 5 PM
    return now.weekday() < 5 and 9 <= now.hour < 17

def check_conditions():
    """Check for complex alert conditions."""
    conditions = [
        # High error rate with high traffic
        {
            'name': 'high_error_rate',
            'query': 'sum(rate(http_requests_total{status_code=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.05 and sum(rate(http_requests_total[5m])) > 10',
            'message': 'High error rate detected with significant traffic',
            'severity': 'critical',
            'runbook': 'https://wiki.example.com/runbooks/high-error-rate'
        },
        # Database connection saturation
        {
            'name': 'db_connection_saturation',
            'query': 'max(pg_stat_activity_count) / max(pg_settings_max_connections) > 0.8',
            'message': 'Database connection pool nearing saturation',
            'severity': 'warning',
            'runbook': 'https://wiki.example.com/runbooks/db-connection-pool'
        },
        # Correlated conditions: both API latency and DB latency high
        {
            'name': 'service_degradation',
            'query': 'histogram_quantile(0.95, sum(rate(api_request_duration_seconds_bucket[5m])) by (le)) > 1 and histogram_quantile(0.95, sum(rate(db_query_duration_seconds_bucket[5m])) by (le)) > 0.5',
            'message': 'API and database latency degraded together',
            'severity': 'warning',
            'runbook': 'https://wiki.example.com/runbooks/service-degradation'
        }
    ]

    for condition in conditions:
        result = query_prometheus(condition['query'])
        if not result:
            continue  # query failed or the condition is not firing

        # Respect the cooldown so the same alert isn't re-sent constantly
        last_sent = last_alerts.get(condition['name'])
        if last_sent and datetime.now() - last_sent < timedelta(seconds=ALERT_COOLDOWN):
            continue

        # Only send non-critical alerts during business hours
        if condition['severity'] != 'critical' and not check_business_hours():
            continue

        send_alert(condition)
        last_alerts[condition['name']] = datetime.now()

def send_alert(condition):
    """Email a notification for a firing condition."""
    msg = EmailMessage()
    msg['Subject'] = f"[{condition['severity'].upper()}] {condition['message']}"
    msg['From'] = 'alerts@example.com'
    msg['To'] = ', '.join(RECIPIENTS)
    msg.set_content(f"{condition['message']}\nRunbook: {condition['runbook']}")

    try:
        with smtplib.SMTP(SMTP_SERVER) as server:
            server.send_message(msg)
        logger.info(f"Alert sent: {condition['name']}")
    except Exception as e:
        logger.exception(f"Failed to send alert: {e}")

if __name__ == '__main__':
    while True:
        check_conditions()
        time.sleep(CHECK_INTERVAL)

This script layers logic that's awkward in standard alerting rules, like cooldowns, business-hours routing, and correlated conditions, on top of the query API.

Performance Tips for Heavy API Users

When you're making lots of API calls, keep these tips in mind:

  1. Use query_range wisely – Specify reasonable step values
  2. Cache common queries – Don't hammer the API with the same requests (see the sketch below)
  3. Be selective with labels – The more labels, the bigger the response
  4. Batch related queries – Reduce network overhead
  5. Consider federation – For multi-cluster setups
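
To illustrate the caching tip, here's a minimal time-based cache sketch; the 30-second TTL and the hard-coded server URL are assumptions to adjust for your setup:

import time
import requests

_cache = {}
CACHE_TTL = 30  # seconds; tune to how fresh your dashboards really need to be

def cached_query(query):
    """Return a cached result if it's still fresh, otherwise hit the API."""
    now = time.time()
    entry = _cache.get(query)
    if entry and now - entry[0] < CACHE_TTL:
        return entry[1]

    resp = requests.get(
        "http://localhost:9090/api/v1/query",
        params={"query": query},
        timeout=10,
    )
    result = resp.json()["data"]["result"]
    _cache[query] = (now, result)
    return result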

API Limitations and Workarounds

Let's be honest about some Prometheus API constraints:

Time Range Limits

The API can get sluggish with very large time ranges. Break these into smaller chunks:

# Instead of one big query, fetch the range in smaller chunks.
# Assumes get_prometheus_data() wraps /api/v1/query_range and that
# start_time/end_time are Unix timestamps in seconds.
def get_data_in_chunks(query, start_time, end_time, chunk_hours=6):
    all_data = []
    current = start_time
    
    while current < end_time:
        chunk_end = min(current + chunk_hours * 3600, end_time)
        # API call for just this chunk
        chunk_data = get_prometheus_data(query, current, chunk_end)
        all_data.extend(chunk_data)
        current = chunk_end
        
    return all_data

Rate Limiting

Some environments put limits on API calls. Implement backoff logic:

import time
import requests

def api_call_with_backoff(url, max_retries=5):
    for attempt in range(max_retries):
        response = requests.get(url)
        if response.status_code == 429:  # Too Many Requests
            sleep_time = 2 ** attempt
            time.sleep(sleep_time)
        else:
            return response
    raise Exception("Max retries exceeded")

Security Best Practices

Your Prometheus API is a window into your system's health – protect it:

  1. Never expose it directly to the internet – Use a proxy or API gateway
  2. Implement proper authentication – Basic Auth is the minimum
  3. Use TLS everywhere – Encrypt all API traffic (see the sketch after this list)
  4. Apply RBAC – Limit who can access what data
  5. Audit API access – Track who's viewing your metrics
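
As an example of the TLS point from the client side: if Prometheus sits behind mTLS, a call with Python's requests library looks roughly like this (the hostname and all certificate paths are placeholders):

import requests

resp = requests.get(
    "https://prometheus.example.com/api/v1/query",
    params={"query": "up"},
    cert=("/etc/ssl/client.crt", "/etc/ssl/client.key"),  # client certificate for mTLS
    verify="/etc/ssl/ca.crt",                             # CA bundle used to validate the server
    timeout=10,
)
print(resp.status_code)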

Tying It All Together

The Prometheus API transforms passive monitoring into active observability. By programmatically accessing your metrics, you can build automated responses to system conditions, create custom visualizations, and integrate monitoring into your workflow.

💡
Want to scale Prometheus efficiently? Check out this guide on Thanos and how it extends Prometheus' capabilities.

Last9 and Prometheus

Last9 integrates with Prometheus to enhance your monitoring experience. It connects directly to your Prometheus API, organizing metrics and turning complex data into clear, intuitive visualizations.

With Last9’s Prometheus integration, you can easily spot patterns across your infrastructure and applications—no need to wrestle with complex queries. Get the insights you need, when you need them.

Let’s talk about how we make observability simpler.

💡
If you have any questions about using the Prometheus API in your setup, or you're stuck on a particular integration, join our Discord community where engineers share tips, tricks, and practical solutions.