Monitoring is the backbone of any reliable DevOps setup. And if you’re working with monitoring, you’ve likely used Prometheus. This open-source powerhouse has redefined how we track system performance, but are you making the most of its API?
Prometheus is the go-to solution for monitoring container-based environments, particularly in Kubernetes. Its pull-based model and flexible query language provide deep visibility into your systems.
But its real strength lies in the HTTP API—a tool that enables programmatic monitoring, automation, and seamless integration into your workflows. If you're not using it yet, you might be leaving a lot on the table.
What Makes the Prometheus API Worth Your Time?
The Prometheus API isn't just another tool in your tech stack – it's the secret weapon that unlocks next-level monitoring capabilities. With it, you can:
- Pull metrics data programmatically from any service Prometheus scrapes
- Create custom dashboards that actually make sense for your specific use cases
- Automate your alerting workflows based on complex conditions
- Integrate with your existing tools like Slack, PagerDuty, or custom webhooks
- Build automation that reacts to metrics in real-time
- Extend Prometheus capabilities beyond what's available in the UI
- Implement custom reporting for stakeholders
The API gives you direct access to everything Prometheus collects, letting you work with that data however you want. That's power.
Unlike some monitoring solutions that lock you into their visualization tools, Prometheus follows the Unix philosophy – it does one thing (collecting and storing metrics) extremely well, then exposes everything through an API that lets you build exactly what you need on top.
Practical API Use Cases
Before jumping into the technical details, let's look at how teams use the Prometheus API:
- Auto-scaling systems – Triggering infrastructure scaling based on custom metrics
- Anomaly detection – Feeding metrics into ML systems to catch unusual patterns
- Business intelligence – Correlating technical metrics with business KPIs
- Capacity planning – Analyzing long-term trends to forecast resource needs
- Custom SLO dashboards – Building service level objective tracking specific to your reliability targets
Getting Started with the Prometheus API
Setting up your first connection is straightforward. The Prometheus API runs on HTTP, making it accessible from practically anywhere.
Base URL Structure
Your Prometheus server exposes its API at:
http://<your-prometheus-server>:<port>/api/v1/
For local testing, this might look like:
http://localhost:9090/api/v1/
The API follows RESTful principles with clearly defined endpoints. All responses come in a consistent JSON format with this general structure:
{
"status": "success",
"data": {
// The actual response data varies by endpoint
}
}
For error cases, you'll get:
{
"status": "error",
"errorType": "bad_data",
"error": "The specific error message"
}
This consistency makes parsing responses straightforward across all API interactions.
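In practice, that means a thin wrapper is all you need. Here's a minimal Python sketch (using the requests library, with a localhost URL you'd swap for your own server) that checks the status field once and can then be reused for every endpoint:
import requests

PROMETHEUS_URL = "http://localhost:9090"  # adjust to your server

def prom_get(endpoint, **params):
    """Call a Prometheus API endpoint and return the 'data' payload."""
    response = requests.get(f"{PROMETHEUS_URL}/api/v1/{endpoint}", params=params, timeout=10)
    body = response.json()
    if body.get("status") != "success":
        # error responses carry 'errorType' and 'error' fields
        raise RuntimeError(f"{body.get('errorType')}: {body.get('error')}")
    return body["data"]

# Example: current value of the 'up' metric for every target
print(prom_get("query", query="up"))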
Response Format Details
Let's break down what you'll get from different query types:
Range query responses have:
{
"resultType": "matrix",
"result": [
{
"metric": { "label1": "value1", ... },
"values": [
[ timestamp1, "string_value1" ],
[ timestamp2, "string_value2" ],
...
]
},
...
]
}
Instant query responses contain:
{
"resultType": "vector",
"result": [
{
"metric": { "label1": "value1", ... },
"value": [ timestamp, "string_value" ]
},
...
]
}
Understanding these structures is crucial for correctly parsing the data in your applications.
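Here's a rough sketch of what that parsing looks like in Python; note that sample values arrive as strings, so convert them before doing math:
def print_instant(result):
    # 'vector': one [timestamp, value] pair per series
    for series in result:
        ts, value = series["value"]
        print(series["metric"], "=", float(value), "at", ts)

def print_range(result):
    # 'matrix': a list of [timestamp, value] pairs per series
    for series in result:
        for ts, value in series["values"]:
            print(series["metric"], ts, float(value))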
Authentication Options
Prometheus keeps things simple with these authentication methods:
Method | Best For | Setup Complexity | Implementation Approach |
---|---|---|---|
No Auth | Testing, isolated networks | None | Default configuration |
Basic Auth | Standard protection | Low | Reverse proxy (Nginx, Apache) |
OAuth | Enterprise environments | Medium | OAuth2 Proxy sidecar |
TLS Client Certs | High-security needs | High | mTLS with cert management |
API Keys | Microservice architectures | Medium | Custom proxy layer |
Most teams start with Basic Auth and move to OAuth as they scale.
Prometheus ships with only minimal built-in security (recent versions can enable basic auth and TLS through a web configuration file), so most teams still deploy it behind a reverse proxy that handles authentication. Here's how to set up Basic Auth with Nginx:
server {
listen 443 ssl;
server_name prometheus.example.com;
ssl_certificate /etc/nginx/certs/prometheus.crt;
ssl_certificate_key /etc/nginx/certs/prometheus.key;
location / {
auth_basic "Prometheus";
auth_basic_user_file /etc/nginx/htpasswd/.htpasswd;
proxy_pass http://localhost:9090;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
}
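With that proxy in place, API clients simply send credentials with each request. A quick sketch using Python's requests (the hostname and credentials are placeholders):
import requests

response = requests.get(
    "https://prometheus.example.com/api/v1/query",
    params={"query": "up"},
    auth=("monitoring-user", "s3cret"),  # placeholder credentials from your htpasswd file
    timeout=10,
)
print(response.json()["data"]["result"])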
For OAuth2, many teams use the oauth2-proxy project as a sidecar:
# docker-compose example
services:
oauth2-proxy:
image: quay.io/oauth2-proxy/oauth2-proxy
command:
- --provider=github
- --email-domain=*
- --upstream=http://prometheus:9090
- --cookie-secret=your-secret
- --client-id=your-github-app-id
- --client-secret=your-github-app-secret
ports:
- "4180:4180"
This setup works well for teams already using GitHub or Google for authentication.
Key Prometheus Endpoints You'll Use
The Prometheus API has several endpoints, but these five will handle 90% of your needs:
1. Query Instant Data
GET /api/v1/query
This endpoint gives you a snapshot of metrics right now. Perfect for current status checks.
Parameters:
- query (required): The PromQL expression to evaluate
- time: Evaluation timestamp (RFC3339 or Unix timestamp), defaults to the current time
- timeout: Evaluation timeout (e.g., 30s, 1m), defaults to the global timeout
Example:
curl 'http://localhost:9090/api/v1/query?query=up'
Response:
{
"status": "success",
"data": {
"resultType": "vector",
"result": [
{
"metric": {
"__name__": "up",
"instance": "localhost:9090",
"job": "prometheus"
},
"value": [1675956970.123, "1"]
},
{
"metric": {
"__name__": "up",
"instance": "localhost:8080",
"job": "api-server"
},
"value": [1675956970.123, "0"]
}
]
}
}
2. Query Range Data
GET /api/v1/query_range
When you need metrics over time (like for graphs), this is your go-to.
Parameters:
- query (required): The PromQL expression to evaluate
- start (required): Start timestamp (RFC3339 or Unix timestamp)
- end (required): End timestamp
- step (required): Query resolution step width, as a duration or float seconds
- timeout: Evaluation timeout, defaults to the global timeout
Example:
curl 'http://localhost:9090/api/v1/query_range?query=rate(http_requests_total[5m])&start=2023-01-01T20:10:30.781Z&end=2023-01-01T20:11:00.781Z&step=15s'
Response:
{
"status": "success",
"data": {
"resultType": "matrix",
"result": [
{
"metric": {
"__name__": "http_requests_total",
"code": "200",
"handler": "query",
"instance": "localhost:9090",
"job": "prometheus"
},
"values": [
[1672602630.781, "3.4"],
[1672602645.781, "5.6"],
[1672602660.781, "4.2"]
]
}
]
}
}
The step parameter deserves special attention – it defines the resolution of your data. Too small, and you'll hit performance issues; too large, and you'll miss important details.
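One practical approach is to derive the step from the time range so every request returns roughly the same number of points. A small sketch (the 250-point target and localhost URL are just illustrative choices):
import math
import time
import requests

def query_range_auto_step(query, start, end, target_points=250):
    """Pick a step so the response has about target_points samples per series."""
    step = max(1, math.ceil((end - start) / target_points))  # seconds
    response = requests.get(
        "http://localhost:9090/api/v1/query_range",
        params={"query": query, "start": start, "end": end, "step": step},
        timeout=30,
    )
    return response.json()["data"]["result"]

now = time.time()
series = query_range_auto_step("rate(http_requests_total[5m])", now - 6 * 3600, now)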
3. Series Metadata
GET /api/v1/series
This lets you discover what metrics are available and their labels.
Parameters:
- match[] (required): Repeated series selector parameters
- start: Start timestamp
- end: End timestamp
Example:
curl 'http://localhost:9090/api/v1/series?match[]=up&match[]=process_cpu_seconds_total'
Response:
{
"status": "success",
"data": [
{
"__name__": "up",
"instance": "localhost:9090",
"job": "prometheus"
},
{
"__name__": "process_cpu_seconds_total",
"instance": "localhost:9090",
"job": "prometheus"
}
]
}
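Because match[] can repeat, pass it as a list; Python's requests library will encode the repeated parameters for you. A quick sketch against a local server:
import requests

response = requests.get(
    "http://localhost:9090/api/v1/series",
    params={"match[]": ["up", "process_cpu_seconds_total"]},  # repeated match[] params
    timeout=10,
)
for series in response.json()["data"]:
    print(series["__name__"], series.get("job"), series.get("instance"))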
4. Label Values
GET /api/v1/label/<label_name>/values
Need to know all possible values for a label? This endpoint has you covered.
Parameters:
- start: Start timestamp
- end: End timestamp
- match[]: Series selector to filter by
Example:
curl 'http://localhost:9090/api/v1/label/job/values'
Response:
{
"status": "success",
"data": [
"prometheus",
"node-exporter",
"api-gateway",
"database"
]
}
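This is handy for building dashboard dropdowns or iterating over scrape jobs. A hedged sketch that lists every job and counts how many of its targets are currently up:
import requests

BASE = "http://localhost:9090/api/v1"  # adjust to your server

jobs = requests.get(f"{BASE}/label/job/values", timeout=10).json()["data"]
for job in jobs:
    result = requests.get(
        f"{BASE}/query", params={"query": f'count(up{{job="{job}"}} == 1)'}, timeout=10
    ).json()["data"]["result"]
    healthy = result[0]["value"][1] if result else "0"
    print(f"{job}: {healthy} healthy targets")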
5. Targets
GET /api/v1/targets
This shows all targets Prometheus is scraping, with their health status.
Parameters:
- state: Filter by target state (active, dropped, or any)
Example:
curl 'http://localhost:9090/api/v1/targets?state=active'
Response:
{
"status": "success",
"data": {
"activeTargets": [
{
"discoveredLabels": {
"__address__": "localhost:9090",
"__metrics_path__": "/metrics",
"__scheme__": "http",
"job": "prometheus"
},
"labels": {
"instance": "localhost:9090",
"job": "prometheus"
},
"scrapePool": "prometheus",
"scrapeUrl": "http://localhost:9090/metrics",
"lastError": "",
"lastScrape": "2023-02-09T12:30:00.123456789Z",
"lastScrapeDuration": 0.012345,
"health": "up"
}
]
}
}
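A common use is a quick health sweep that flags any target that isn't up. A short sketch, relying on the field names shown in the response above:
import requests

response = requests.get(
    "http://localhost:9090/api/v1/targets", params={"state": "active"}, timeout=10
)
for target in response.json()["data"]["activeTargets"]:
    if target["health"] != "up":
        print(f'{target["scrapeUrl"]} is {target["health"]}: {target["lastError"]}')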
Additional Useful Endpoints
While the five endpoints above cover most use cases, these can be handy too:
6. Alerts
GET /api/v1/alerts
Lists all active alerts.
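If you want to script against it, the response carries an alerts list under data, each entry with its labels, annotations, and state. A small sketch that prints whatever is currently firing:
import requests

alerts = requests.get("http://localhost:9090/api/v1/alerts", timeout=10).json()["data"]["alerts"]
for alert in alerts:
    if alert["state"] == "firing":
        print(alert["labels"].get("alertname"), alert["annotations"].get("summary", ""))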
7. Rules
GET /api/v1/rules
Lists all recording and alerting rules.
8. Status Config
GET /api/v1/status/config
Dumps the current Prometheus configuration.
9. Metadata
GET /api/v1/metadata
Returns metadata about metrics (helpful for understanding units and semantics).
Working with PromQL Through the API
The real magic happens when you combine the API with PromQL queries.
Here's a comprehensive chart of essential query patterns that every DevOps engineer should know:
Query Type | Example | Use Case | Notes |
---|---|---|---|
Simple | http_requests_total | Basic metric retrieval | Returns all time series with this name |
Counter Rate | rate(http_requests_total[5m]) | Traffic patterns | Per-second rate calculated over 5m |
Counter Increase | increase(http_requests_total[1h]) | Hourly totals | Total increase over the time period |
Gauge Current | node_memory_MemFree_bytes | Current state | Point-in-time value |
Gauge Aggregation | avg_over_time(node_memory_MemFree_bytes[1h]) | Stable representation | Smooths fluctuations |
Sum | sum(node_cpu_seconds_total) | Resource utilization | Total across all instances |
By | sum by (instance) (up) | Grouped metrics | Aggregation with dimensions |
Without | sum without (job) (up) | Remove dimensions | Simplify output |
Offset | rate(http_requests_total[5m] offset 1h) | Comparison with past | Historical data points |
Delta | delta(cpu_temp_celsius[2h]) | Change detection | For gauges (vs rate for counters) |
Topk | topk(3, cpu_usage) | Hotspot identification | Find highest values |
Bottomk | bottomk(3, up) | Problem detection | Find lowest values |
Quantile | histogram_quantile(0.95, http_request_duration_seconds_bucket) | SLO tracking | Calculate percentiles |
Prediction | predict_linear(node_filesystem_free_bytes[6h], 24 * 3600) | Capacity planning | Predict future values |
Resets | resets(counter[5m]) | Service restarts | Detect counter resets |
Time Functions | http_requests_total offset 1d | Day-over-day comparison | Compare to same time yesterday |
Label Matching | http_requests_total{status=~"5..", method!="POST"} | Filtering | Multiple conditions with regex |
Binary Operators | node_memory_MemTotal_bytes - node_memory_MemFree_bytes | Derived metrics | Arithmetic between metrics |
Boolean | node_filesystem_free_bytes / node_filesystem_size_bytes < 0.10 | Threshold checks | Filters by default; add the bool modifier to return 0 or 1 |
Practical PromQL Examples
Let me break down some practical examples you'll use:
1. Error Rate Calculation
sum(rate(http_requests_total{status_code=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))
This query calculates your error rate – the percentage of requests returning 5xx errors. Super useful for SLOs.
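Running that query through the API turns it into something you can act on, for example failing a health check when the error rate crosses a threshold. A sketch, where the 1% threshold and the status_code label are assumptions carried over from the query above:
import requests

QUERY = (
    'sum(rate(http_requests_total{status_code=~"5.."}[5m]))'
    ' / sum(rate(http_requests_total[5m]))'
)

result = requests.get(
    "http://localhost:9090/api/v1/query", params={"query": QUERY}, timeout=10
).json()["data"]["result"]

error_ratio = float(result[0]["value"][1]) if result else 0.0
print(f"Error rate: {error_ratio:.2%}")
if error_ratio > 0.01:  # assumed 1% SLO threshold
    raise SystemExit("Error budget burn too high")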
2. Container Memory Usage by Pod
sum by (pod) (container_memory_working_set_bytes{namespace="production"})
Shows memory consumption grouped by pod name in your production namespace.
3. CPU Throttling Detection
sum by (pod) (rate(container_cpu_cfs_throttled_seconds_total[5m])) / sum by (pod) (rate(container_cpu_cfs_periods_total[5m])) > 0.1
Identifies pods experiencing more than 10% CPU throttling, indicating they need more resources.
4. Disk Space Prediction
predict_linear(node_filesystem_free_bytes{mountpoint="/"}[6h], 24 * 3600 * 7)
Predicts free disk space in 7 days based on the trend over the last 6 hours.
5. Apdex Score (Application Performance)
(sum(rate(http_request_duration_seconds_bucket{le="0.3"}[5m])) + sum(rate(http_request_duration_seconds_bucket{le="1.2"}[5m]))) / 2 / sum(rate(http_request_duration_seconds_count[5m]))
Calculates an Apdex score where requests under 0.3s are "satisfied" and under 1.2s are "tolerating". Because histogram buckets are cumulative, the 1.2s bucket already includes the satisfied requests, which is why the combined sum is divided by two.
5 Common PromQL Mistakes
When crafting these queries, watch out for these common pitfalls:
- Missing rate() for counters - Counters only ever increase; you almost always want rate() or increase(), not the raw value
- Incorrect time windows - Windows that are too small produce noisy data; windows that are too large smooth over important spikes
- Missing label context - Aggregating without considering cardinality explosion
- Forgetting by() in division - Division between vectors needs matching label sets on both sides
- Unescaped regex characters - Remember to escape special characters in label matchers
Advanced PromQL Tips You Need to Know
For more complex monitoring needs:
- Create recording rules for complex queries: Recording rules pre-compute expensive expressions, making dashboards faster.
- Use absent() to detect missing metrics:
absent(up{job="critical-service"})
Returns 1 if no matching series exists at all (for example, the job has disappeared from scraping entirely).
- Use subqueries for moving averages:
avg_over_time(rate(http_requests_total[5m])[1h:5m])
This gives you a smoothed rate calculated every 5 minutes over a sliding 1-hour window.
Common API Integration Patterns Worth Knowing
Grafana Integration
Grafana already works with Prometheus out of the box, but you can extend this with custom API calls through Grafana's data source plugins or visualization panels:
// Example fetch in a Grafana panel
async function queryPrometheus(query) {
const response = await fetch(`http://prometheus:9090/api/v1/query?query=${encodeURIComponent(query)}`);
const data = await response.json();
if (data.status !== 'success') {
throw new Error(`Query failed: ${data.error || 'Unknown error'}`);
}
return data.data.result;
}
// Example usage in a Grafana panel
const metricData = await queryPrometheus('sum(rate(http_requests_total[5m]))');
// Custom visualization logic using D3.js or other libraries
You can also use Grafana's Prometheus data source with variables for dynamic dashboards:
sum by (service) (rate(http_requests_total{environment="$env", datacenter="$dc"}[5m]))
Where $env and $dc are Grafana template variables that users can change.
CI/CD Pipeline Integration
Want to verify your deployment didn't break things? Check it with an API call in your deployment pipeline:
#!/bin/bash
# progressive_deployment.sh
# Deploy the new version to a canary environment
kubectl apply -f canary-deployment.yaml
# Wait for the deployment to stabilize
sleep 60
# Check error rate for the canary version
ERROR_RATE=$(curl -s -H "Authorization: Bearer $PROM_TOKEN" \
'http://prometheus:9090/api/v1/query?query=sum(rate(http_requests_total{version="canary",status_code=~"5.."}[5m]))/sum(rate(http_requests_total{version="canary"}[5m]))*100' \
| jq '.data.result[0].value[1] // "0"' \
| tr -d '"')
# Check latency for the canary version
P95_LATENCY=$(curl -s -H "Authorization: Bearer $PROM_TOKEN" \
'http://prometheus:9090/api/v1/query?query=histogram_quantile(0.95,sum(rate(http_request_duration_seconds_bucket{version="canary"}[5m]))by(le))' \
| jq '.data.result[0].value[1] // "0"' \
| tr -d '"')
# Evaluate if the deployment meets SLOs
if (( $(echo "$ERROR_RATE > 1.0" | bc -l) )) || (( $(echo "$P95_LATENCY > 0.3" | bc -l) )); then
echo "Canary deployment failed SLO checks!"
echo "Error rate: $ERROR_RATE% (threshold: 1.0%)"
echo "P95 latency: ${P95_LATENCY}s (threshold: 0.3s)"
# Rollback the canary deployment
kubectl delete -f canary-deployment.yaml
exit 1
else
echo "Canary deployment passed SLO checks!"
echo "Error rate: $ERROR_RATE% (threshold: 1.0%)"
echo "P95 latency: ${P95_LATENCY}s (threshold: 0.3s)"
# Promote canary to production
kubectl apply -f production-deployment.yaml
fi
This script promotes a canary deployment only if error rates and latency meet your SLOs.
Custom Alerting Logic
Sometimes you need alerts based on complex conditions that aren't easily expressed in standard alerting rules:
#!/usr/bin/env python3
# advanced_alerting.py
import requests
import time
import smtplib
from email.message import EmailMessage
import logging
import os
from datetime import datetime, timedelta
# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger('prometheus_alerts')
# Configuration
PROMETHEUS_URL = os.environ.get('PROMETHEUS_URL', 'http://prometheus:9090')
CHECK_INTERVAL = int(os.environ.get('CHECK_INTERVAL', 60)) # seconds
ALERT_COOLDOWN = int(os.environ.get('ALERT_COOLDOWN', 3600)) # seconds
RECIPIENTS = os.environ.get('ALERT_RECIPIENTS', '').split(',')
SMTP_SERVER = os.environ.get('SMTP_SERVER', 'smtp.example.com')
# Alert state management
last_alerts = {}
def query_prometheus(query):
"""Execute a PromQL query against the Prometheus API."""
try:
response = requests.get(
f"{PROMETHEUS_URL}/api/v1/query",
params={'query': query},
timeout=10
)
response.raise_for_status()
result = response.json()
if result['status'] != 'success':
logger.error(f"Query failed: {result.get('error', 'Unknown error')}")
return None
return result['data']['result']
except Exception as e:
logger.exception(f"Error querying Prometheus: {e}")
return None
def check_business_hours():
"""Only alert during business hours."""
now = datetime.now()
# Monday-Friday, 9 AM to 5 PM
return now.weekday() < 5 and 9 <= now.hour < 17
def check_conditions():
"""Check for complex alert conditions."""
conditions = [
# High error rate with high traffic
{
'name': 'high_error_rate',
'query': 'sum(rate(http_requests_total{status_code=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.05 and sum(rate(http_requests_total[5m])) > 10',
'message': 'High error rate detected with significant traffic',
'severity': 'critical',
'runbook': 'https://wiki.example.com/runbooks/high-error-rate'
},
# Database connection saturation
{
'name': 'db_connection_saturation',
'query': 'max(pg_stat_activity_count) / max(pg_settings_max_connections) > 0.8',
'message': 'Database connection pool nearing saturation',
'severity': 'warning',
'runbook': 'https://wiki.example.com/runbooks/db-connection-pool'
},
# Correlated conditions: both API latency and DB latency high
{
'name': 'service_degradation',
            'query': 'histogram_quantile(0.95, sum(rate(api_request_duration_seconds_bucket[5m])) by (le)) > 1 and histogram_quantile(0.95, sum(rate(db_query_duration_seconds_bucket[5m])) by (le)) > 0.5',  # 0.5s DB threshold is illustrative
            'message': 'API and database latency are degraded together',
            'severity': 'critical',
            'runbook': 'https://wiki.example.com/runbooks/service-degradation'
        },
    ]
    # Evaluate each condition; a non-empty result vector means it is currently true.
    triggered = []
    for condition in conditions:
        result = query_prometheus(condition['query'])
        if result:
            triggered.append(condition)
    return triggered

def send_alert(condition):
    """Email a triggered condition to the configured recipients (SMTP details are placeholders)."""
    msg = EmailMessage()
    msg['Subject'] = f"[{condition['severity'].upper()}] {condition['name']}"
    msg['From'] = 'prometheus-alerts@example.com'
    msg['To'] = ', '.join(r for r in RECIPIENTS if r)
    msg.set_content(f"{condition['message']}\nRunbook: {condition['runbook']}")
    with smtplib.SMTP(SMTP_SERVER) as smtp:
        smtp.send_message(msg)

def main():
    while True:
        for condition in check_conditions():
            last_sent = last_alerts.get(condition['name'], 0)
            in_cooldown = time.time() - last_sent < ALERT_COOLDOWN
            if check_business_hours() and not in_cooldown:
                logger.info(f"Alerting on {condition['name']}: {condition['message']}")
                send_alert(condition)
                last_alerts[condition['name']] = time.time()
        time.sleep(CHECK_INTERVAL)

if __name__ == '__main__':
    main()
Performance Tips for Heavy API Users
When you're making lots of API calls, keep these tips in mind:
1. Use query_range wisely – Specify reasonable step values
2. Cache common queries – Don't hammer the API with the same requests (see the caching sketch after this list)
3. Be selective with labels – The more labels, the bigger the response
4. Batch related queries – Reduce network overhead
5. Consider federation – For multi-cluster setups
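Here's what the caching tip can look like in practice: a minimal in-process TTL cache. The 30-second TTL is an arbitrary example; swap in Redis or similar if several processes share the same queries:
import time
import requests

_cache = {}  # query -> (expires_at, result)

def cached_query(query, ttl=30):
    """Return a cached result if it's fresher than ttl seconds, else refetch."""
    expires_at, result = _cache.get(query, (0, None))
    if time.time() < expires_at:
        return result
    response = requests.get(
        "http://localhost:9090/api/v1/query", params={"query": query}, timeout=10
    )
    result = response.json()["data"]["result"]
    _cache[query] = (time.time() + ttl, result)
    return result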
API Limitations and Workarounds
Let's be honest about some Prometheus API constraints:
Time Range Limits
The API can get sluggish with very large time ranges. Break these into smaller chunks:
# Instead of one big query, fetch the range in smaller chunks.
# get_prometheus_data() is assumed to wrap a /api/v1/query_range call
# (as in the earlier examples) and return the "result" list for that window.
def get_data_in_chunks(query, start_time, end_time, chunk_hours=6):
all_data = []
current = start_time
while current < end_time:
chunk_end = min(current + chunk_hours * 3600, end_time)
# API call for just this chunk
chunk_data = get_prometheus_data(query, current, chunk_end)
all_data.extend(chunk_data)
current = chunk_end
return all_data
Rate Limiting
Some environments put limits on API calls. Implement backoff logic:
def api_call_with_backoff(url, max_retries=5):
for attempt in range(max_retries):
response = requests.get(url)
if response.status_code == 429: # Too Many Requests
sleep_time = 2 ** attempt
time.sleep(sleep_time)
else:
return response
raise Exception("Max retries exceeded")
Security Best Practices
Your Prometheus API is a window into your system's health – protect it:
- Never expose it directly to the internet – Use a proxy or API gateway
- Implement proper authentication – Basic Auth is the minimum
- Use TLS everywhere – Encrypt all API traffic
- Apply RBAC – Limit who can access what data
- Audit API access – Track who's viewing your metrics
Tying It All Together
The Prometheus API transforms passive monitoring into active observability. By programmatically accessing your metrics, you can build automated responses to system conditions, create custom visualizations, and integrate monitoring into your workflow.
Last9 and Prometheus
Last9 integrates with Prometheus to enhance your monitoring experience. It connects directly to your Prometheus API, organizing metrics and turning complex data into clear, intuitive visualizations.
With Last9’s Prometheus integration, you can easily spot patterns across your infrastructure and applications—no need to wrestle with complex queries. Get the insights you need, when you need them.
Let’s talk about how we make observability simpler.