Cerebrium

Export OpenTelemetry metrics from Cerebrium serverless GPU apps to Last9 for resource, GPU, and execution observability

Send resource, GPU, and execution metrics from your Cerebrium serverless GPU applications to Last9. Cerebrium pushes metrics natively over OTLP/HTTP every 60 seconds — no SDK code, no sidecar, just dashboard configuration.

What is Cerebrium?

Cerebrium is a serverless GPU infrastructure platform for real-time AI workloads. It autoscales containers in 1–3 seconds and bills per second of compute. Cerebrium has built-in OpenTelemetry metric export to any OTLP/HTTP backend.

Prerequisites

  1. Last9 Account — Sign up at app.last9.io
  2. Cerebrium Account — With at least one deployed application

Integration Setup

  1. Get Your Last9 OTLP Endpoint and Auth Header

    Navigate to Integrations → OpenTelemetry in your Last9 dashboard. Copy the OTLP Endpoint (the base URL, for example https://otlp-aps1.last9.io) and the Auth Header value (a string starting with Basic).

  2. Open Cerebrium’s Metrics Export Settings

    In the Cerebrium dashboard, go to Integrations → Metrics Export, then select Custom OTLP.

  3. Configure the Endpoint and Auth Header

    Enter the following values:

    Field              Value
    OTLP Endpoint      https://otlp-aps1.last9.io (base URL only; do not append /v1/metrics, see Troubleshooting)
    Auth Header Name   Authorization
    Auth Header Value  Basic <token> (paste the full Auth Header from Last9, including the Basic prefix followed by a literal space)

  4. Test the Connection

    Click Test Connection in the Cerebrium dashboard. Once it passes, metrics begin flowing within ~60 seconds.
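
Most failed connection tests come down to a suffix on the endpoint or a malformed header value. The following is a minimal Python sketch for checking both values before you paste them into the dashboard. The function names normalize_otlp_base_url and check_auth_header_value are illustrative and not part of any Last9 or Cerebrium tooling; the rules they encode are the ones described on this page.

```python
def normalize_otlp_base_url(url: str) -> str:
    """Strip whitespace, a trailing slash, and a /v1/metrics suffix from the endpoint."""
    cleaned = url.strip().rstrip("/")
    suffix = "/v1/metrics"
    if cleaned.endswith(suffix):
        # Cerebrium appends /v1/metrics itself, so the dashboard needs the base URL only.
        cleaned = cleaned[: -len(suffix)].rstrip("/")
    return cleaned


def check_auth_header_value(value: str) -> None:
    """Raise ValueError unless the value has the form 'Basic <token>'."""
    scheme, separator, token = value.partition(" ")
    if scheme != "Basic" or not separator:
        raise ValueError("value must start with 'Basic' followed by a literal space")
    if not token or " " in token:
        raise ValueError("exactly one non-empty token must follow 'Basic '")


if __name__ == "__main__":
    print(normalize_otlp_base_url("https://otlp-aps1.last9.io/v1/metrics/"))
    check_auth_header_value("Basic <token>")  # passes: prefix, one space, one token
```

Values such as Basic%20<token> or Basic=<token> are rejected because neither contains the literal space after Basic.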

What Gets Exported

Cerebrium emits the following metrics every 60 seconds, labelled with project_id, app_id, app_name, and region.

Resource Metrics

Metric                                     Type   Description
cerebrium_cpu_utilization_cores            Gauge  CPU cores actively in use per app
cerebrium_memory_usage_bytes               Gauge  Memory actively in use per app
cerebrium_gpu_memory_usage_bytes           Gauge  GPU VRAM in use per app
cerebrium_gpu_compute_utilization_percent  Gauge  GPU compute utilization (0–100) per app
cerebrium_containers_running_count         Gauge  Number of running containers per app
cerebrium_containers_ready_count           Gauge  Number of ready containers per app

Execution Metrics

Metric                           Type       Description
cerebrium_run_execution_time_ms  Histogram  Time spent executing user code
cerebrium_run_queue_time_ms      Histogram  Time spent waiting in queue
cerebrium_run_coldstart_time_ms  Histogram  Time for container cold start
cerebrium_run_response_time_ms   Histogram  Total end-to-end response time
cerebrium_run_total              Counter    Total run count
cerebrium_run_successes_total    Counter    Successful run count
cerebrium_run_errors_total       Counter    Failed run count

Viewing Metrics in Last9

Open Metrics Explorer and filter by app_name or project_id. Useful starter queries:

# GPU utilization per app
avg by (app_name) (cerebrium_gpu_compute_utilization_percent)
# p95 execution latency
histogram_quantile(0.95, sum by (le, app_name) (rate(cerebrium_run_execution_time_ms_milliseconds_bucket[5m])))
# Error rate per app
sum by (app_name) (rate(cerebrium_run_errors_total[5m]))
/ sum by (app_name) (rate(cerebrium_run_total[5m]))
# Running container count
sum by (app_name) (cerebrium_containers_running_count)
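
To make the p95 query easier to reason about, here is a simplified Python sketch of the estimate that histogram_quantile produces for a classic bucketed histogram: it finds the bucket containing the target rank and interpolates linearly inside it. This is an approximation of the PromQL behavior for illustration, not Last9's implementation, and the bucket bounds and counts are made up.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile (0 < q <= 1) from cumulative histogram buckets.

    buckets is a list of (upper_bound, cumulative_count) pairs and must
    include the (math.inf, total_count) bucket plus at least one finite bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the target rank.
    for index, (upper, cumulative) in enumerate(buckets):
        if cumulative >= rank:
            break
    if math.isinf(upper):
        # The rank falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    # The first bucket is assumed to start at 0; others start at the previous bound.
    lower, below = (0.0, 0.0) if index == 0 else buckets[index - 1]
    return lower + (upper - lower) * (rank - below) / (cumulative - below)


# Hypothetical execution-time buckets in milliseconds: (le, cumulative count).
example = [(100, 50), (250, 90), (500, 98), (math.inf, 100)]
print(histogram_quantile(0.95, example))  # about 406.25
```

In the real query the bucket values are per-second rates rather than raw counts, but the interpolation works the same way because only the ratios between buckets matter. The result is an estimate whose precision depends on how the bucket boundaries are laid out.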

Sending Traces

Cerebrium’s platform-side OTLP push covers metrics only. To get application traces (per-request spans, downstream HTTP calls, LLM provider latency) into Last9, run opentelemetry-instrument in your Cerebrium entrypoint — no code changes required in your application.

  1. Add the OTel packages to [cerebrium.dependencies.pip] in your cerebrium.toml.

    [cerebrium.dependencies.pip]
    opentelemetry-distro = ">=0.51b0"
    opentelemetry-exporter-otlp = ">=1.30.0"
    opentelemetry-semantic-conventions = ">=0.51b0"

    Declare all three together and keep them on the same release line (0.51b0 pairs with 1.30.0). If the versions drift apart, a newer exporter imports semconv._incubating against an older semantic-conventions wheel and startup fails with ModuleNotFoundError.

  2. Install instrumentors during build via shell_commands.

    [cerebrium.deployment]
    shell_commands = ["opentelemetry-bootstrap --action=install"]

    This detects every supported library in your pip deps (FastAPI, requests, httpx, OpenAI, Anthropic, SQLAlchemy, etc.) and installs the matching opentelemetry-instrumentation-* packages.

  3. Wrap your entrypoint with opentelemetry-instrument.

    [cerebrium.runtime.custom]
    port = 8000
    entrypoint = [
    "opentelemetry-instrument",
    "uvicorn", "main:app",
    "--host", "0.0.0.0", "--port", "8000",
    ]

  4. Set OTel env vars as Cerebrium secrets.

    cerebrium secrets add OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-aps1.last9.io
    cerebrium secrets add OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
    cerebrium secrets add OTEL_SERVICE_NAME=my-cerebrium-app
    cerebrium secrets add 'OTEL_EXPORTER_OTLP_HEADERS=Authorization=Basic <your-token>'

    Note the inner = in OTEL_EXPORTER_OTLP_HEADERS — that is the OTel env-var convention for <header-name>=<header-value>, not a typo.

  5. Deploy. Traces appear in Traces Explorer within seconds, filtered by service.name.
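
The header format in step 4 is easier to see when parsed. The sketch below is a simplified, standard-library-only illustration of how a comma-separated list of name=value pairs splits on the first = of each pair; it is not the OpenTelemetry SDK's own parser, it ignores percent-encoding, and parse_otlp_headers is a hypothetical name.

```python
def parse_otlp_headers(raw: str) -> dict[str, str]:
    """Split 'name=value,name2=value2' into a dict, splitting each pair on its first '='."""
    headers: dict[str, str] = {}
    for pair in raw.split(","):
        if not pair.strip():
            continue  # tolerate empty segments such as a trailing comma
        name, separator, value = pair.partition("=")
        if not separator or not name.strip():
            raise ValueError(f"expected <header-name>=<header-value>, got {pair!r}")
        headers[name.strip()] = value.strip()
    return headers


if __name__ == "__main__":
    # 'abc123==' stands in for a real token; any '=' after the first is kept in the value.
    print(parse_otlp_headers("Authorization=Basic abc123=="))
```

Because only the first = separates the name from the value, a token that ends in = padding passes through unchanged, and a value that contains only the token (with no Authorization= prefix) is rejected.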

A runnable, verified example is available at last9/opentelemetry-examples — python/cerebrium/.


Troubleshooting

  • 401 / 403 Unauthorized — The Auth Header value is malformed. Use a literal space between Basic and the token (not %20, not =). Confirm that Name is Authorization and that Value starts with Basic followed by a single space.
  • 404 Endpoint not found — The endpoint URL has /v1/metrics appended. Remove it; Cerebrium appends it automatically.
  • Unknown export error — Often produced when the endpoint resolves but the path does not match Last9’s /v1/metrics route. Confirm you are using the base URL from your Last9 OpenTelemetry integration page and that no extra suffix or trailing slash is present.
  • No metrics arriving after Test Connection passes — Metrics push every 60 seconds. Wait ~2 minutes, then filter by app_name in Metrics Explorer.
  • ModuleNotFoundError: No module named 'opentelemetry.semconv._incubating' at container startup — Version skew between opentelemetry-exporter-otlp-proto-common and opentelemetry-semantic-conventions. Declare all three OTel packages together as shown in step 1 of Sending Traces.
  • Failed to auto initialize opentelemetry in container logs — Auto-instrumentation crashed at sitecustomize. Check Cerebrium build logs for the opentelemetry-bootstrap output and confirm instrumentors installed cleanly.
  • App starts and serves traffic, but no traces in Last9 — Either the OTel env-var secrets are unset on this app, or the auth header is malformed. Verify with cerebrium secrets list that all four OTEL_* secrets are present, and that OTEL_EXPORTER_OTLP_HEADERS uses the Authorization=Basic <token> form (not just the token).
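
For the trace-related items above, a short diagnostic can narrow down the cause. The following is a sketch that assumes the Cerebrium secrets are visible to your app as environment variables; the function names are illustrative, and the checks mirror the four OTEL_* variables and the header form described on this page. Run it in the same environment as your app.

```python
import importlib.util
import os

REQUIRED_VARS = (
    "OTEL_EXPORTER_OTLP_ENDPOINT",
    "OTEL_EXPORTER_OTLP_PROTOCOL",
    "OTEL_SERVICE_NAME",
    "OTEL_EXPORTER_OTLP_HEADERS",
)


def find_otel_problems(env=None):
    """Return a list of human-readable problems with the OTEL_* settings."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    headers = env.get("OTEL_EXPORTER_OTLP_HEADERS", "")
    if headers and not headers.startswith("Authorization=Basic "):
        problems.append("OTEL_EXPORTER_OTLP_HEADERS should look like 'Authorization=Basic <token>'")
    return problems


def semconv_incubating_available():
    """Report whether opentelemetry.semconv._incubating can be located."""
    try:
        return importlib.util.find_spec("opentelemetry.semconv._incubating") is not None
    except ImportError:
        # Raised when a parent package (opentelemetry or opentelemetry.semconv) is missing.
        return False


if __name__ == "__main__":
    for problem in find_otel_problems():
        print("problem:", problem)
    if not semconv_incubating_available():
        print("problem: opentelemetry.semconv._incubating not found; check package versions")
```

An empty result from find_otel_problems does not prove the token is valid; it only rules out the missing-variable and wrong-format cases.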

Please get in touch with us on Discord or Email if you have any questions.