Cerebrium
Export OpenTelemetry metrics from Cerebrium serverless GPU apps to Last9 for resource, GPU, and execution observability
Send resource, GPU, and execution metrics from your Cerebrium serverless GPU applications to Last9. Cerebrium pushes metrics natively over OTLP/HTTP every 60 seconds — no SDK code, no sidecar, just dashboard configuration.
What is Cerebrium?
Cerebrium is a serverless GPU infrastructure platform for real-time AI workloads. It autoscales containers in 1–3 seconds and bills per second of compute. Cerebrium has built-in OpenTelemetry metric export to any OTLP/HTTP backend.
Prerequisites
- Last9 Account — Sign up at app.last9.io
- Cerebrium Account — With at least one deployed application
Integration Setup
-
Get Your Last9 OTLP Endpoint and Auth Header
Navigate to Integrations → OpenTelemetry in your Last9 dashboard. Copy the OTLP Endpoint (the base URL, for example https://otlp-aps1.last9.io) and the Auth Header value (a string starting with Basic).
-
Open Cerebrium’s Metrics Export Settings
In the Cerebrium dashboard, go to Integrations → Metrics Export, then select Custom OTLP.
-
Configure the Endpoint and Auth Header
Enter the following values:
| Field | Value |
|---|---|
| OTLP Endpoint | https://otlp-aps1.last9.io (base URL only, without /v1/metrics; see Troubleshooting) |
| Auth Header Name | Authorization |
| Auth Header Value | Basic <token> (paste the full Auth Header from Last9, including the Basic prefix followed by a literal space) |
-
Test the Connection
Click Test Connection in the Cerebrium dashboard. Once it passes, metrics begin flowing within ~60 seconds.
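
Most connection failures listed under Troubleshooting trace back to a malformed endpoint or header value, so it can help to sanity-check both before pasting them into the dashboard. The sketch below is a hypothetical helper (`check_otlp_settings` is not part of Cerebrium or Last9); it only encodes the rules stated on this page and makes no network calls.

```python
from urllib.parse import urlparse


def check_otlp_settings(endpoint: str, auth_value: str) -> list:
    """Return a list of problems; an empty list means the values look plausible.

    Hypothetical helper: it mirrors the rules described on this page and
    does not contact Cerebrium or Last9.
    """
    problems = []

    # The endpoint must be the base URL: no /v1/metrics, no trailing slash.
    path = urlparse(endpoint).path
    if path.rstrip("/").endswith("/v1/metrics"):
        problems.append("remove /v1/metrics; Cerebrium appends it automatically")
    elif path:
        problems.append("endpoint has an extra suffix or trailing slash")

    # The header value must be "Basic", one literal space, then the token.
    scheme, _, token = auth_value.partition(" ")
    if scheme != "Basic" or not token or token != token.strip():
        problems.append("auth value must be 'Basic <token>' with a literal space")

    return problems


if __name__ == "__main__":
    # "abc123" is a placeholder token, not a real credential.
    print(check_otlp_settings("https://otlp-aps1.last9.io", "Basic abc123"))
    print(check_otlp_settings("https://otlp-aps1.last9.io/v1/metrics", "Basic%20abc123"))
```

The first call returns an empty list; the second reports both the appended path and the encoded space.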
What Gets Exported
Cerebrium emits the following metrics every 60 seconds, labelled with project_id, app_id, app_name, and region.
Resource Metrics
| Metric | Type | Description |
|---|---|---|
| cerebrium_cpu_utilization_cores | Gauge | CPU cores actively in use per app |
| cerebrium_memory_usage_bytes | Gauge | Memory actively in use per app |
| cerebrium_gpu_memory_usage_bytes | Gauge | GPU VRAM in use per app |
| cerebrium_gpu_compute_utilization_percent | Gauge | GPU compute utilization (0–100) per app |
| cerebrium_containers_running_count | Gauge | Number of running containers per app |
| cerebrium_containers_ready_count | Gauge | Number of ready containers per app |
Execution Metrics
| Metric | Type | Description |
|---|---|---|
| cerebrium_run_execution_time_ms | Histogram | Time spent executing user code |
| cerebrium_run_queue_time_ms | Histogram | Time spent waiting in queue |
| cerebrium_run_coldstart_time_ms | Histogram | Time for container cold start |
| cerebrium_run_response_time_ms | Histogram | Total end-to-end response time |
| cerebrium_run_total | Counter | Total run count |
| cerebrium_run_successes_total | Counter | Successful run count |
| cerebrium_run_errors_total | Counter | Failed run count |
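
The run counters are most useful as ratios over a time window rather than as raw totals. As a rough illustration, the sketch below derives an error ratio from two samples of cerebrium_run_total and cerebrium_run_errors_total. It assumes the counters are cumulative (monotonically increasing), which is an assumption about how the backend stores them; the function name and the sample values are made up for the example.

```python
def error_ratio(total_start, total_end, errors_start, errors_end):
    """Fraction of runs that failed between two samples of the run counters.

    Assumes cumulative counters. Returns None when no runs happened in the
    window (or a counter reset made the delta non-positive).
    """
    runs = total_end - total_start
    errors = errors_end - errors_start
    if runs <= 0:
        return None  # ratio is undefined without any runs in the window
    return errors / runs


if __name__ == "__main__":
    # Hypothetical samples taken five minutes apart:
    # 60 runs happened in the window, 3 of them failed.
    print(error_ratio(1200, 1260, 30, 33))
```

A query language such as PromQL performs the same subtraction per series using rate(), and additionally handles counter resets.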
Viewing Metrics in Last9
Open Metrics Explorer and filter by app_name or project_id. Useful starter queries:
    # GPU utilization per app
    avg by (app_name) (cerebrium_gpu_compute_utilization_percent)

    # p95 execution latency
    histogram_quantile(0.95, sum by (le, app_name) (rate(cerebrium_run_execution_time_ms_milliseconds_bucket[5m])))

    # Error rate per app
    sum by (app_name) (rate(cerebrium_run_errors_total[5m])) / sum by (app_name) (rate(cerebrium_run_total[5m]))

    # Running container count
    sum by (app_name) (cerebrium_containers_running_count)

Sending Traces
Cerebrium’s platform-side OTLP push covers metrics only. To get application traces (per-request spans, downstream HTTP calls, LLM provider latency) into Last9, run opentelemetry-instrument in your Cerebrium entrypoint — no code changes required in your application.
-
Add the OTel packages to [cerebrium.dependencies.pip] in your cerebrium.toml.

    [cerebrium.dependencies.pip]
    opentelemetry-distro = ">=0.51b0"
    opentelemetry-exporter-otlp = ">=1.30.0"
    opentelemetry-semantic-conventions = ">=0.51b0"

Pin all three together. Without consistent versions, the newer exporter imports semconv._incubating against an older semantic-conventions wheel and startup fails with ModuleNotFoundError.
-
Install instrumentors during build via shell_commands.

    [cerebrium.deployment]
    shell_commands = ["opentelemetry-bootstrap --action=install"]

This detects every supported library in your pip deps (FastAPI, requests, httpx, OpenAI, Anthropic, SQLAlchemy, etc.) and installs the matching opentelemetry-instrumentation-* packages.
-
Wrap your entrypoint with opentelemetry-instrument.

    [cerebrium.runtime.custom]
    port = 8000
    entrypoint = [
      "opentelemetry-instrument",
      "uvicorn", "main:app",
      "--host", "0.0.0.0", "--port", "8000",
    ]
-
Set OTel env vars as Cerebrium secrets.

    cerebrium secrets add OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-aps1.last9.io
    cerebrium secrets add OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
    cerebrium secrets add OTEL_SERVICE_NAME=my-cerebrium-app
    cerebrium secrets add 'OTEL_EXPORTER_OTLP_HEADERS=Authorization=Basic <your-token>'

Note the inner = in OTEL_EXPORTER_OTLP_HEADERS: that is the OTel env-var convention for <header-name>=<header-value>, not a typo.
-
Deploy. Traces appear in Traces Explorer within seconds, filtered by service.name.
A runnable, verified example is available at last9/opentelemetry-examples — python/cerebrium/.
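
The `<header-name>=<header-value>` convention used by OTEL_EXPORTER_OTLP_HEADERS is easier to see in code. The sketch below is a simplified, stdlib-only illustration of how such a value can be split: entries are comma-separated, and each entry is split on the first = only, so any later = (for example base64 padding in the token) stays in the value. It is not the parser the OpenTelemetry SDK uses, and `parse_otlp_headers` is a hypothetical name.

```python
def parse_otlp_headers(raw: str) -> dict:
    """Simplified parser for a comma-separated list of name=value headers.

    Illustration only: the OpenTelemetry SDK has its own parser, which
    handles more cases (such as URL-encoded values).
    """
    headers = {}
    for entry in raw.split(","):
        if not entry.strip():
            continue  # tolerate empty entries such as a trailing comma
        # Split on the FIRST "=" only; the value may contain "=" itself.
        name, sep, value = entry.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in header entry: {entry!r}")
        headers[name.strip()] = value.strip()
    return headers


if __name__ == "__main__":
    # "dXNlcjpwYXM=" is a made-up token that ends in base64 padding.
    print(parse_otlp_headers("Authorization=Basic dXNlcjpwYXM="))
```

Passing only the token, without the `Authorization=` prefix, raises ValueError in this sketch unless the token itself contains =, which is why a missing prefix can go unnoticed.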
Troubleshooting
- 401 / 403 Unauthorized: The Auth Header value is malformed. Use a literal space between Basic and the token (not %20, not =). Confirm Name is Authorization and Value starts with Basic.
- 404 Endpoint not found: The endpoint URL has /v1/metrics appended. Remove it; Cerebrium appends it automatically.
- Unknown export error: Often produced when the endpoint resolves but the path does not match Last9’s /v1/metrics route. Confirm you are using the base URL from your Last9 OpenTelemetry integration page and that no extra suffix or trailing slash is present.
- No metrics arriving after Test Connection passes: Metrics push every 60 seconds. Wait ~2 minutes, then filter by app_name in Metrics Explorer.
- ModuleNotFoundError: No module named 'opentelemetry.semconv._incubating' (at container startup): Version skew between opentelemetry-exporter-otlp-proto-common and opentelemetry-semantic-conventions. Pin all three OTel packages together as shown in step 1 of Sending Traces.
- Failed to auto initialize opentelemetry (in container logs): Auto-instrumentation crashed at sitecustomize. Check Cerebrium build logs for the opentelemetry-bootstrap output and confirm instrumentors installed cleanly.
- App starts and serves traffic, but no traces in Last9: Either the OTel env-var secrets are unset on this app, or the auth header is malformed. Verify with cerebrium secrets list that all four OTEL_* secrets are present, and that OTEL_EXPORTER_OTLP_HEADERS uses the Authorization=Basic <token> form (not just the token).
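
For the last case, a small startup check can surface a missing or malformed secret in the container logs instead of failing silently. The following is a minimal sketch that assumes Cerebrium secrets are exposed to the app as environment variables; `find_otel_config_problems` is a hypothetical helper, not part of any SDK.

```python
import os

# The four variables set as secrets in the Sending Traces steps.
REQUIRED_OTEL_VARS = (
    "OTEL_EXPORTER_OTLP_ENDPOINT",
    "OTEL_EXPORTER_OTLP_PROTOCOL",
    "OTEL_SERVICE_NAME",
    "OTEL_EXPORTER_OTLP_HEADERS",
)


def find_otel_config_problems(env=None):
    """Return human-readable problems with the OTel env vars (empty if none).

    Assumes secrets are visible as environment variables inside the container.
    """
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED_OTEL_VARS if not env.get(name)]

    # The headers value must carry the header name, not just the token.
    headers = env.get("OTEL_EXPORTER_OTLP_HEADERS", "")
    if headers and not headers.startswith("Authorization=Basic "):
        problems.append(
            "OTEL_EXPORTER_OTLP_HEADERS should look like 'Authorization=Basic <token>'"
        )
    return problems


if __name__ == "__main__":
    for problem in find_otel_config_problems():
        print(f"[otel-config] {problem}")
```

Calling this once at startup and printing the result is enough to tell an unset secret apart from a malformed header.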
Please get in touch with us on Discord or Email if you have any questions.