
Agents Monitoring

Monitor LLM and agent usage across your applications — conversations, token consumption, latency, cost, success rates, and active models — built on OpenTelemetry GenAI semantic conventions

Agents Monitoring gives you a single view of how your applications use large language models and AI agents: how many conversations they run, what they cost, how long calls take, how often they fail, and which models are actually in use.


The view is built on the OpenTelemetry GenAI semantic conventions. Any instrumentation that emits GenAI-convention telemetry feeds it — no Last9-specific SDK required.

Overview

Open Agents Monitoring under the AI section of the sidebar. The Overview tab reports, for the selected environment and time range:

| Card | What it shows |
| --- | --- |
| Conversations | Total conversations and average messages per conversation |
| Conversation Cost | Average cost per conversation |
| Latency | P50, P90, P95, and P99 call duration |
| Total LLM cost | Spend across all calls in the range |
| Spans | Total LLM spans recorded |
| Token count | Total tokens consumed |
| Success rate | Success and error rate across all requests |
| Top Models | Models serving traffic, with request counts and share |
| Model Insights | Overall performance, cost efficiency, token usage, model diversity |

The Conversations tab lists individual conversations, and the Traces tab shows the underlying LLM spans — a slow or failing call links to the full request trace: the endpoint that triggered it, retries, and downstream work.

Metrics

The overview is backed by GenAI semantic-convention metrics in your Last9 workspace — gen_ai_client_token_usage_total for tokens and the gen_ai_client_operation_duration_seconds histogram for request counts and latency, with gen_ai_request_model and error.type labels. The same series are queryable with PromQL in dashboards, alerts, and ad-hoc queries.
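
As a rough illustration of the PromQL these series support, the sketch below builds three query strings in Python: token throughput per model, a latency quantile, and an error ratio. The metric and model label names come from the paragraph above. The `_bucket` and `_count` suffixes follow the usual Prometheus histogram layout, and the `error_type` spelling of the `error.type` label is an assumption; check both against the label names your workspace actually shows.

```python
# Metric names and the model label are taken from the text above.
TOKEN_METRIC = "gen_ai_client_token_usage_total"
DURATION_METRIC = "gen_ai_client_operation_duration_seconds"
MODEL_LABEL = "gen_ai_request_model"
# Assumption: the error.type label is exposed with an underscore.
ERROR_LABEL = "error_type"


def tokens_by_model(window: str = "5m") -> str:
    """PromQL for tokens consumed per second, broken down by model."""
    return f"sum by ({MODEL_LABEL}) (rate({TOKEN_METRIC}[{window}]))"


def latency_quantile(q: float, window: str = "5m") -> str:
    """PromQL for an estimated latency quantile (0.95 for P95) per model."""
    return (
        f"histogram_quantile({q}, sum by (le, {MODEL_LABEL}) "
        f"(rate({DURATION_METRIC}_bucket[{window}])))"
    )


def error_ratio(window: str = "5m") -> str:
    """PromQL for the share of requests that carry a non-empty error label."""
    errors = f'sum(rate({DURATION_METRIC}_count{{{ERROR_LABEL}!=""}}[{window}]))'
    total = f"sum(rate({DURATION_METRIC}_count[{window}]))"
    return f"{errors} / {total}"
```

The returned strings can be pasted into a dashboard panel or an alert rule; the 5-minute default window is only a placeholder.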

Getting Data In

Instrument your application with any library that emits GenAI semantic-convention telemetry. See the AI integrations for supported options, including the Python GenAI SDK for LLM observability in Python applications.
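
To make concrete what such a library records for each call, here is a minimal hand-rolled sketch, not the Last9 or OpenTelemetry implementation. It assumes two histogram-like instruments that accept `record(value, attributes=...)`, which is the shape of OpenTelemetry Python histograms. The helper names, the `call` callback, and its `input_tokens` / `output_tokens` return keys are hypothetical; the attribute keys follow the GenAI semantic conventions, and a real integration sets more of them than shown here.

```python
import time
from typing import Any, Callable, Dict, Optional


def genai_attributes(model: str, operation: str = "chat",
                     error_type: Optional[str] = None) -> Dict[str, str]:
    """Build the attributes shared by the token and duration metrics."""
    attrs = {
        "gen_ai.operation.name": operation,
        "gen_ai.request.model": model,
    }
    if error_type is not None:
        attrs["error.type"] = error_type
    return attrs


def record_llm_call(token_hist: Any, duration_hist: Any, model: str,
                    call: Callable[[], Dict[str, int]]) -> Dict[str, int]:
    """Run `call`, then record its duration and token usage.

    `call` is a hypothetical callback that performs the LLM request and
    returns {"input_tokens": int, "output_tokens": int}.
    """
    start = time.monotonic()
    error_type = None
    try:
        usage = call()
    except Exception as exc:
        # Failed calls are tagged with the exception class name.
        error_type = type(exc).__qualname__
        raise
    finally:
        # Duration is recorded for successful and failed calls alike.
        duration_hist.record(
            time.monotonic() - start,
            attributes=genai_attributes(model, error_type=error_type),
        )
    # Token usage is recorded once per token type, only on success.
    for token_type in ("input", "output"):
        attrs = genai_attributes(model)
        attrs["gen_ai.token.type"] = token_type
        token_hist.record(usage[f"{token_type}_tokens"], attributes=attrs)
    return usage
```

With OpenTelemetry Python, the two instruments could be created with `meter.create_histogram(...)` under the convention names `gen_ai.client.token.usage` and `gen_ai.client.operation.duration`. In practice an existing integration does this for you.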


Troubleshooting

  • The overview is empty: no GenAI-convention telemetry has arrived yet. The empty state links to the integrations to set one up. Verify your instrumentation exports gen_ai_* metrics by checking the metrics explorer for gen_ai_client_token_usage_total.
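
If you would rather script this check than open the metrics explorer, the sketch below may help. It assumes you have a Prometheus-compatible query endpoint for your workspace; the base URL shown is a placeholder, and the response shape is that of the standard Prometheus HTTP API instant query. It only builds the request URL and interprets a response body, so fetching and authentication are left to you.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url: str, promql: str) -> str:
    """Build a Prometheus HTTP API instant-query URL for `promql`."""
    query_string = urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query_string}"


def has_series(response_body: str) -> bool:
    """Return True if an instant-query response contains at least one series."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        return False
    return bool(payload.get("data", {}).get("result"))


# Placeholder endpoint; substitute the read URL for your own workspace.
url = instant_query_url(
    "https://metrics.example.invalid/",
    "count(gen_ai_client_token_usage_total)",
)
```

An empty `result` list means no token-usage series has arrived in the queried range, which matches the empty overview.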

Please get in touch with us on Discord or by email if you have any questions.