Code in production will break. Maybe a request takes too long, maybe it fails quietly, or maybe it works fine one minute and falls over the next. Logs can help, sure—but they don’t always show the full picture, especially when performance issues are involved.
Elastic APM gives you a clearer view. It traces what your application is doing from incoming requests to database queries and everything in between. You get latency breakdowns, error details, and performance metrics tied directly to the code paths that caused them.
This blog walks through how it works, what kind of data you get, and how to set it up without adding unnecessary overhead.
What is Elastic APM?
Elastic APM is a performance monitoring system that’s part of the Elastic Stack. It helps you understand how your code behaves in production by collecting performance metrics, traces, and error data from your applications in real time.
Unlike traditional monitoring, which might just tell you CPU usage or uptime, Elastic APM goes deeper. It tracks individual requests, highlights slow database queries, and captures stack traces for errors—so you can see where and why things are going wrong.
It works by adding language-specific agents to your app. These agents gather performance data and send it to the APM Server, which then pushes it into Elasticsearch. From there, you can explore it using Kibana’s dashboards or build your own views.
Breakdown of Elastic APM Components
APM Agents
APM agents are libraries you add to your app to collect telemetry—no separate daemon or sidecar needed. Elastic supports Java, .NET, Node.js, Python, Go, Ruby, and PHP.
Once integrated, agents automatically capture:
- Transactions – like HTTP requests or background jobs
- Spans – internal operations like DB queries or external API calls
- Errors – uncaught exceptions, stack traces, and related metadata
Most agents work with minimal setup—often just a few lines of code to start collecting data.
APM Server
This sits between your app and Elasticsearch. Agents send raw data to the APM Server, which validates, enriches, and forwards it.
You can run the server standalone, in Docker, Kubernetes, or as part of Elastic Cloud. It’s built to handle large volumes of trace data without introducing extra latency.
Elasticsearch + Kibana
All your APM data ends up in Elasticsearch. You get full-text search, filtering, and aggregation on traces, spans, and errors—so you can dig deep without writing complex queries.
Kibana is the UI layer. Use the built-in APM dashboards or build your own to slice data by service, endpoint, latency, and more. Great for debugging, and even better for showing your team why something is slow.
Key Features That Make Elastic APM Stand Out
Distributed Tracing
Following a single request across multiple services is painful without proper tooling. Elastic APM makes this easier by stitching together the entire trace, from entry point to final response.
- Visualizes the full path of a request across services
- Breaks down latency by span, so you can isolate bottlenecks
- Makes debugging slow or inconsistent endpoints much faster
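To make the per-span latency breakdown concrete, here is a minimal sketch that ranks the spans of one transaction by their share of total time. The `trace` object and the `latencyBreakdown` helper are illustrative assumptions, not the agent's wire format or Elastic's document schema, and the spans are assumed to run one after another.

```javascript
// Hypothetical, simplified trace: one transaction with flat child spans.
// Field names are illustrative and do not mirror Elastic's stored documents.
const trace = {
  name: 'GET /checkout',
  durationMs: 480,
  spans: [
    { name: 'render template', type: 'app', durationMs: 30 },
    { name: 'SELECT cart_items', type: 'db', durationMs: 310 },
    { name: 'POST payments-api/charge', type: 'external', durationMs: 120 },
  ],
};

// Rank spans by duration and attach each span's share of the transaction time.
function latencyBreakdown(tx) {
  return tx.spans
    .map((span) => ({ ...span, share: span.durationMs / tx.durationMs }))
    .sort((a, b) => b.durationMs - a.durationMs);
}

for (const row of latencyBreakdown(trace)) {
  console.log(`${(row.share * 100).toFixed(1)}%  ${row.type.padEnd(8)}  ${row.name}`);
}
```

In this made-up trace the database query dominates, which is the kind of signal the span waterfall in Kibana gives you at a glance.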
Real User Monitoring (RUM)
Server metrics don’t tell you how fast the app feels. RUM captures what users experience in the browser.
- Tracks page load times, asset performance, and frontend errors
- Works across browsers, devices, and network conditions
- Helps connect backend behavior to frontend impact
Error Tracking and Stack Traces
When things break, Elastic APM doesn’t just log the error; it adds context.
- Captures full stack traces and exception details
- Groups recurring errors to cut through the noise
- Includes session, environment, and request data to speed up debugging
Database and External Service Monitoring
Dependencies are often where performance issues hide. Elastic APM keeps an eye on them too.
- Automatically tracks calls to databases and third-party services
- Surfaces slow queries, failed requests, and timeout patterns
- Supports out-of-the-box instrumentation for common frameworks
How to Get Started with Elastic APM
Setting up Elastic APM involves a few moving parts, but it’s fairly straightforward once you break it down. You’ll need to:
- Start an APM Server
- Install and configure an APM agent in your app
- Make sure everything connects correctly
Let’s walk through each step.
1. Deploy the APM Server
The APM Server acts as the middleman between your application and Elasticsearch. It receives telemetry data from your app (via the agent), processes it, and forwards it to Elasticsearch for storage and analysis.
You can run the APM Server in a few ways:
- Locally (useful for testing or dev environments)
- Docker
- Kubernetes
- Elastic Cloud (fully managed by Elastic)
Here’s a quick way to start it using Docker:
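# Assumes Elasticsearch is reachable from the container at http://elasticsearch:9200 (hypothetical host);
# add credentials if security is enabled on your cluster.
docker run \
  --name=apm-server \
  -p 8200:8200 \
  docker.elastic.co/apm/apm-server:8.13.0 \
  -E 'output.elasticsearch.hosts=["http://elasticsearch:9200"]'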
You’ll also need Elasticsearch and Kibana running. If you’re testing locally, Elastic provides a docker-compose file that spins everything up at once.
2. Install the APM Agent in Your Application
The APM agent is a small library you add to your codebase. It automatically instruments common frameworks and libraries, captures performance data, and sends it to the APM Server.
For example, in a Node.js app:
Install the agent:
npm install elastic-apm-node
Then, add this as the very first line in your main entry file (before any other imports):
const apm = require('elastic-apm-node').start({
serviceName: 'my-node-app',
serverUrl: 'http://localhost:8200',
environment: 'development'
})
Replace serviceName with something meaningful (like checkout-service) and point serverUrl to wherever your APM Server is running.
3. Verify the Data Flow
Once your app is running, the agent should start sending data to the APM Server. To confirm:
- Open Kibana
- Go to APM > Services
- You should see your app listed
- Click into it to view traces, transactions, errors, and dependencies
If nothing shows up:
- Check APM Server logs for connection issues
- Make sure you’ve restarted the app after adding the agent
- Double-check that the APM Server URL is accessible
Configuration Best Practices
You’ll get data with the default setup, but here are a few tweaks to make it production-ready:
- Sampling rate: For high-traffic apps, set a transaction sampling rate (e.g., 0.2 = 20%) to reduce noise and resource usage.
- Consistent naming: Use predictable service names (auth-service, user-api) and environments (staging, prod) to make filtering easier.
- Secure your APM Server: In production, always enable HTTPS and authentication on your APM Server.
- Custom labels/tags: Add metadata (e.g., feature flags, deployment IDs, region) to traces for better context when debugging.
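One way to apply these tweaks is to build the agent options from environment variables and validate them before the agent starts. The sketch below assumes the Node.js agent's `ELASTIC_APM_*` variables and its `transactionSampleRate`, `secretToken`, and `globalLabels` options; `buildAgentOptions`, `DEPLOY_ID`, `REGION`, and the example URL are hypothetical names used only for illustration.

```javascript
// Build agent options from environment variables and fail fast on bad values.
// DEPLOY_ID and REGION are hypothetical variables used only in this sketch.
function buildAgentOptions(env) {
  const rate = Number(env.ELASTIC_APM_TRANSACTION_SAMPLE_RATE ?? '1.0');
  if (!Number.isFinite(rate) || rate < 0 || rate > 1) {
    throw new Error('transaction sample rate must be a number between 0 and 1');
  }
  const serverUrl = env.ELASTIC_APM_SERVER_URL ?? 'http://localhost:8200';
  const environment = env.ELASTIC_APM_ENVIRONMENT ?? 'development';
  if (environment === 'prod' && !serverUrl.startsWith('https://')) {
    throw new Error('use an https:// server URL in prod');
  }
  return {
    serviceName: env.ELASTIC_APM_SERVICE_NAME ?? 'my-node-app',
    serverUrl,
    environment,
    transactionSampleRate: rate,
    secretToken: env.ELASTIC_APM_SECRET_TOKEN, // undefined when not set
    globalLabels: {
      deployment_id: env.DEPLOY_ID ?? 'unknown',
      region: env.REGION ?? 'unknown',
    },
  };
}

const options = buildAgentOptions({
  ELASTIC_APM_SERVICE_NAME: 'checkout-service',
  ELASTIC_APM_SERVER_URL: 'https://apm.example.internal:8200',
  ELASTIC_APM_ENVIRONMENT: 'prod',
  ELASTIC_APM_TRANSACTION_SAMPLE_RATE: '0.2',
});
console.log(options);

// In the app's entry file you would pass the result to the agent, for example:
//   require('elastic-apm-node').start(buildAgentOptions(process.env));
```

Failing at startup on a bad sample rate or a plain-HTTP URL in prod is a design choice; you may prefer to log a warning instead.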
Monitoring Multiple Services
If you're running a distributed system or microservices, Elastic APM can trace requests across multiple services using a shared trace context.
- Make sure all services use APM agents that support distributed tracing.
- Use consistent headers (traceparent) across HTTP calls to propagate trace info.
- Use service maps in Kibana to visualize how services are connected and identify slow hops between them.
This is especially useful when you're trying to understand performance issues that span multiple backends or dependencies.
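If traces stop joining up between services, it can help to inspect the traceparent header by hand. The sketch below parses the W3C Trace Context layout (`version-traceid-parentid-flags`); `parseTraceparent` is a hypothetical helper written for this post, not part of any agent API, and it is strict about the version-00 field layout.

```javascript
// traceparent layout: 2 hex chars (version), 32 (trace id), 16 (parent id), 2 (flags).
const TRACEPARENT = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Returns the parsed fields, or null when the header is malformed.
function parseTraceparent(header) {
  const match = TRACEPARENT.exec(String(header).trim());
  if (!match) return null;
  const [, version, traceId, parentId, flags] = match;
  // The spec forbids version "ff" and all-zero trace or parent ids.
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return {
    version,
    traceId,
    parentId,
    sampled: (parseInt(flags, 16) & 1) === 1, // lowest flag bit marks a sampled trace
  };
}

const parsed = parseTraceparent('00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01');
console.log(parsed);
```

Two services are part of the same trace only if they report the same `traceId`, so logging this value on both sides of a call is a quick way to find where propagation breaks.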
Advanced Elastic APM Features
Once your basic setup is working, you’ll probably want to get more value out of the data you're collecting.
This section covers the kind of advanced features that help you debug faster, track key workflows, and even identify problems before users do.
Frontend Debugging: Using Source Maps
If you're shipping minified JavaScript (which you probably are), then your stack traces in production won't make much sense. Source maps fix that by linking the minified code back to your original files, so instead of a stack trace pointing to bundle.js:2:20493, you’ll see the actual line in your source code that caused the error.
To get this working in Elastic APM:
Start by adding version tracking to your RUM setup:
import { init as initApm } from '@elastic/apm-rum';
// JSON import, as supported by bundlers such as Webpack and Vite
import pkg from './package.json';
const apm = initApm({
  serviceName: 'my-web-app',
  // Must match the version you attach to the uploaded source maps
  serviceVersion: pkg.version
});
Next, configure your build tool (Webpack, Vite, etc.) to generate source maps. In Webpack, for example:
devtool: 'source-map',
Finally, upload the maps to Elastic APM (there’s a CLI for this, or you can use their API).
Once it’s set up, your frontend errors in Kibana will show clean, readable stack traces that point to your source files, not your minified bundles.
Custom Instrumentation: When You Need More Than Defaults
The built-in agents track most of the common stuff: HTTP routes, database calls, and exceptions. But what if you want to track the duration of your signup flow? Or monitor how long it takes to process an image upload?
You can create your own transactions and spans using the APM agent's API. Here’s an example in Node.js:
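// Assumes `apm` is the agent instance started in your entry file
const tx = apm.startTransaction('signup-flow');
// startSpan() can return null when no span could be started, so guard each end() call
const validation = tx.startSpan('email-validation');
// ... validate email
validation?.end();
const write = tx.startSpan('insert-user-to-db');
// ... write to DB
write?.end();
// Ending the transaction is what sends it to the APM Server
tx.end();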
Now you’ve got a custom trace in Kibana showing exactly how long each part of the signup flow takes. It’s useful for spotting slow logic or debugging intermittent slowdowns.
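Manually pairing every `startSpan` with an `end` gets error-prone once the code can throw. One option is a small wrapper that always ends the span. The sketch below is an assumption-laden helper, not part of the agent: `withSpan` is a hypothetical name, and it only relies on the transaction exposing `startSpan(name)` and the span exposing `end()`.

```javascript
// Run `fn` inside a span and always end the span, whether `fn` returns,
// throws, or returns a promise. `tx` is any object with startSpan(name).
function withSpan(tx, name, fn) {
  // startSpan() may return null, so every use of `span` is guarded.
  const span = tx ? tx.startSpan(name) : null;
  const end = () => {
    if (span) span.end();
  };

  let result;
  try {
    result = fn();
  } catch (err) {
    end();
    throw err;
  }

  // For async work, end the span once the promise settles.
  if (result && typeof result.then === 'function') {
    return Promise.resolve(result).finally(end);
  }
  end();
  return result;
}

// Usage with the agent (not executed here; validateEmail is hypothetical):
//   const tx = apm.startTransaction('signup-flow');
//   await withSpan(tx, 'email-validation', () => validateEmail(email));
//   tx.end();
```

Passing the transaction in explicitly keeps the helper easy to test with a stand-in object and avoids depending on whichever transaction happens to be active.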
Alerting and Anomaly Detection
Once you’ve got good data flowing into Elastic, setting up alerts is the next step. You don’t want to manually check dashboards to know something’s wrong.
You can configure alerts for:
- High error rates
- Spikes in response time
- Slow database queries
- Dropping throughput
Elastic also offers anomaly detection powered by machine learning. That can help catch things like gradually increasing response times that might not trigger a static threshold, but are still a problem.
These alerts can be routed to Slack, PagerDuty, or wherever your team tracks issues.
CI/CD Integration: Catching Regressions Before Users Do
Performance issues often show up right after a deploy. With Elastic APM, you can bake performance checks into your release pipeline.
Some teams do things like:
- Compare key metrics before and after a deploy
- Fail a build if latency jumps or error rates go up
- Use APM data to trigger automatic rollbacks
You can query Elasticsearch via scripts or APIs to pull the data you care about, and wire that into whatever CI/CD system you use.
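As a rough illustration of the "fail a build" idea, here is a sketch of the comparison step only. It assumes you have already pulled p95 latency and error rate for the periods before and after the deploy; `checkRelease`, the metric field names, and the default limits are all hypothetical choices, not Elastic settings.

```javascript
// Compare pre- and post-deploy metrics against simple limits.
// `before` and `after` look like { p95LatencyMs: number, errorRate: number }.
function checkRelease(before, after, limits = { maxLatencyIncrease: 0.2, maxErrorRate: 0.01 }) {
  const failures = [];

  // Relative change in p95 latency, e.g. 0.3 means 30% slower.
  const latencyChange = (after.p95LatencyMs - before.p95LatencyMs) / before.p95LatencyMs;
  if (latencyChange > limits.maxLatencyIncrease) {
    failures.push(`p95 latency up ${(latencyChange * 100).toFixed(0)}%`);
  }

  if (after.errorRate > limits.maxErrorRate) {
    failures.push(`error rate ${(after.errorRate * 100).toFixed(2)}% exceeds limit`);
  }

  return { ok: failures.length === 0, failures };
}

const verdict = checkRelease(
  { p95LatencyMs: 200, errorRate: 0.004 },
  { p95LatencyMs: 260, errorRate: 0.004 }
);
console.log(verdict);

// In a pipeline step you might end with:
//   if (!verdict.ok) process.exitCode = 1;
```

A relative latency limit tends to travel better across services than an absolute one, but either works as long as the baseline window covers comparable traffic.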

How Much Does APM Cost in Resources?
Instrumenting your application for observability always comes with a trade-off—you're collecting useful data, but that data has to be gathered, processed, and shipped somewhere. So, how much overhead does Elastic APM introduce?
Agent Overhead
Elastic APM agents are built to be lightweight. Most setups see less than a 5% increase in CPU usage and little to no impact on memory. That’s because:
- Agents use non-blocking, asynchronous calls to send data
- Sampling is configurable: the default sample rate of 1.0 traces every transaction, and lowering it cuts the work done per request
- Most agents are smart enough to avoid redundant instrumentation
You can tune the configuration based on your traffic patterns. If you're running a high-throughput system, you may want to lower the sample rate or disable some integrations to reduce overhead.
Managing Data Volume
APM data can grow fast, especially in production environments with lots of traffic. If you’re not careful, you’ll end up storing huge volumes of trace and span data that no one is using.
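A back-of-the-envelope estimate shows how quickly this adds up. The sketch below is a deliberately rough model that counts only sampled transactions and their spans; `estimateDailyDocs` and the traffic numbers are made up for illustration, and real volume also depends on errors, metrics, and document size.

```javascript
// Rough daily document count for trace data under a given sample rate.
function estimateDailyDocs({ requestsPerDay, sampleRate, avgSpansPerTransaction }) {
  const sampledTransactions = requestsPerDay * sampleRate;
  return {
    transactions: Math.round(sampledTransactions),
    spans: Math.round(sampledTransactions * avgSpansPerTransaction),
  };
}

// Hypothetical service: 10 million requests a day, about 8 spans per request.
const full = estimateDailyDocs({ requestsPerDay: 10000000, sampleRate: 1.0, avgSpansPerTransaction: 8 });
const sampled = estimateDailyDocs({ requestsPerDay: 10000000, sampleRate: 0.1, avgSpansPerTransaction: 8 });
console.log({ full, sampled });
```

Under these assumptions, dropping the sample rate from 1.0 to 0.1 takes span documents from 80 million to 8 million per day.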
Here’s how to keep it under control:
- Sampling: Set a sampling rate that balances visibility with cost. For example, sampling 10–20% of transactions is often enough to catch most issues.
- Retention policies: Decide how long to keep full-fidelity data vs. aggregated metrics. You might keep detailed traces for 7 days and roll up metrics beyond that.
- Index lifecycle management (ILM): Elasticsearch supports ILM policies to automate data aging—move older data to cheaper storage or delete it entirely.
The idea is to prioritize what’s actionable. If you're only checking detailed traces when an incident happens, keep them short-lived. Long-term trend data can stick around for capacity planning or SLA reviews.
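For the short-lived detailed traces, an ILM policy along these lines is one possible starting point. It is a sketch of a request body for Elasticsearch's `PUT _ilm/policy/<name>` API: the 7-day retention mirrors the example above, the rollover values are assumptions, and how you attach the policy to your APM data streams depends on your stack version.

```javascript
// Hypothetical ILM policy body: roll over daily (or at 50 GB per primary shard)
// and delete indices 7 days after rollover.
const tracesPolicy = {
  policy: {
    phases: {
      hot: {
        actions: {
          rollover: { max_age: '1d', max_primary_shard_size: '50gb' },
        },
      },
      delete: {
        min_age: '7d',
        actions: { delete: {} },
      },
    },
  },
};

// Print the JSON so it can be pasted into Kibana Dev Tools or sent with curl.
console.log(JSON.stringify(tracesPolicy, null, 2));
```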

Fixing Common Elastic APM Setup and Runtime Issues
1. When Agents Can’t Reach the APM Server
One of the first things to check during setup is network connectivity between your application and the APM Server. If the agent can’t send data, nothing else works. Make sure:
- The serverUrl in your agent config is correct.
- The port (usually 8200) is open and reachable.
- If you're using HTTPS, verify your SSL certificates; self-signed certs in dev environments are often a pain point.
Enabling debug logs in the agent can reveal whether it's failing silently or dropping data due to network errors.
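Before digging into network traces, it is worth ruling out a malformed URL. The sketch below checks the `serverUrl` string itself using Node's built-in `URL` class; `inspectServerUrl` is a hypothetical helper, and it does not contact the server.

```javascript
// Sanity-check a serverUrl string without making a network call.
function inspectServerUrl(serverUrl) {
  let url;
  try {
    url = new URL(serverUrl);
  } catch {
    return { ok: false, notes: [`"${serverUrl}" is not a valid absolute URL`] };
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    // Catches values like "localhost:8200", where the scheme was left out.
    return { ok: false, notes: [`unexpected protocol "${url.protocol}"; is http:// or https:// missing?`] };
  }

  // URL.port is empty when the scheme's default port is used.
  const port = url.port || (url.protocol === 'https:' ? '443' : '80');
  const notes = [];
  if (port !== '8200') notes.push(`port ${port} differs from the usual 8200`);
  if (url.protocol === 'http:' && url.hostname !== 'localhost') {
    notes.push('plain HTTP to a non-local host; consider HTTPS');
  }
  return { ok: true, host: url.hostname, port, notes };
}

console.log(inspectServerUrl('http://localhost:8200'));
console.log(inspectServerUrl('localhost:8200'));
```

A non-8200 port is reported as a note rather than a failure, since managed deployments commonly sit behind 443.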
2. No Data in Kibana? Here’s What to Check
Once agents are running, you expect to see data in Kibana. If nothing’s there, don’t panic. A few common causes:
- Sampling rate is too low: Especially in test environments with light traffic, a low sample rate might mean you’re just not seeing enough transactions.
- Service name mismatches: Even small inconsistencies (myService vs my-service) can cause data to appear in unexpected places, or not at all.
- Agent misconfiguration: Missing environment variables or typos in the config file can silently break data collection.
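Name mismatches are easy to miss by eye. A small check like the one below can flag service names that differ only in case or punctuation; `findNameCollisions` is a hypothetical helper, and the list of names is assumed to come from wherever you track your services (for example, copied from the Services view in Kibana).

```javascript
// Group service names that collapse to the same key once case and
// punctuation are ignored, and return only the groups with more than one name.
function findNameCollisions(serviceNames) {
  const groups = new Map();
  for (const name of serviceNames) {
    const key = name.toLowerCase().replace(/[^a-z0-9]/g, '');
    if (!groups.has(key)) groups.set(key, new Set());
    groups.get(key).add(name);
  }
  return [...groups.values()]
    .filter((names) => names.size > 1)
    .map((names) => [...names]);
}

console.log(findNameCollisions(['myService', 'my-service', 'user-api', 'auth-service']));
```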
3. APM Slowing Down Your App? Here's How to Dial It Back
APM is supposed to help you debug performance, but if you notice latency or CPU spikes after adding an agent, it’s time to adjust your configuration.
- Reduce sampling rate for high-throughput services.
- Disable modules you don’t need—many agents auto-instrument everything by default.
- Filter out noisy spans like health checks or cache lookups that don’t add value.
Also, keep an eye on compatibility if you’re using legacy frameworks. Some older libraries may not play well with default instrumentation settings.
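For filtering out health checks, the Node.js agent has an `ignoreUrls` option that takes a list of exact-match strings and regular expressions. The sketch below reimplements that kind of matching on its own so the behavior is easy to see; `isIgnored` and the example paths are hypothetical, and the agent's own matching may differ in detail.

```javascript
// Hypothetical ignore list: exact paths as strings, path families as RegExps.
const ignorePatterns = ['/healthz', '/ready', /^\/static\//];

// True when the request path matches any pattern in the list.
function isIgnored(path, patterns) {
  return patterns.some((pattern) =>
    pattern instanceof RegExp ? pattern.test(path) : pattern === path
  );
}

for (const path of ['/healthz', '/static/app.css', '/api/checkout']) {
  console.log(path, isIgnored(path, ignorePatterns) ? 'ignored' : 'traced');
}

// With the agent, a list like this would go in the start options, e.g.:
//   require('elastic-apm-node').start({ ignoreUrls: ignorePatterns });
```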
Elastic APM Integration Strategies
| Integration Type | Benefits | Considerations |
|---|---|---|
| Container Deployments | Easy scaling, consistent configuration | Requires container orchestration knowledge |
| Cloud Services | Managed infrastructure, automatic scaling | Vendor lock-in, data transfer costs |
| Hybrid Environments | Flexibility, gradual migration | Complex networking, security considerations |
| Development Environments | Early issue detection, performance testing | Resource usage, configuration management |
Practical Tips to Get More Out of Elastic APM
Elastic APM can be incredibly useful, but only if it’s implemented in a way your team actually uses. The goal isn’t just data collection; it’s better decisions, faster debugging, and fewer surprises in production.
Here’s how to get there.
Begin by instrumenting your core services and critical user journeys, things like authentication, checkout flows, or search endpoints. These areas tend to have the biggest impact on user experience.
Once you’re comfortable with the basics, expand coverage to less critical features. You can gradually enable advanced options like custom spans, user context, or release tracking without overwhelming your team early on.
To make the most of the data you collect:
- Create alerting workflows: Knowing something’s wrong is only half the battle. Build lightweight runbooks or Slack playbooks for what to do when response times spike or error rates increase.
- Document your tagging conventions: This helps avoid data silos and keeps dashboards clean as your service footprint grows.
- Review sampling and retention regularly: What mattered six months ago might be noise now. Keep your config aligned with your app’s current priorities.
And don’t forget to:
- Train your team: APM only helps if engineers use it. Host walkthroughs, record short Loom videos, or pair program while reviewing traces.
- Use the data during postmortems: Point to spans and transaction traces in retro meetings to build stronger context and more accurate root cause analysis.
Wrapping Up
Elastic APM gives you solid tooling to understand what’s going on inside your application: tracing requests, surfacing slow database queries, and helping you connect the dots when performance dips. It's a good choice if you're already in the Elastic ecosystem and don’t mind managing your own infrastructure.
But if you're scaling fast, dealing with high-cardinality metrics, or just want something that works out of the box without the overhead of managing servers, agents, and storage, Last9 might be a better fit.
We offer a managed observability platform that handles metrics, logs, and traces without you having to wrestle with config files and retention settings. It's built for engineers who’d rather spend time improving systems than debugging their monitoring stack.
Get started with a free trial or book some time with us to learn more about the platform!
FAQs
How much overhead does Elastic APM add to my application?
Elastic APM agents typically add less than 5% CPU overhead and minimal memory usage. The agents use efficient sampling and asynchronous data transmission to avoid impacting your application's performance.
Can I use Elastic APM with microservices?
Yes, Elastic APM excels at monitoring microservices architectures. Its distributed tracing capabilities follow requests across multiple services, helping you identify bottlenecks and understand service dependencies.
What's the difference between Elastic APM and traditional logging?
Traditional logging captures discrete events, while Elastic APM provides structured performance data with automatic instrumentation. APM shows you transaction flows, performance metrics, and user experience data that logs alone can't provide.
How long does it take to set up Elastic APM?
Basic setup takes about 30 minutes for most applications. You'll need to install APM Server, add an agent to your application, and configure data collection. More complex setups with custom instrumentation may take longer.
Do I need to modify my application code to use Elastic APM?
Most APM agents provide automatic instrumentation with minimal code changes. You typically need to add just a few lines of configuration code to start collecting basic performance data.
Can Elastic APM monitor database performance?
Yes, Elastic APM automatically detects and monitors database queries, showing you slow queries, connection issues, and database response times without additional configuration.
How does Elastic APM handle sensitive data?
APM agents can be configured to filter sensitive data before transmission. You can exclude specific fields, sanitize SQL queries, and control which data gets sent to your monitoring infrastructure.
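In the Node.js agent, this kind of filtering is typically done with the `sanitizeFieldNames` option or a callback registered through `apm.addFilter()`. The sketch below shows the redaction logic on its own, as a plain function you could adapt for such a callback; `redact`, the pattern list, and the sample payload are assumptions for illustration.

```javascript
// Hypothetical key patterns that should never leave the process.
const SENSITIVE_KEYS = [/password/i, /passwd/i, /secret/i, /token/i, /authorization/i, /card/i];

// Return a deep copy of `value` with sensitive keys replaced by a marker.
function redact(value) {
  if (Array.isArray(value)) return value.map(redact);
  if (value && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) =>
        SENSITIVE_KEYS.some((pattern) => pattern.test(key))
          ? [key, '[REDACTED]']
          : [key, redact(inner)]
      )
    );
  }
  return value;
}

const payload = {
  user: 'alice',
  password: 'hunter2',
  headers: { Authorization: 'Bearer abc123', 'content-type': 'application/json' },
};
console.log(redact(payload));
```

Matching on key names only is a simplification; values embedded in URLs or free-text messages need their own handling.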