Your Java app slows down during peak load. A microservice crashes, but logs aren’t helpful. These aren’t rare events—they’re common signs something’s off inside the JVM.
For Java developers and DevOps teams, JVM metrics offer clues to what’s going on. This blog covers the key metrics to track, what they tell you, and how to use them to troubleshoot performance issues in a practical, no-nonsense way.
What Are JVM Metrics and Why Do They Matter?
JVM metrics are performance indicators that reveal how the Java Virtual Machine operates in real time. They show what's happening under the hood of your applications.
These metrics serve as early warning systems for performance issues:
- They help detect bottlenecks before users notice slowdowns
- They provide diagnostic data when applications crash
- They inform capacity planning decisions
- They guide effective JVM tuning efforts
JVM metrics become even more critical in today's microservice architectures and containerized environments as resources are often constrained and applications scale dynamically.
Essential JVM Metrics You Should Monitor
Heap Memory Usage
The heap is where your Java objects live. Monitoring this area helps you catch memory leaks and determine if you've allocated enough memory.
Key heap metrics to watch:
- Used heap memory: How much memory your application is currently using
- Max heap size: The total memory available to your application
- Eden space usage: Where new objects are created
- Survivor space usage: Where objects go after surviving initial garbage collections
- Old generation usage: Where long-lived objects reside
Tracking heap usage patterns over time helps identify memory leaks, which appear as steadily increasing memory consumption that never plateaus.
// Example of retrieving heap memory metrics (classes from java.lang.management)
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

MemoryMXBean memoryBean = ManagementFactory.getMemoryMXBean();
MemoryUsage heapUsage = memoryBean.getHeapMemoryUsage();
System.out.println("Used Heap: " + heapUsage.getUsed() / (1024 * 1024) + " MB");
System.out.println("Max Heap: " + heapUsage.getMax() / (1024 * 1024) + " MB");
Off-Heap Memory Usage
Off-heap memory refers to memory allocated outside the Java heap but still used by the JVM. Problems here won't trigger standard heap OutOfMemoryErrors but can still crash your application.
Areas to monitor include:
- Direct ByteBuffers: Used for efficient I/O operations
- Memory-mapped files: Used for reading large files
- Native code allocations: Memory used by JNI code
- Metaspace: Where class metadata is stored
Watch for steadily increasing off-heap memory usage, as this often indicates leaks in native memory that standard Java profilers might miss. Excessive off-heap memory can affect overall system performance even when heap metrics look normal.
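If you want a quick in-process look at some of these pools, the platform MXBeans expose direct/mapped buffer usage and Metaspace occupancy. A minimal sketch (pool names can vary slightly between JVM versions):
// Sketch: inspect off-heap buffer pools and Metaspace via platform MXBeans
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

// Direct and memory-mapped buffer pools
for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
    System.out.println(pool.getName() + " used: "
            + pool.getMemoryUsed() / (1024 * 1024) + " MB");
}
// Metaspace is reported as a non-heap memory pool
for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
    if (pool.getName().contains("Metaspace")) {
        System.out.println(pool.getName() + " used: "
                + pool.getUsage().getUsed() / (1024 * 1024) + " MB");
    }
}
Note that native allocations made by JNI code won't show up here; those need OS-level tools or Native Memory Tracking.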
Garbage Collection Metrics
GC metrics help you understand how the JVM manages memory and reveal tuning opportunities. In a healthy application, garbage collection runs efficiently without causing noticeable pauses.
Important GC metrics include:
- GC pause times: How long your application freezes during collection
- GC frequency: How often collections occur
- GC throughput: Percentage of time not spent in GC
- Memory reclaimed per collection: Efficiency of garbage collection
- Collection counts by generation: Distribution of collection activity
A well-tuned JVM typically spends less than 5% of its time on garbage collection. When this percentage rises above 10%, it often indicates memory pressure that requires attention.
| GC Metric | Healthy Range | Warning Signs |
|---|---|---|
| GC Pause Time | <100ms | >500ms pauses |
| Time Spent in GC | <5% | >10% |
| GC Frequency (Young Gen) | Variable | Sudden increases |
| GC Frequency (Old Gen) | Rare | Multiple collections per hour |
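Both the "time spent in GC" figure and collection frequency can be derived from the collectors' cumulative counters. A minimal sketch using GarbageCollectorMXBean:
// Sketch: read cumulative GC counts and times per collector
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
long totalGcTimeMs = 0;
for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
    System.out.println(gc.getName() + ": " + gc.getCollectionCount()
            + " collections, " + gc.getCollectionTime() + " ms total");
    totalGcTimeMs += gc.getCollectionTime();
}
// Rough "time spent in GC" since the JVM started
System.out.printf("Time in GC: %.2f%%%n", 100.0 * totalGcTimeMs / uptimeMs);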
Thread Metrics
Thread metrics help identify concurrency issues, thread leaks, and potential deadlocks. Problems in this area often lead to application freezes and poor responsiveness.
Key thread metrics to monitor:
- Thread count: Total number of threads in your application
- Runnable threads: Threads actively executing or ready to execute
- Blocked threads: Threads waiting for locks
- Waiting threads: Threads waiting for a condition to be met
- Thread CPU usage: CPU consumption by specific threads
A spike in blocked threads often indicates lock contention, while a steadily growing thread count may signal a thread leak. Either situation can eventually lead to performance degradation or application failure.
When troubleshooting thread issues, capture thread dumps during the problematic periods to identify specific synchronization points causing problems.
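ThreadMXBean exposes most of these numbers in-process and can also check for deadlocks programmatically, which makes it useful as a lightweight health probe. A minimal sketch:
// Sketch: thread counts, blocked threads, and deadlock detection
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

ThreadMXBean threads = ManagementFactory.getThreadMXBean();
System.out.println("Live threads: " + threads.getThreadCount());

// BLOCKED spikes usually mean lock contention
for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
    if (info.getThreadState() == Thread.State.BLOCKED) {
        System.out.println("Blocked: " + info.getThreadName());
    }
}

// Returns null when no deadlock is detected
long[] deadlocked = threads.findDeadlockedThreads();
System.out.println("Deadlocked threads: " + (deadlocked == null ? 0 : deadlocked.length));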
JVM Runtime Metrics
These metrics give you insights into the overall health of the JVM and provide context for other, more specific metrics.
Important runtime metrics include:
- CPU usage: How much processor time your Java application uses
- System load average: Overall demand on your server
- Open file descriptors: Track to prevent resource leaks
- Uptime: Duration since JVM started
- JIT compilation time: Efficiency of code optimization
- Class loading metrics: Rate of loading/unloading classes
High CPU usage combined with frequent garbage collection might indicate inefficient code, while high system load with normal JVM CPU usage might suggest resource contention from other processes.
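Several of these figures are also available in-process: uptime and JIT time from the runtime and compilation MXBeans, class counts from ClassLoadingMXBean, and system load from OperatingSystemMXBean. A minimal sketch:
// Sketch: runtime, JIT, class-loading, and system-load figures
import java.lang.management.ManagementFactory;

System.out.println("Uptime (ms): " + ManagementFactory.getRuntimeMXBean().getUptime());
System.out.println("JIT time (ms): " + ManagementFactory.getCompilationMXBean().getTotalCompilationTime());
System.out.println("Loaded classes: " + ManagementFactory.getClassLoadingMXBean().getLoadedClassCount());
System.out.println("Unloaded classes: " + ManagementFactory.getClassLoadingMXBean().getUnloadedClassCount());
// Returns -1 if the load average is unavailable on this platform
System.out.println("System load avg: " + ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage());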
Application-Specific Metrics
Beyond system metrics, instrument your code to track business-relevant measurements that connect technical performance to user experience.
Consider monitoring:
- Request rates: Volume of incoming requests
- Response times: How quickly requests are processed
- Error rates: Frequency of application errors
- Business transaction volumes: Critical user interactions
- Custom metrics: Measurements specific to your application domain
Correlating these application metrics with JVM metrics helps identify how runtime behavior impacts actual user experience. For example, increased response times during garbage collection pauses clearly show how JVM tuning affects users.
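A common way to do this instrumentation is a metrics library such as Micrometer (covered below). The sketch here uses a simple in-memory registry and hypothetical metric names; in a real service you would register a backend-specific registry instead:
// Sketch: custom application metrics with Micrometer (hypothetical metric names)
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

MeterRegistry registry = new SimpleMeterRegistry();

// Record request latency around a unit of work
Timer checkoutTimer = Timer.builder("orders.checkout.latency")
        .description("Time to process a checkout request")
        .register(registry);
checkoutTimer.record(() -> {
    // ... handle the request ...
});

// Count errors so they can be correlated with GC pauses or thread spikes
registry.counter("orders.checkout.errors").increment();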
How to Collect JVM Metrics
There are several ways to access JVM metrics, each with different strengths depending on your needs and environment.
JMX (Java Management Extensions)
JMX is built into the JVM and provides access to a wide range of metrics. It's enabled by setting a few JVM parameters at startup.
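Local tools like JConsole usually need no flags at all, but remote access typically looks something like this (shown without authentication or SSL purely for illustration, with a placeholder jar name and port; lock it down in production):
java -Dcom.sun.management.jmxremote \
     -Dcom.sun.management.jmxremote.port=9010 \
     -Dcom.sun.management.jmxremote.authenticate=false \
     -Dcom.sun.management.jmxremote.ssl=false \
     -jar my-app.jar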
You can access JMX metrics using:
- JConsole: A graphical monitoring tool included with the JDK
- VisualVM: A visual tool for monitoring and troubleshooting Java applications
- Programmatic access in your monitoring code
// Connect to JMX programmatically (local platform MBeanServer shown here;
// a remote MBeanServerConnection can be obtained via JMXConnectorFactory)
MBeanServerConnection mbsc = ManagementFactory.getPlatformMBeanServer();
ObjectName memoryMXBean = new ObjectName("java.lang:type=Memory");
MemoryUsage heapMemoryUsage = MemoryUsage.from(
    (CompositeData) mbsc.getAttribute(memoryMXBean, "HeapMemoryUsage")
);
JMX-based monitoring works well for development and smaller production deployments but may require additional security configuration for remote access.
JVM Command Line Tools
Java comes with several tools for monitoring the JVM that require no additional installation:
jcmd: Multi-purpose diagnostic command tool
jcmd <pid> GC.heap_info
jstack: Generates thread dumps to identify blocking issues
jstack -l <pid> > threads.txt
jmap: Creates heap dumps for memory analysis
jmap -dump:format=b,file=heap.bin <pid>
jstat: Shows garbage collection statistics
jstat -gcutil <pid> 1000
These tools are invaluable for on-demand troubleshooting but aren't designed for continuous monitoring.
Monitoring Frameworks and Agents
For comprehensive production monitoring, dedicated frameworks provide more robust solutions:
- Metrics libraries: Tools like Micrometer, Dropwizard Metrics, or Prometheus Java Client
- Java agents: Automatically collect data without code changes
- APM solutions: Combine metrics with tracing and profiling
Our observability platform, Last9, is built for AI-native teams and uses lightweight agents to collect JVM metrics and provide centralized dashboards, alerts, and analytics. The platform integrates with your existing monitoring stack while offering more advanced correlation capabilities between metrics, logs, and traces.
Interpreting JVM Metrics for Troubleshooting
Collecting metrics is only half the battle – understanding what they tell you about your application is equally important.
Memory Leak Detection
Memory leaks occur when objects remain referenced but are no longer needed, causing memory usage to grow over time.
Signs of a potential memory leak:
- Steadily increasing heap usage that never plateaus
- Growing old generation without full GCs freeing memory
- Increasing GC frequency with diminishing returns
When you suspect a memory leak:
- Take heap dumps at intervals using jmap
- Compare the dumps to identify growing object collections
- Track down the code responsible for creating and retaining those objects
Addressing memory leaks often requires fixing code logic that maintains references to objects that should be released.
Garbage Collection Problems
GC issues can significantly impact application performance through long pauses or excessive CPU usage.
Common GC problems appear as:
- Long GC pauses causing application freezes
- High GC frequency consuming CPU resources
- Low application throughput due to GC overhead
Solutions to consider:
- Adjust heap size and generation ratios
- Switch to a different GC algorithm
- Reduce object allocation rates in hot code paths
- Use concurrent collectors for latency-sensitive applications
Selecting the right garbage collector for your application's needs can dramatically improve performance. Throughput-focused applications benefit from Parallel GC, while latency-sensitive services should consider G1GC or ZGC.
Thread Contention and Deadlocks
Thread-related issues often cause application responsiveness problems or complete freezes.
Signs of thread problems include:
- High numbers of blocked threads
- Growing thread count over time
- CPU underutilization despite high load
To resolve thread issues:
- Take thread dumps during periods of contention
- Identify hot synchronization points
- Consider using concurrent collections or reducing synchronization scope
- Fix any potential deadlock conditions in your code
Thread issues often require code changes to implement more efficient concurrency patterns.
Advanced JVM Tuning Using Metrics
Metrics provide the data needed to optimize JVM configuration for your specific application needs.
Heap Size Optimization
Proper heap sizing is critical for JVM performance. Use memory metrics to guide your decisions:
- Too small a heap leads to frequent GCs and OutOfMemoryErrors
- Too large a heap causes long GC pauses and wastes resources
Start with a reasonable heap size (around 1/4 of available RAM) and adjust based on observed usage patterns. Monitor GC frequency and pause times after changes to confirm improvements.
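Heap size is set with the standard -Xms/-Xmx flags; the values below are illustrative starting points rather than recommendations, and fixing initial and maximum to the same value avoids resize churn:
java -Xms2g -Xmx2g -jar my-app.jar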
Garbage Collector Selection
Different applications have different performance requirements. Choose your collector based on your priorities:
- Throughput-focused applications: Use Parallel GC
- Latency-sensitive services: Consider G1GC or ZGC
- Memory-constrained environments: Try Serial GC
After changing collectors, verify improvements through metrics like GC pause times and throughput.
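On modern HotSpot JVMs the collector choice is a single startup flag (availability varies by Java version):
java -XX:+UseParallelGC ...   # throughput-focused
java -XX:+UseG1GC ...         # balanced, default on recent JDKs
java -XX:+UseZGC ...          # low-latency
java -XX:+UseSerialGC ...     # small footprint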
Thread Pool Tuning
Thread pools need proper sizing based on your application's concurrency needs:
- Too few threads leads to underutilization of resources
- Too many threads creates excessive context switching overhead
Optimize thread pool sizes based on observed utilization rates and response times. A good starting point is setting the maximum pool size to the number of CPU cores plus a small buffer for I/O-bound tasks.
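A minimal sketch of that starting point using the standard executor API; the "+2" buffer is an assumption to tune against your own utilization metrics:
// Sketch: size a fixed pool from available cores plus a small I/O buffer
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

int cores = Runtime.getRuntime().availableProcessors();
int poolSize = cores + 2;  // illustrative buffer for I/O-bound work
ExecutorService pool = Executors.newFixedThreadPool(poolSize);

// Submit work, then shut down cleanly when the application stops
pool.submit(() -> System.out.println("handled on " + Thread.currentThread().getName()));
pool.shutdown();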
JVM Monitoring with Modern Observability Tools
Several monitoring platforms offer comprehensive JVM observability. Here are three notable options:
Last9
Last9 combines metrics, logs, and traces to provide integrated Java monitoring. The platform automatically collects JVM metrics and correlates them with application behavior, helping identify whether performance issues stem from JVM configuration or application code.
Prometheus + Grafana
This popular open-source combination offers robust JVM monitoring capabilities. Prometheus collects time-series metrics via JMX Exporter while Grafana provides customizable visualization dashboards. This stack excels in scalability and flexibility, though it requires more configuration than commercial solutions.
Dynatrace
Dynatrace provides AI-powered JVM monitoring with automatic dependency mapping. Its OneAgent technology captures detailed JVM metrics with minimal configuration, while its Davis AI engine helps identify root causes of performance problems across complex environments.
A Few Best Practices for JVM Monitoring
Set Up Baseline Metrics
Establish what "normal" looks like for your application by collecting metrics during periods of typical usage. This baseline makes it easier to spot abnormal patterns when they emerge.
Document regular patterns in:
- Memory usage cycles
- GC activity frequency
- Thread behavior
- CPU utilization
Revisit these baselines after significant application changes or traffic pattern shifts.
Create Meaningful Alerts
Don't alert on everything. Focus on actionable conditions that require human intervention:
- Sustained high memory usage (>85% of max heap)
- GC pauses exceeding your SLA thresholds
- Thread counts significantly above normal ranges
- Increasing error rates
- Unusual safepoint frequency or duration
Alert thresholds should be based on business impact rather than arbitrary technical values.
Correlate Metrics with Events
Connect changes in metrics with specific events to understand cause and effect:
- Code deployments
- Traffic pattern shifts
- Configuration changes
- Database activity spikes
- External service dependencies
This correlation helps identify what triggered performance changes and guides remediation efforts.
Keep Historical Data
Retain metric history to identify:
- Long-term performance trends
- Cyclical usage patterns
- Gradual degradation issues
- Capacity planning indicators
Historical data proves invaluable when troubleshooting intermittent issues or planning infrastructure changes.
Monitor Across Environments
Implement similar monitoring in development, testing, and production to catch issues earlier in the development cycle. This consistency helps identify environment-specific problems before they affect users.
Implementing JVM Metrics in Your CI/CD Pipeline
Add JVM performance testing to your continuous integration process to catch issues before they reach production.
Key steps include:
- Establish performance baselines with each build. Track metrics for critical operations to detect performance regressions.
- Run load tests that collect JVM metrics. Simulate realistic usage patterns while monitoring JVM behavior.
- Compare results against previous builds. Look for degradation in key metrics like memory usage and response times.
- Fail the pipeline if metrics degrade beyond thresholds. Enforce performance standards just like code quality standards.
This approach identifies performance regressions early, when they're easier and less expensive to fix.
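One lightweight way to enforce that last step is a test that runs a representative workload and fails when a JVM metric exceeds a budget. A rough sketch, where the workload, budget, and threshold are placeholders you would calibrate against your baselines:
// Sketch: fail a build if heap used after a workload exceeds a budget
import java.lang.management.ManagementFactory;

long heapBudgetBytes = 512L * 1024 * 1024;  // illustrative 512 MB budget

runRepresentativeWorkload();                // placeholder for your load scenario
System.gc();                                // request a collection before measuring
long usedAfter = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();

if (usedAfter > heapBudgetBytes) {
    throw new AssertionError("Heap after workload " + usedAfter
            + " bytes exceeds budget " + heapBudgetBytes);
}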
Wrapping Up
JVM metrics aren’t just numbers—they tell the story of how your Java application is running. By keeping an eye on memory usage, thread activity, and garbage collection, you get the context needed to catch problems early and fix them with confidence.
Last9 helps make this easier by bringing metrics, logs, and traces together in one place—so you’re not jumping between tools when things go wrong. Just the data you need, when you need it.
Talk to us to learn more about the platform's capabilities!
FAQs
What's the difference between heap and off-heap memory?
Heap memory stores Java objects and is managed by the garbage collector. Off-heap memory includes metaspace, thread stacks, and direct buffers. Off-heap issues won't trigger heap OutOfMemoryErrors but can still crash your application.
How often should I monitor JVM metrics?
For production environments, collect metrics every 15-30 seconds for baseline monitoring. During troubleshooting, increase frequency to 1-5 seconds to capture transient issues.
Can JVM metrics help identify application bugs?
Yes! Unusual metric patterns often point to application issues. Memory leaks typically stem from application code holding references, while thread deadlocks appear as increasing blocked thread counts.
Which metrics matter most for microservices?
For containerized services, focus on:
- Memory usage relative to container limits
- GC pause times (affecting response times)
- Thread count stability
- CPU utilization per thread
- Request throughput and latency
How do I choose the right garbage collector?
Match your collector to your application needs:
- For consistent low latency: Consider ZGC or Shenandoah
- For maximum throughput: Parallel GC works well
- For balanced performance: G1GC offers a good middle ground
- For memory-constrained environments: Serial GC uses less overhead
How can I export JVM metrics to monitoring systems?
JVM metrics can be exported via JMX, agent-based collection, or instrumentation frameworks like Micrometer. These metrics can then flow to time-series databases or monitoring platforms.
What changes in Kubernetes environments?
In Kubernetes:
- Monitor container resource limits versus JVM settings
- Track pod restarts and their correlation with JVM metrics
- Configure heap settings to respect container memory limits
- Implement a proper graceful shutdown to prevent data loss
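For the heap-versus-container-limit point, a percentage-based setting is usually safer than a hard -Xmx because it follows the pod's memory limit. An illustrative example (placeholder jar name, values to tune):
java -XX:MaxRAMPercentage=75.0 -XX:+ExitOnOutOfMemoryError -jar my-app.jar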