Your Java app slows down during peak load. A microservice crashes, but logs aren’t helpful. These aren’t rare events—they’re common signs something’s off inside the JVM.
For Java developers and DevOps teams, JVM metrics offer clues to what’s going on. This blog covers the key metrics to track, what they tell you, and how to use them to troubleshoot performance issues in a practical, no-nonsense way.
What Are JVM Metrics and Why Do They Matter?
JVM metrics are performance indicators that reveal how the Java Virtual Machine operates in real time. They show what's happening under the hood of your applications.
These metrics serve as early warning systems for performance issues:
- They help detect bottlenecks before users notice slowdowns
- They provide diagnostic data when applications crash
- They inform capacity planning decisions
- They guide effective JVM tuning efforts
JVM metrics become even more critical in today's microservice architectures and containerized environments as resources are often constrained and applications scale dynamically.
Essential JVM Metrics You Should Monitor
Heap Memory Usage
The heap is where your Java objects live. Monitoring this area helps you catch memory leaks and determine if you've allocated enough memory.
Key heap metrics to watch:
- Used heap memory: How much memory your application is currently using
- Max heap size: The total memory available to your application
- Eden space usage: Where new objects are created
- Survivor space usage: Where objects go after surviving initial garbage collections
- Old generation usage: Where long-lived objects reside
Tracking heap usage patterns over time helps identify memory leaks, which appear as steadily increasing memory consumption that never plateaus.
// Example of retrieving heap memory metrics (classes from java.lang.management)
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

MemoryMXBean memoryBean = ManagementFactory.getMemoryMXBean();
MemoryUsage heapUsage = memoryBean.getHeapMemoryUsage();
System.out.println("Used Heap: " + heapUsage.getUsed() / (1024 * 1024) + " MB");
System.out.println("Max Heap: " + heapUsage.getMax() / (1024 * 1024) + " MB");
Off-Heap Memory Usage
Off-heap memory refers to memory allocated outside the Java heap but still used by the JVM. Problems here won't trigger standard heap OutOfMemoryErrors but can still crash your application.
Areas to monitor include:
- Direct ByteBuffers: Used for efficient I/O operations
- Memory-mapped files: Used for reading large files
- Native code allocations: Memory used by JNI code
- Metaspace: Where class metadata is stored
Watch for steadily increasing off-heap memory usage, as this often indicates leaks in native memory that standard Java profilers might miss. Excessive off-heap memory can affect overall system performance even when heap metrics look normal.
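If you want a quick in-process look at some of these pools, the platform MXBeans expose direct/mapped buffer usage and Metaspace occupancy. A minimal sketch (pool names can vary slightly between JVM versions):
// Sketch: inspect off-heap buffer pools and Metaspace via platform MXBeans
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

// Direct and memory-mapped buffer pools
for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
    System.out.println(pool.getName() + " used: "
            + pool.getMemoryUsed() / (1024 * 1024) + " MB");
}
// Metaspace is reported as a non-heap memory pool
for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
    if (pool.getName().contains("Metaspace")) {
        System.out.println(pool.getName() + " used: "
                + pool.getUsage().getUsed() / (1024 * 1024) + " MB");
    }
}
Note that native allocations made by JNI code won't show up here; those need OS-level tools or Native Memory Tracking.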
Garbage Collection Metrics
GC metrics help you understand how the JVM manages memory and reveal tuning opportunities. In a healthy application, garbage collection runs efficiently without causing noticeable pauses.
Important GC metrics include:
- GC pause times: How long your application freezes during collection
- GC frequency: How often collections occur
- GC throughput: Percentage of time not spent in GC
- Memory reclaimed per collection: Efficiency of garbage collection
- Collection counts by generation: Distribution of collection activity
A well-tuned JVM typically spends less than 5% of its time on garbage collection. When this percentage rises above 10%, it often indicates memory pressure that requires attention.
| GC Metric | Healthy Range | Warning Signs |
|---|---|---|
| GC Pause Time | <100ms | >500ms pauses |
| Time Spent in GC | <5% | >10% |
| GC Frequency (Young Gen) | Variable | Sudden increases |
| GC Frequency (Old Gen) | Rare | Multiple collections per hour |
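Both the "time spent in GC" figure and collection frequency can be derived from the collectors' cumulative counters. A minimal sketch using GarbageCollectorMXBean:
// Sketch: read cumulative GC counts and times per collector
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
long totalGcTimeMs = 0;
for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
    System.out.println(gc.getName() + ": " + gc.getCollectionCount()
            + " collections, " + gc.getCollectionTime() + " ms total");
    totalGcTimeMs += gc.getCollectionTime();
}
// Rough "time spent in GC" since the JVM started
System.out.printf("Time in GC: %.2f%%%n", 100.0 * totalGcTimeMs / uptimeMs);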
Thread Metrics
Thread metrics help identify concurrency issues, thread leaks, and potential deadlocks. Problems in this area often lead to application freezes and poor responsiveness.
Key thread metrics to monitor:
- Thread count: Total number of threads in your application
- Runnable threads: Threads actively executing or ready to execute
- Blocked threads: Threads waiting for locks
- Waiting threads: Threads waiting for a condition to be met
- Thread CPU usage: CPU consumption by specific threads
A spike in blocked threads often indicates lock contention, while a steadily growing thread count may signal a thread leak. Either situation can eventually lead to performance degradation or application failure.
When troubleshooting thread issues, capture thread dumps during the problematic periods to identify specific synchronization points causing problems.
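ThreadMXBean exposes most of these numbers in-process and can also check for deadlocks programmatically, which makes it useful as a lightweight health probe. A minimal sketch:
// Sketch: thread counts, blocked threads, and deadlock detection
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

ThreadMXBean threads = ManagementFactory.getThreadMXBean();
System.out.println("Live threads: " + threads.getThreadCount());

// BLOCKED spikes usually mean lock contention
for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
    if (info.getThreadState() == Thread.State.BLOCKED) {
        System.out.println("Blocked: " + info.getThreadName());
    }
}

// Returns null when no deadlock is detected
long[] deadlocked = threads.findDeadlockedThreads();
System.out.println("Deadlocked threads: " + (deadlocked == null ? 0 : deadlocked.length));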
JVM Runtime Metrics
These metrics give you insights into the overall health of the JVM and provide context for other, more specific metrics.
Important runtime metrics include:
- CPU usage: How much processor time your Java application uses
- System load average: Overall demand on your server
- Open file descriptors: Track to prevent resource leaks
- Uptime: Duration since JVM started
- JIT compilation time: Efficiency of code optimization
- Class loading metrics: Rate of loading/unloading classes
High CPU usage combined with frequent garbage collection might indicate inefficient code, while high system load with normal JVM CPU usage might suggest resource contention from other processes.
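Several of these figures are also available in-process: uptime and JIT time from the runtime and compilation MXBeans, class counts from ClassLoadingMXBean, and system load from OperatingSystemMXBean. A minimal sketch:
// Sketch: runtime, JIT, class-loading, and system-load figures
import java.lang.management.ManagementFactory;

System.out.println("Uptime (ms): " + ManagementFactory.getRuntimeMXBean().getUptime());
System.out.println("JIT time (ms): " + ManagementFactory.getCompilationMXBean().getTotalCompilationTime());
System.out.println("Loaded classes: " + ManagementFactory.getClassLoadingMXBean().getLoadedClassCount());
System.out.println("Unloaded classes: " + ManagementFactory.getClassLoadingMXBean().getUnloadedClassCount());
// Returns -1 if the load average is unavailable on this platform
System.out.println("System load avg: " + ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage());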
Application-Specific Metrics
Beyond system metrics, instrument your code to track business-relevant measurements that connect technical performance to user experience.
Consider monitoring:
- Request rates: Volume of incoming requests
- Response times: How quickly requests are processed
- Error rates: Frequency of application errors
- Business transaction volumes: Critical user interactions
- Custom metrics: Measurements specific to your application domain
Correlating these application metrics with JVM metrics helps identify how runtime behavior impacts actual user experience. For example, increased response times during garbage collection pauses clearly show how JVM tuning affects users.
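A common way to do this instrumentation is a metrics library such as Micrometer (covered below). The sketch here uses a simple in-memory registry and hypothetical metric names; in a real service you would register a backend-specific registry instead:
// Sketch: custom application metrics with Micrometer (hypothetical metric names)
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

MeterRegistry registry = new SimpleMeterRegistry();

// Record request latency around a unit of work
Timer checkoutTimer = Timer.builder("orders.checkout.latency")
        .description("Time to process a checkout request")
        .register(registry);
checkoutTimer.record(() -> {
    // ... handle the request ...
});

// Count errors so they can be correlated with GC pauses or thread spikes
registry.counter("orders.checkout.errors").increment();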
How to Collect JVM Metrics
There are several ways to access JVM metrics, each with different strengths depending on your needs and environment.
JMX (Java Management Extensions)
JMX is built into the JVM and provides access to a wide range of metrics. It's enabled by setting a few JVM parameters at startup.
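Local tools like JConsole usually need no flags at all, but remote access typically looks something like this (shown without authentication or SSL purely for illustration, with a placeholder jar name and port; lock it down in production):
java -Dcom.sun.management.jmxremote \
     -Dcom.sun.management.jmxremote.port=9010 \
     -Dcom.sun.management.jmxremote.authenticate=false \
     -Dcom.sun.management.jmxremote.ssl=false \
     -jar my-app.jar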
You can access JMX metrics using:
- JConsole: A graphical monitoring tool included with the JDK
- VisualVM: A visual tool for monitoring and troubleshooting Java applications
- Programmatic access in your monitoring code
// Connect to JMX programmatically (local platform MBeanServer shown here;
// a remote MBeanServerConnection can be obtained via JMXConnectorFactory)
MBeanServerConnection mbsc = ManagementFactory.getPlatformMBeanServer();
ObjectName memoryMXBean = new ObjectName("java.lang:type=Memory");
MemoryUsage heapMemoryUsage = MemoryUsage.from(
    (CompositeData) mbsc.getAttribute(memoryMXBean, "HeapMemoryUsage")
);
JMX-based monitoring works well for development and smaller production deployments but may require additional security configuration for remote access.
JVM Command Line Tools
Java comes with several tools for monitoring the JVM that require no additional installation:
jcmd: Multi-purpose diagnostic command tool
jcmd <pid> GC.heap_info
jstack: Generates thread dumps to identify blocking issues
jstack -l <pid> > threads.txt
jmap: Creates heap dumps for memory analysis
jmap -dump:format=b,file=heap.bin <pid>
jstat: Shows garbage collection statistics
jstat -gcutil <pid> 1000
These tools are invaluable for on-demand troubleshooting but aren't designed for continuous monitoring.
Monitoring Frameworks and Agents
For comprehensive production monitoring, dedicated frameworks provide more robust solutions:
- Metrics libraries: Tools like Micrometer, Dropwizard Metrics, or Prometheus Java Client
- Java agents: Automatically collect data without code changes
- APM solutions: Combine metrics with tracing and profiling
Our observability platform, Last9, is built for AI-native teams and uses lightweight agents to collect JVM metrics and provide centralized dashboards, alerts, and analytics. The platform integrates with your existing monitoring stack while offering more advanced correlation capabilities between metrics, logs, and traces.
Interpreting JVM Metrics for Troubleshooting
Collecting metrics is only half the battle – understanding what they tell you about your application is equally important.
Memory Leak Detection
Memory leaks occur when objects remain referenced but are no longer needed, causing memory usage to grow over time.
Signs of a potential memory leak:
- Steadily increasing heap usage that never plateaus
- Growing old generation without full GCs freeing memory
- Increasing GC frequency with diminishing returns
When you suspect a memory leak:
- Take heap dumps at intervals using jmap
- Compare the dumps to identify growing object collections
- Track down the code responsible for creating and retaining those objects
Addressing memory leaks often requires fixing code logic that maintains references to objects that should be released.
Garbage Collection Problems
GC issues can significantly impact application performance through long pauses or excessive CPU usage.
Common GC problems appear as:
- Long GC pauses causing application freezes
- High GC frequency consuming CPU resources
- Low application throughput due to GC overhead
Solutions to consider:
- Adjust heap size and generation ratios
- Switch to a different GC algorithm
- Reduce object allocation rates in hot code paths
- Use concurrent collectors for latency-sensitive applications
Selecting the right garbage collector for your application's needs can dramatically improve performance. Throughput-focused applications benefit from Parallel GC, while latency-sensitive services should consider G1GC or ZGC.
Thread Contention and Deadlocks
Thread-related issues often cause application responsiveness problems or complete freezes.
Signs of thread problems include:
- High numbers of blocked threads
- Growing thread count over time
- CPU underutilization despite high load
To resolve thread issues:
- Take thread dumps during periods of contention
- Identify hot synchronization points
- Consider using concurrent collections or reducing synchronization scope
- Fix any potential deadlock conditions in your code
Thread issues often require code changes to implement more efficient concurrency patterns.
Advanced JVM Tuning Using Metrics
Metrics provide the data needed to optimize JVM configuration for your specific application needs.
Heap Size Optimization
Proper heap sizing is critical for JVM performance. Use memory metrics to guide your decisions:
- Too small a heap leads to frequent GCs and OutOfMemoryErrors
- Too large a heap causes long GC pauses and wastes resources
Start with a reasonable heap size (around 1/4 of available RAM) and adjust based on observed usage patterns. Monitor GC frequency and pause times after changes to confirm improvements.
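Heap size is set with the standard -Xms/-Xmx flags; the values below are illustrative starting points rather than recommendations, and fixing initial and maximum to the same value avoids resize churn:
java -Xms2g -Xmx2g -jar my-app.jar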
Garbage Collector Selection
Different applications have different performance requirements. Choose your collector based on your priorities:
- Throughput-focused applications: Use Parallel GC
- Latency-sensitive services: Consider G1GC or ZGC
- Memory-constrained environments: Try Serial GC
After changing collectors, verify improvements through metrics like GC pause times and throughput.
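On modern HotSpot JVMs the collector choice is a single startup flag (availability varies by Java version):
java -XX:+UseParallelGC ...   # throughput-focused
java -XX:+UseG1GC ...         # balanced, default on recent JDKs
java -XX:+UseZGC ...          # low-latency
java -XX:+UseSerialGC ...     # small footprint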
Thread Pool Tuning
Thread pools need proper sizing based on your application's concurrency needs:
- Too few threads leads to underutilization of resources
- Too many threads creates excessive context switching overhead
Optimize thread pool sizes based on observed utilization rates and response times. A good starting point is setting the maximum pool size to the number of CPU cores plus a small buffer for I/O-bound tasks.
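A minimal sketch of that starting point using the standard executor API; the "+2" buffer is an assumption to tune against your own utilization metrics:
// Sketch: size a fixed pool from available cores plus a small I/O buffer
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

int cores = Runtime.getRuntime().availableProcessors();
int poolSize = cores + 2;  // illustrative buffer for I/O-bound work
ExecutorService pool = Executors.newFixedThreadPool(poolSize);

// Submit work, then shut down cleanly when the application stops
pool.submit(() -> System.out.println("handled on " + Thread.currentThread().getName()));
pool.shutdown();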
JVM Monitoring with Modern Observability Tools
Several monitoring platforms offer comprehensive JVM observability. Here are three notable options:
Last9
Last9 combines metrics, logs, and traces to provide integrated Java monitoring. The platform automatically collects JVM metrics and correlates them with application behavior, helping identify whether performance issues stem from JVM configuration or application code.
Prometheus + Grafana
This popular open-source combination offers robust JVM monitoring capabilities. Prometheus collects time-series metrics via JMX Exporter while Grafana provides customizable visualization dashboards. This stack excels in scalability and flexibility, though it requires more configuration than commercial solutions.
Dynatrace
Dynatrace provides AI-powered JVM monitoring with automatic dependency mapping. Its OneAgent technology captures detailed JVM metrics with minimal configuration, while its Davis AI engine helps identify root causes of performance problems across complex environments.
A Few Best Practices for JVM Monitoring
Set Up Baseline Metrics
Establish what "normal" looks like for your application by collecting metrics during periods of typical usage. This baseline makes it easier to spot abnormal patterns when they emerge.
Document regular patterns in:
- Memory usage cycles
- GC activity frequency
- Thread behavior
- CPU utilization
Revisit these baselines after significant application changes or traffic pattern shifts.
Create Meaningful Alerts
Don't alert on everything. Focus on actionable conditions that require human intervention:
- Sustained high memory usage (>85% of max heap)
- GC pauses exceeding your SLA thresholds
- Thread counts significantly above normal ranges
- Increasing error rates
- Unusual safepoint frequency or duration
Alert thresholds should be based on business impact rather than arbitrary technical values.
Correlate Metrics with Events
Connect changes in metrics with specific events to understand cause and effect:
- Code deployments
- Traffic pattern shifts
- Configuration changes
- Database activity spikes
- External service dependencies
This correlation helps identify what triggered performance changes and guides remediation efforts.
Keep Historical Data
Retain metric history to identify:
- Long-term performance trends
- Cyclical usage patterns
- Gradual degradation issues
- Capacity planning indicators
Historical data proves invaluable when troubleshooting intermittent issues or planning infrastructure changes.
Monitor Across Environments
Implement similar monitoring in development, testing, and production to catch issues earlier in the development cycle. This consistency helps identify environment-specific problems before they affect users.
Implementing JVM Metrics in Your CI/CD Pipeline
Add JVM performance testing to your continuous integration process to catch issues before they reach production.
Key steps include:
- Establish performance baselines with each build. Track metrics for critical operations to detect performance regressions.
- Run load tests that collect JVM metrics. Simulate realistic usage patterns while monitoring JVM behavior.
- Compare results against previous builds. Look for degradation in key metrics like memory usage and response times.
- Fail the pipeline if metrics degrade beyond thresholds. Enforce performance standards just like code quality standards.
This approach identifies performance regressions early, when they're easier and less expensive to fix.
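One lightweight way to enforce that last step is a test that runs a representative workload and fails when a JVM metric exceeds a budget. A rough sketch, where the workload, budget, and threshold are placeholders you would calibrate against your baselines:
// Sketch: fail a build if heap used after a workload exceeds a budget
import java.lang.management.ManagementFactory;

long heapBudgetBytes = 512L * 1024 * 1024;  // illustrative 512 MB budget

runRepresentativeWorkload();                // placeholder for your load scenario
System.gc();                                // request a collection before measuring
long usedAfter = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();

if (usedAfter > heapBudgetBytes) {
    throw new AssertionError("Heap after workload " + usedAfter
            + " bytes exceeds budget " + heapBudgetBytes);
}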
Wrapping Up
JVM metrics aren’t just numbers—they tell the story of how your Java application is running. By keeping an eye on memory usage, thread activity, and garbage collection, you get the context needed to catch problems early and fix them with confidence.
Last9 helps make this easier by bringing metrics, logs, and traces together in one place—so you’re not jumping between tools when things go wrong. Just the data you need, when you need it.
Talk to us to learn more about the platform's capabilities!
FAQs
What's the difference between heap and off-heap memory?
Heap memory stores Java objects and is managed by the garbage collector. Off-heap memory includes metaspace, thread stacks, and direct buffers. Off-heap issues won't trigger heap OutOfMemoryErrors but can still crash your application.
How often should I monitor JVM metrics?
For production environments, collect metrics every 15-30 seconds for baseline monitoring. During troubleshooting, increase frequency to 1-5 seconds to capture transient issues.
Can JVM metrics help identify application bugs?
Yes! Unusual metric patterns often point to application issues. Memory leaks typically stem from application code holding references, while thread deadlocks appear as increasing blocked thread counts.
Which metrics matter most for microservices?
For containerized services, focus on:
- Memory usage relative to container limits
- GC pause times (affecting response times)
- Thread count stability
- CPU utilization per thread
- Request throughput and latency
How do I choose the right garbage collector?
Match your collector to your application needs:
- For consistent low latency: Consider ZGC or Shenandoah
- For maximum throughput: Parallel GC works well
- For balanced performance: G1GC offers a good middle ground
- For memory-constrained environments: Serial GC uses less overhead
How can I export JVM metrics to monitoring systems?
JVM metrics can be exported via JMX, agent-based collection, or instrumentation frameworks like Micrometer. These metrics can then flow to time-series databases or monitoring platforms.
What changes in Kubernetes environments?
In Kubernetes:
- Monitor container resource limits versus JVM settings
- Track pod restarts and their correlation with JVM metrics
- Configure heap settings to respect container memory limits
- Implement a proper graceful shutdown to prevent data loss
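For the heap-versus-container-limit point, a percentage-based setting is usually safer than a hard -Xmx because it follows the pod's memory limit. An illustrative example (placeholder jar name, values to tune):
java -XX:MaxRAMPercentage=75.0 -XX:+ExitOnOutOfMemoryError -jar my-app.jar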