When your MongoDB database slows down, it affects your entire application stack. Performance issues can range from minor inconveniences to major outages, making a solid understanding of MongoDB metrics essential for any DevOps engineer.
This guide covers the key performance metrics you need to monitor in MongoDB, how to interpret what you're seeing, and practical steps to resolve common issues.
MongoDB Architecture for Better Monitoring
Before jumping into metrics, let's quickly review MongoDB's architecture to better understand what we're monitoring. MongoDB uses a document-oriented model where data is stored in flexible JSON-like documents. The core components include:
- mongod: The primary database process that handles data requests and manages data files
- WiredTiger: The default storage engine since MongoDB 3.2, managing how data is stored on disk
- Collections: Similar to tables in relational databases, collections store related documents
- Replica Sets: Groups of mongod instances that maintain identical data copies for redundancy
- Sharding: The method MongoDB uses to horizontally partition data across multiple servers
Each component generates specific metrics that provide insights into database health and performance.
What Makes MongoDB Performance Metrics Different?
MongoDB's document-oriented structure means its performance profile differs significantly from traditional relational databases. Instead of tables and rows, MongoDB stores data in flexible JSON-like documents, which changes how we approach performance monitoring.
The distributed nature of MongoDB also introduces unique metrics around replication lag, shard distribution, and cluster health that SQL databases typically don't have. This requires a specialized approach to performance monitoring.
Critical MongoDB Metrics You Should Track
Let's break down the most critical metrics you should track for optimal MongoDB performance:
Measure Query Performance
Query performance is often the first indicator of database health. Slow queries can bottleneck your entire application. Key metrics include:
- Query execution time: How long queries take to complete
- Query scanning efficiency: Documents scanned vs. returned
- Index usage: Whether queries use indexes effectively
- globalLock metrics: Time operations spend waiting for locks
- currentOp: Currently running operations and their execution time
MongoDB's `explain()` method gives you visibility into how queries execute:
db.collection.find({status: "active"}).explain("executionStats")
This command shows execution time, documents examined, and whether indexes were used—all crucial data points for understanding query efficiency.
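A quick way to act on that output is to compare documents examined against documents returned. The sketch below assumes a plain object shaped like the `executionStats` section of `explain()` output; the sample numbers are illustrative:

```javascript
// Sketch: judge query efficiency from explain("executionStats")-shaped output.
// A well-indexed query examines roughly as many documents as it returns.
function scanEfficiency(stats) {
  const { totalDocsExamined, nReturned } = stats.executionStats;
  return totalDocsExamined === 0 ? 1 : nReturned / totalDocsExamined;
}

// Hypothetical output for an unindexed query: 50,000 docs scanned for 1,000 results.
const explainOutput = {
  executionStats: { nReturned: 1000, totalDocsExamined: 50000, executionTimeMillis: 420 }
};

console.log(scanEfficiency(explainOutput)); // 0.02 → only 2% of scanned docs were returned
```

A ratio far below 1 is the classic signature of a missing or unused index.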
Analyze Connection Patterns
Connection issues can lead to cascading failures in your application. Keep an eye on:
- Current connections: The number of active client connections
- Available connections: How many more connections can MongoDB accept
- Connection creation rate: Sudden spikes may indicate connection leaks
- Active clients: Count of clients with active operations
- Network traffic: Bytes in/out per second to identify bandwidth issues
When connection counts approach your configured limit (default: 65,536), new client requests get rejected, causing application errors that can be hard to diagnose without proper metrics.
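The headroom check is simple arithmetic over the `current` and `available` fields reported by `db.serverStatus().connections`. This sketch mimics those fields with sample numbers; the 80% alerting threshold mentioned later in this guide is a practical choice, not a MongoDB default:

```javascript
// Sketch: flag connection exhaustion risk from serverStatus-style counts.
function connectionPressure({ current, available }) {
  const limit = current + available;            // configured connection ceiling
  return { limit, usedPct: (current / limit) * 100 };
}

// Hypothetical snapshot approaching the default 65,536 ceiling.
const { limit, usedPct } = connectionPressure({ current: 52000, available: 13536 });
console.log(limit, usedPct.toFixed(1)); // 65536 '79.3'
```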
Optimize CPU & Memory Usage
MongoDB's performance depends heavily on having enough CPU and memory resources:
- CPU utilization: MongoDB is CPU-intensive during query execution and sorting
- System context switches: Excessive context switching indicates CPU contention
- Working set size: Data MongoDB needs to keep in RAM
- WiredTiger cache usage: Percentage of the cache being used
- Page faults: When MongoDB needs to fetch data from disk
- Memory fragmentation: Wasted memory due to fragmentation
High page fault rates usually mean your working set doesn't fit in RAM, which dramatically slows performance as MongoDB must read from disk.
Evaluate Disk Performance
Since MongoDB ultimately stores data on disk, I/O performance directly impacts database speed:
- Disk utilization: Percentage of time the disk is busy
- Read/write latency: Time taken for disk operations
- IOPS (Input/Output Operations Per Second): Rate of disk operations
- WiredTiger block manager metrics: File allocation patterns
- Journaling stats: Write-ahead log performance metrics
- Disk queue depth: Number of pending I/O operations
Storage bottlenecks often manifest as high disk utilization with relatively few operations, indicating your disks can't keep up with MongoDB's demands.
Quantify Database Operations
MongoDB's per-operation counters help identify specific bottlenecks:
- Insert/update/delete rates: Volume of write operations
- Read rates: Volume of read operations
- Getmore operations: Cursor operations for retrieving batches of results
- Command operations: Rate of database commands being executed
- Queued operations: Operations waiting to be processed
- Scan and order operations: Queries that require in-memory sorting
Unexpected changes in operation rates can signal application issues or inefficient query patterns that need attention.
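Because `db.serverStatus().opcounters` values are cumulative since startup, spotting rate changes means diffing two snapshots over a known interval. A minimal sketch, with made-up counter values:

```javascript
// Sketch: turn two opcounters snapshots into per-second operation rates.
// opcounters are cumulative, so rates come from deltas over the sample interval.
function opRates(prev, curr, intervalSec) {
  const rates = {};
  for (const op of Object.keys(curr)) {
    rates[op] = (curr[op] - prev[op]) / intervalSec;
  }
  return rates;
}

// Two hypothetical snapshots taken 60 seconds apart.
const t0 = { insert: 10000, query: 50000, update: 8000 };
const t1 = { insert: 10600, query: 53000, update: 8300 };
console.log(opRates(t0, t1, 60)); // { insert: 10, query: 50, update: 5 }
```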
Ensure Replication Health
For replica sets, these metrics are crucial for data consistency:
- Replication lag: Delay between operations on the primary and their application on secondaries
- Oplog window: Time range of operations stored in the oplog
- Replication buffer usage: Memory used for storing operations not yet applied
- Election metrics: Frequency and duration of primary elections
- Heartbeat latency: Time taken for replica set members to communicate
High replication lag can lead to stale reads and potential data loss during failover events.
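Replication lag is the gap between the primary's last applied operation time and each secondary's. This sketch assumes an array of member objects with the `stateStr` and `optimeDate` fields that recent `rs.status()` output exposes; hostnames and timestamps are invented:

```javascript
// Sketch: compute per-secondary replication lag from rs.status()-style members.
function replicationLagSec(members) {
  const primary = members.find(m => m.stateStr === 'PRIMARY');
  return members
    .filter(m => m.stateStr === 'SECONDARY')
    .map(m => ({
      host: m.name,
      lagSec: (primary.optimeDate - m.optimeDate) / 1000  // Date diff is in ms
    }));
}

const status = [
  { name: 'db1:27017', stateStr: 'PRIMARY',   optimeDate: new Date('2024-01-01T00:01:00Z') },
  { name: 'db2:27017', stateStr: 'SECONDARY', optimeDate: new Date('2024-01-01T00:00:55Z') }
];
console.log(replicationLagSec(status)); // [ { host: 'db2:27017', lagSec: 5 } ]
```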
Balance Sharded Clusters
If you're using sharded clusters, monitor these additional metrics:
- Chunk distribution: How evenly the data is distributed across shards
- Balancer activity: Frequency and duration of chunk migrations
- Jumbo chunks: Chunks that exceed the maximum size and can't be split
- Split operations: Rate at which chunks are being split
- Query targeting: Whether queries go to all shards or are targeted correctly
Unbalanced shards can lead to hotspots where some servers work much harder than others.
Deploy Basic MongoDB Monitoring Tools
Before jumping into advanced tools, start with MongoDB's built-in monitoring capabilities:
Run mongostat for Real-time Insights
The `mongostat` utility gives you a real-time view of database operations:
mongostat --port 27017 --authenticationDatabase admin -u username -p password
This command displays metrics like operations per second, memory usage, and connection counts updated every second.
Apply mongotop to Find Hotspots
To see which collections receive the most read/write activity, use `mongotop`:
mongotop --port 27017 --authenticationDatabase admin -u username -p password
This helps identify hot collections that might benefit from additional optimization.
Configure Database Profiling
For deeper insights into slow queries, enable MongoDB's profiler:
db.setProfilingLevel(1, { slowms: 100 })
This logs all operations taking longer than 100ms to the `system.profile` collection, which you can query to find problematic operations:
db.system.profile.find().sort({millis: -1}).limit(10)
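Once you pull those profile documents out, a small summary pass makes the worst offenders obvious. The sketch below stands in for `db.system.profile.find().toArray()` with a hard-coded array; the `op`, `ns`, and `millis` fields match what the profiler records, but the namespaces and timings are invented:

```javascript
// Sketch: find the slowest recorded operation per namespace
// from an array of system.profile-style documents.
function slowestByNamespace(profileDocs) {
  const worst = {};
  for (const doc of profileDocs) {
    if (!worst[doc.ns] || doc.millis > worst[doc.ns].millis) {
      worst[doc.ns] = { op: doc.op, millis: doc.millis };
    }
  }
  return worst;
}

const profileDocs = [
  { op: 'query',  ns: 'shop.orders', millis: 350 },
  { op: 'update', ns: 'shop.orders', millis: 120 },
  { op: 'query',  ns: 'shop.users',  millis: 95 }
];
console.log(slowestByNamespace(profileDocs));
// { 'shop.orders': { op: 'query', millis: 350 }, 'shop.users': { op: 'query', millis: 95 } }
```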
Extract serverStatus Metrics
The `serverStatus` command provides comprehensive metrics about your MongoDB instance:
db.adminCommand('serverStatus')
This returns detailed information about:
- WiredTiger cache statistics
- Connection counts
- Operation counters
- Memory usage
- Replication status
- Global locks
Review this output regularly to spot trends and anomalies.
Advanced Performance Techniques
Once you've mastered the basics, try these advanced techniques to squeeze more performance from your MongoDB deployment:
Design Efficient Indexes
Indexes dramatically improve query performance, but each index adds overhead to write operations. The key is finding the right balance:
- Compound indexes: Create indexes that support multiple query patterns
- Covered queries: Design indexes so queries can be satisfied entirely from the index
- Partial indexes: Index only the documents that match certain criteria
For example, instead of separate indexes on `user_id` and `created_at`, consider a compound index:
db.orders.createIndex({ user_id: 1, created_at: -1 })
This supports queries filtering by user, sorting by date, or both — making it more versatile than two separate indexes.
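The reason one compound index covers both patterns is MongoDB's prefix rule: a query can use a compound index when its fields form a prefix of the index key order. The sketch below is a simplified model of that rule for equality-only predicates (it ignores sorts and range conditions, which the real query planner also considers):

```javascript
// Sketch: can an equality-only query use this compound index?
// Simplified prefix check; the real planner handles sorts and ranges too.
function usesIndexPrefix(indexKeys, queryFields) {
  const wanted = new Set(queryFields);
  let matched = 0;
  for (const key of indexKeys) {
    if (wanted.has(key)) matched++;
    else break;                        // prefix broken at the first unused key
  }
  return matched === wanted.size;
}

const index = ['user_id', 'created_at'];
console.log(usesIndexPrefix(index, ['user_id']));               // true  (prefix)
console.log(usesIndexPrefix(index, ['user_id', 'created_at'])); // true  (full key)
console.log(usesIndexPrefix(index, ['created_at']));            // false (not a prefix)
```

The `false` case is why field order matters when designing compound indexes.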
Structure Your Schema for Speed
Your document structure significantly impacts MongoDB performance:
- Right-size documents: Excessively large documents increase memory usage and network overhead
- Denormalize strategically: Include related data in a single document when it makes sense
- Use appropriate data types: Storing data with the correct type improves index efficiency
Consider this example of a denormalized schema that reduces the need for joins:
// Instead of separate collections, embed the category in the product document
{
  "_id": ObjectId("5f8d"),
  "title": "Widget Pro",
  "price": 29.99,
  "category": {
    "name": "Electronics",
    "tax_rate": 0.07
  }
}
Configure Connection Pools
Properly configured connection pools reduce the overhead of establishing new connections:
- Size pools appropriately: Too small causes queuing, too large wastes resources
- Monitor pool metrics: Track utilization and wait times
- Use separate pools for different workloads: Isolate read-heavy from write-heavy operations
Most MongoDB drivers support connection pooling, but you need to configure it correctly:
// Node.js example (driver 4.x+; older drivers used poolSize instead)
const client = new MongoClient(uri, {
  maxPoolSize: 50,
  maxIdleTimeMS: 30000
});
Solve Common MongoDB Performance Problems
Even with proper monitoring, you'll occasionally face performance challenges. Here's how to diagnose and fix common issues:
Accelerate Slow Queries
When queries run slowly:
- Check if appropriate indexes exist
- Examine query patterns for potential optimizations
- Look for excessive document scanning
For example, if you see a query scanning millions of documents but returning only a few, you likely need an index:
// Before adding an index
db.users.find({status: "active", region: "europe"}).explain("executionStats")
// Shows 1,000,000 documents scanned for 1,000 results
// Add a compound index
db.users.createIndex({status: 1, region: 1})
// After adding the index
db.users.find({status: "active", region: "europe"}).explain("executionStats")
// Shows 1,000 documents scanned for 1,000 results
Reduce Memory Pressure
If you're experiencing high page fault rates:
- Increase available RAM
- Add shards to distribute data
- Implement data archiving strategies
You can check memory usage patterns with:
db.serverStatus().mem
Eliminate Lock Contention
Lock contention occurs when operations wait for access to resources:
- Identify operations causing locks
- Break large operations into smaller batches
- Schedule maintenance during off-peak hours
Monitor lock metrics with:
db.serverStatus().locks
Unify Metrics with Last9 Monitoring
MongoDB's built-in monitoring tools provide valuable data, but they often leave you assembling pieces of the puzzle separately. Last9 brings these fragments together by connecting your MongoDB performance metrics with related traces and logs.
Last9 enhances MongoDB monitoring by:
- Correlating query performance metrics with application transactions
- Tracking WiredTiger cache statistics alongside memory usage patterns
- Visualizing replication lag metrics together with application impact
- Connecting high-latency MongoDB operations with affected user journeys
- Providing historical context for MongoDB metrics to identify performance trends
When troubleshooting MongoDB issues, this unified approach proves invaluable. Instead of jumping between different tools to understand why your queries are slow, Last9 shows the complete picture - from storage engine metrics to application response times.
For instance, when query execution time spikes, Last9 automatically links this MongoDB metric to the corresponding application endpoints. You can instantly see which collections are experiencing lock contention, how the WiredTiger cache is performing, and which user flows are impacted - information that would normally require coordinating data from multiple monitoring systems.
Comparing MongoDB Monitoring Solutions
While Last9 offers comprehensive MongoDB monitoring, let's look at the broader landscape of monitoring options:
Evaluate Built-in MongoDB Tools
MongoDB provides several built-in tools for basic monitoring:
- MongoDB Compass: GUI for exploring data and monitoring performance
- MongoDB Atlas: Cloud service with integrated monitoring dashboards
- Server Status Commands: Direct database commands to retrieve metrics
These native tools are great for quick checks but lack the depth and correlation capabilities needed for production environments.
Use Open Source Options
Several open-source tools can monitor MongoDB:
- Prometheus + MongoDB Exporter: Collects MongoDB metrics in Prometheus format
- Grafana: Creates visualization dashboards for MongoDB metrics
- Percona Monitoring and Management: Specialized for MongoDB performance monitoring
Open-source solutions offer flexibility but require significant setup and maintenance effort.
How to Configure Effective MongoDB Alerts
Instead of alerting on everything, focus on actionable metrics:
Define Actionable Alert Thresholds
Set up alerts for conditions that truly require attention:
- Replication lag exceeding 60 seconds: Indicates potential data consistency issues
- Query execution time above historical baseline: Shows performance degradation
- Connection usage above 80% of maximum: Provides time to address before connections are exhausted
Focus on Percentiles Over Averages
Averages hide problems affecting a subset of operations. Track the 95th and 99th percentiles for more accurate alerts:
- 95th percentile query time > 100ms: Catches slow queries affecting 5% of operations
- 99th percentile write latency > 50ms: Identifies issues affecting 1% of writes
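To see why percentiles beat averages, consider a batch of query times where a single outlier barely moves the mean but dominates tail latency. This sketch uses the nearest-rank method on synthetic data; real monitoring systems may interpolate between ranks instead:

```javascript
// Sketch: nearest-rank percentile, the basis for p95/p99 alert thresholds.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Synthetic sample: 99 query times from 10ms to 108ms, plus one 900ms outlier.
const queryTimesMs = Array.from({ length: 99 }, (_, i) => 10 + i);
queryTimesMs.push(900);

console.log(percentile(queryTimesMs, 95)); // 104
console.log(percentile(queryTimesMs, 99)); // 108
console.log(percentile(queryTimesMs, 100)); // 900 → the tail the average hides
```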
Correlate Related Alerts
Single alerts often don't tell the full story. Last9 helps correlate related alerts to identify root causes:
- Connect high CPU usage with increased query times
- Link memory pressure to specific collection growth
- Correlate network issues with replication lag
This approach reduces noise and helps you focus on solving the underlying problem rather than addressing symptoms.
MongoDB Performance Best Practices
After working with hundreds of MongoDB deployments, here are the best practices that consistently deliver results:
Match Hardware to Workload
Match your infrastructure to your workload:
- RAM: Provision enough to hold your working set (typically 1.1x its size)
- CPU: Scale based on query complexity and concurrency
- Disk: Use SSDs for production MongoDB deployments
Schedule Regular Maintenance
Proactive maintenance prevents performance degradation:
- Compact databases: Run periodic compaction to reclaim space
- Update indexes: Rebuild indexes periodically to reduce fragmentation
- Monitor for fragmentation: Check and address both data and index fragmentation
Simulate Production Loads
Don't wait for production issues—simulate load to find bottlenecks:
- Use tools like `mongoperf` to test disk performance
- Create realistic test data that matches production patterns
- Simulate peak workloads with tools like JMeter or custom load tests
Conclusion
Keeping an eye on a few key MongoDB metrics can go a long way. Focus on what helps you spot slowdowns or odd behavior early. Start simple, adjust as your needs grow, and let real issues guide what you monitor next.
FAQs
How often should I check MongoDB performance metrics?
For production systems, check key metrics at least every 5 minutes. Set up dashboards for real-time monitoring and review daily performance trends to spot gradual degradation.
Which MongoDB metrics are most important for my API service?
Focus on query response times, index usage efficiency, and connection counts. These directly impact API performance. Also, monitor memory usage to ensure your working set fits in RAM.
Can I use the same monitoring approach for MongoDB Atlas?
MongoDB Atlas provides its own monitoring interface, but the same metrics matter. You can also integrate Atlas with Last9 to correlate database performance with your application metrics.
How do I tell if my indexes are effective?
Check `totalKeysExamined` vs. `totalDocsExamined` in query explain plans. Effective indexes show similar numbers for both metrics. Large differences indicate collection scans rather than index usage.
What's the impact of replica set elections on performance?
Primary elections typically cause 5-30 seconds of write unavailability. Monitor `replSetGetStatus` to track election events and measure their impact on your application.
How can I predict when I'll need to scale my MongoDB deployment?
Track growth trends in data size, operation counts, and resource utilization. When any metric consistently exceeds 70% of capacity, it's time to plan for scaling.
What are the key WiredTiger cache metrics to monitor?
Watch the "cache dirty %" (ideally under 10%) and "cache used %" (ideally under 80%). Also monitor "pages read into cache" and "pages written from cache" to understand cache efficiency.
How do I monitor the MongoDB oplog window?
Use `db.getReplicationInfo()` to see the oplog time range. Ensure it's large enough to accommodate your longest expected primary downtime, typically at least 24 hours.
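The window itself is just the span between the oldest and newest oplog entries. This sketch mirrors the first/last timestamps that `db.getReplicationInfo()` reports; the dates are invented, and the 24-hour floor reflects the guidance above rather than anything MongoDB enforces:

```javascript
// Sketch: derive the oplog window from first/last oplog entry timestamps.
function oplogWindowHours(tFirst, tLast) {
  return (tLast - tFirst) / (1000 * 60 * 60);   // ms → hours
}

const windowH = oplogWindowHours(
  new Date('2024-01-01T00:00:00Z'),
  new Date('2024-01-01T18:00:00Z')
);
console.log(windowH, windowH >= 24 ? 'ok' : 'below 24h target'); // 18 'below 24h target'
```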
How do I identify network issues affecting MongoDB?
Monitor network metrics like bytes in/out, network latency between cluster members, and connection errors. Correlate spikes in network latency with query performance degradation.