
Apr 16th, '25 / 9 min read

How to Connect ELK Stack with Grafana

Learn how to connect ELK with Grafana to bring logs and dashboards together for better visibility across your systems.


In today’s distributed systems world, you need clear visibility into logs, metrics, and everything in between to keep systems healthy and reliable. That’s where the ELK Stack and Grafana work well together—each solving a different part of the observability puzzle.

ELK handles the heavy lifting of log collection and processing. Grafana adds intuitive dashboards and powerful visualizations. Put them together, and you’ve got a flexible setup that helps you spot issues, track patterns, and stay ahead of outages.

So how do you get them to talk to each other? And what should you watch out for when wiring them up? Let’s break it down.

How ELK Stack and Grafana Complement Each Other

The ELK Stack provides robust log collection and storage capabilities, while Grafana excels at visualization and multi-source dashboards. Understanding how these technologies work together is essential for building an effective monitoring solution.

Core Components of the ELK Stack

The ELK Stack consists of three primary tools working together:

  • Elasticsearch: A distributed search and analytics engine that serves as the central data store. It excels at full-text search and handles time-series data effectively through its inverted index structure. Elasticsearch runs on port 9200 for HTTP requests and 9300 for inter-node communication.
  • Logstash: A data processing pipeline that ingests, transforms, and enriches data before sending it to Elasticsearch. It supports multiple input sources (files, syslog, beats), filters for data transformation, and various output destinations.
  • Kibana: The original visualization layer for Elasticsearch data. Kibana is excellent for log exploration, ad-hoc queries, and text-based analysis through its discover interface and dashboard capabilities.
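To sanity-check those defaults, you can hit Elasticsearch's HTTP API directly. A minimal sketch, assuming a local cluster with security disabled on the default port:

```python
import requests

# Assumes a local Elasticsearch with security disabled on the default
# HTTP port (9200); adjust the URL and add auth for a real cluster.
ES_URL = "http://localhost:9200"

# The root endpoint reports the node name and Elasticsearch version.
info = requests.get(ES_URL).json()
print("version:", info["version"]["number"])

# _cluster/health summarizes shard allocation: green, yellow, or red.
health = requests.get(f"{ES_URL}/_cluster/health").json()
print("cluster status:", health["status"])
```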
💡
If you're still figuring out how ELK, Grafana, and Prometheus fit into the bigger observability picture, this comparison might help: ELK vs Grafana vs Prometheus.

What Grafana Brings to the Table

Grafana complements the ELK Stack by providing:

  • Advanced visualization capabilities with over 30 different panel types
  • Multi-data source dashboards that can combine Elasticsearch data with metrics from Prometheus, InfluxDB, and other sources
  • A robust alerting system with multiple notification channels
  • Template variables for creating dynamic, reusable dashboards
  • Annotation support for correlating events with metric changes

Architecture Patterns for ELK and Grafana Integration

Understanding the typical architecture helps in planning your implementation. Here's how these components typically fit together:

Data Flow in a Combined ELK-Grafana Environment

  1. Collection Layer: Data is gathered from various sources using Filebeat, Metricbeat, or other collectors that ship logs and metrics to Logstash or directly to Elasticsearch.
  2. Processing Layer: Logstash processes the incoming data, parsing logs, extracting fields, and enriching data with additional context.
  3. Storage Layer: Elasticsearch stores the processed data in indices, typically organized by date and data type.
  4. Visualization Layer: Both Kibana and Grafana connect to Elasticsearch, with Kibana handling detailed log exploration and Grafana creating metric-focused dashboards.
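In practice, Beats and Logstash handle the write path, but indexing one document by hand makes the storage-layer conventions concrete. A minimal sketch, assuming a local unsecured cluster and a hypothetical checkout service:

```python
from datetime import datetime, timezone

import requests

ES_URL = "http://localhost:9200"  # assumption: local, unsecured cluster

# Logs conventionally land in date-based indices (e.g. logs-2025.04.16),
# which is what makes time-based retention and rollover practical.
now = datetime.now(timezone.utc)
index = f"logs-{now:%Y.%m.%d}"

doc = {
    "@timestamp": now.isoformat(),
    "level": "error",
    "service": "checkout",  # hypothetical service name
    "message": "payment gateway timeout after 3 retries",
}

resp = requests.post(f"{ES_URL}/{index}/_doc", json=doc)
print(resp.json()["result"])  # "created" on success
```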

When to Use Kibana vs. Grafana

For a well-functioning monitoring setup, use each tool for its strengths:

  • Use Kibana for:
    • Deep log exploration and text searching
    • Ad-hoc investigation of issues
    • Creating visualizations tightly coupled with Elasticsearch features
    • Using Elasticsearch's advanced analysis features
  • Use Grafana for:
    • Creating executive dashboards and overviews
    • Combining data from multiple sources (Elasticsearch, Prometheus, etc.)
    • Advanced alerting requirements
    • Metric-focused visualizations
💡
If you're planning to extend Grafana's capabilities or automate parts of your setup, you might find this guide on the Grafana API a useful starting point.

Establishing Connectivity Between Elasticsearch and Grafana

Setting up the connection between Elasticsearch and Grafana requires careful configuration for security and performance.

Authentication Options and Security Configuration

For a secure integration, consider these authentication methods:

  1. Basic Authentication: Simple username/password authentication, suitable for initial setups but not ideal for production.
  2. API Key Authentication: A more secure approach using generated API keys with specific permissions (an example follows the production checklist below).
  3. Role-Based Access Control: Create dedicated Elasticsearch roles for Grafana with read-only permissions to specific indices.

For production environments, configure:

  • TLS encryption for all connections
  • Dedicated service accounts with minimal permissions
  • Network-level security through firewalls or VPCs
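Tying options 2 and 3 together, the sketch below creates an API key bound to a read-only role through Elasticsearch's _security API. The host, admin credentials, and the logs-* index pattern are placeholders:

```python
import requests

ES_URL = "https://localhost:9200"
ADMIN_AUTH = ("elastic", "changeme")  # placeholder admin credentials

# Role descriptor scoped to read-only access on log indices. "monitor"
# lets Grafana check the cluster version and health; the logs-* pattern
# is an assumption.
body = {
    "name": "grafana-datasource",
    "role_descriptors": {
        "grafana_read": {
            "cluster": ["monitor"],
            "indices": [{
                "names": ["logs-*"],
                "privileges": ["read", "view_index_metadata"],
            }],
        }
    },
}

# verify=False is for self-signed lab certificates only; use a proper CA
# bundle in production.
resp = requests.post(f"{ES_URL}/_security/api_key", json=body,
                     auth=ADMIN_AUTH, verify=False)
creds = resp.json()

# Grafana expects "id:api_key", base64-encoded, in an
# "Authorization: ApiKey ..." header.
print(creds["id"], creds["api_key"])
```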

Optimizing Connection Settings

Key settings to configure in the Grafana Elasticsearch data source:

  • Version: Set the correct Elasticsearch version number
  • Time field: Typically @timestamp for standard ELK setups
  • Min time interval: Set according to your data granularity, often "1m" or "10s"
  • Max concurrent shard requests: Typically 5-10, depending on your cluster size
  • Log message field: The field containing the main message body (often "message")
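These settings can also be applied programmatically through Grafana's data source API rather than the UI. A minimal sketch, with placeholder URLs and credentials and an assumed logs-* index pattern:

```python
import requests

GRAFANA_URL = "http://localhost:3000"
AUTH = ("admin", "admin")  # placeholder Grafana credentials

# Mirrors the settings above; the index pattern and field names are
# assumptions. Recent Grafana versions read the pattern from
# jsonData["index"]; older ones use the top-level "database" field.
datasource = {
    "name": "elasticsearch-logs",
    "type": "elasticsearch",
    "access": "proxy",
    "url": "http://localhost:9200",
    "database": "logs-*",
    "jsonData": {
        "index": "logs-*",
        "timeField": "@timestamp",
        "timeInterval": "10s",  # "Min time interval" in the UI
        "maxConcurrentShardRequests": 5,
        "logMessageField": "message",
    },
}

resp = requests.post(f"{GRAFANA_URL}/api/datasources",
                     json=datasource, auth=AUTH)
print(resp.status_code, resp.json().get("message"))
```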

How to Build Effective Queries for Visualization

Querying Elasticsearch effectively from Grafana requires understanding the query structure and optimization techniques.

Understanding Elasticsearch Query Structure in Grafana

Grafana uses a structured format to query Elasticsearch, with three main components:

  1. Query string: The Lucene query syntax for filtering data
  2. Metrics: The aggregations to perform (count, avg, sum, etc.)
  3. Bucket aggregations: How to group and segment the data

A basic query structure includes:

  • A query filter (e.g., level:error)
  • A metric calculation (e.g., count of documents)
  • Time bucketing (typically date histogram on the timestamp field)
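Under the hood, Grafana turns that structure into an Elasticsearch search request. The sketch below sends roughly the equivalent query directly: a level:error filter, a count metric, and a one-minute date histogram, against an assumed logs-* index pattern:

```python
import requests

ES_URL = "http://localhost:9200"

# Roughly what Grafana sends for: query "level:error", metric Count, and
# date-histogram bucketing on @timestamp.
body = {
    "size": 0,  # aggregation-only; skip returning raw documents
    "query": {
        "bool": {
            "filter": [
                {"query_string": {"query": "level:error"}},
                {"range": {"@timestamp": {"gte": "now-1h", "lte": "now"}}},
            ]
        }
    },
    "aggs": {
        "per_minute": {
            "date_histogram": {"field": "@timestamp", "fixed_interval": "1m"}
        }
    },
}

resp = requests.post(f"{ES_URL}/logs-*/_search", json=body)
for bucket in resp.json()["aggregations"]["per_minute"]["buckets"]:
    print(bucket["key_as_string"], bucket["doc_count"])
```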

Creating Effective Metric Visualizations

For metric visualizations:

  1. Start with the right question: Define what you're trying to measure before building the query.
  2. Choose appropriate aggregations:
    • For counts and rates, use the count metric or derivative
    • For measurements like response time, use average, percentiles, or max
    • For resource usage, average or max are typically appropriate
  3. Add meaningful dimensions: Group by relevant fields like service name, host, or status code to provide context.
  4. Limit cardinality: Be careful with high-cardinality fields (like user IDs or request IDs) as they can cause performance issues.
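Putting points 2 through 4 together, this sketch computes p50/p95/p99 response times grouped by a low-cardinality service field, with the terms size capped to protect the cluster. The field names are hypothetical:

```python
import requests

ES_URL = "http://localhost:9200"

# Response-time percentiles per service over the last hour. Capping the
# terms "size" keeps the aggregation cheap; the field names are assumed.
body = {
    "size": 0,
    "query": {"range": {"@timestamp": {"gte": "now-1h"}}},
    "aggs": {
        "by_service": {
            "terms": {"field": "service.keyword", "size": 10},
            "aggs": {
                "latency": {
                    "percentiles": {
                        "field": "response_time_ms",
                        "percents": [50, 95, 99],
                    }
                }
            },
        }
    },
}

resp = requests.post(f"{ES_URL}/logs-*/_search", json=body)
for bucket in resp.json()["aggregations"]["by_service"]["buckets"]:
    print(bucket["key"], bucket["latency"]["values"])
```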
💡
Creating custom log analytics dashboards in Last9 lets you visualize and monitor log data using aggregated metrics. Check out this guide on how to build and promote log queries into dashboard visualizations.

Dashboard Types for Different Use Cases

Different monitoring needs require specialized dashboard approaches. Here are three essential dashboard types to consider:

Infrastructure Performance Monitoring

An infrastructure dashboard focuses on system-level metrics and includes:

  • CPU, memory, disk, and network utilization across hosts
  • System load averages over time
  • Disk I/O operations and throughput
  • Running processes and system services

Key visualization types:

  • Gauge panels for current resource usage
  • Time series for historical patterns
  • Stat panels for key metrics
  • Heatmaps for distribution analysis

Application Performance Insights

Application monitoring dashboards track service health and performance:

  • Request rates and response times
  • Error rates and types
  • Database query performance
  • Cache hit/miss ratios
  • Service dependencies and interactions

These dashboards benefit from:

  • Time series panels for tracking metrics over time
  • Tables for listing top error types or slow endpoints
  • Stat panels showing current request rates
  • Bar gauges for SLI/SLO tracking

Business Metrics and User Experience

Business-focused dashboards connect technical metrics to user experience:

  • User activity and engagement metrics
  • Conversion rates and funnel visualization
  • Revenue and transaction metrics
  • Feature usage statistics

For these dashboards:

  • Use clear, non-technical language in titles and descriptions
  • Focus on trends and patterns rather than technical details
  • Include annotations for significant business events
  • Set appropriate refresh intervals (usually less frequent than technical dashboards)
💡
Here’s a breakdown to help clarify where OpenTelemetry and ELK fit best in your observability stack: OpenTelemetry vs ELK.

Performance Optimization Strategies for Operations at Scale

Both Elasticsearch and Grafana require optimization for efficient operation at scale.

Elasticsearch Index Management for Optimal Query Performance

Proper index management significantly impacts query performance:

  1. Implement Index Lifecycle Management (ILM), sketched after this list:
    • Hot phase for active writing and querying
    • Warm phase for less frequent queries
    • Cold phase for historical data with minimal querying
    • Delete phase for removing old data
  2. Optimize field mappings:
    • Use keyword fields for exact matching and aggregations
    • Disable indexing on fields not used for searching
    • Apply appropriate numeric field types
  3. Shard management:
    • Size shards appropriately (aim for 20-50GB per shard)
    • Set reasonable replica counts based on your resilience needs
    • Consider time-based index strategies for logs
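Here is the ILM sketch referenced above: a hot-warm-cold-delete policy registered through the _ilm/policy API. Every threshold is illustrative rather than a recommendation:

```python
import requests

ES_URL = "http://localhost:9200"

# Hot -> warm -> cold -> delete lifecycle; all thresholds illustrative.
policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    "rollover": {
                        "max_primary_shard_size": "50gb",
                        "max_age": "1d",
                    }
                }
            },
            "warm": {
                "min_age": "2d",
                "actions": {
                    "shrink": {"number_of_shards": 1},
                    "forcemerge": {"max_num_segments": 1},
                },
            },
            "cold": {"min_age": "14d", "actions": {"readonly": {}}},
            "delete": {"min_age": "30d", "actions": {"delete": {}}},
        }
    }
}

resp = requests.put(f"{ES_URL}/_ilm/policy/logs-default", json=policy)
print(resp.json())  # {"acknowledged": true} on success
```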

Grafana Query Efficiency Techniques

Optimize Grafana's Elasticsearch queries:

  1. Limit time ranges appropriately:
    • Match the time range to the use case
    • Use template variables for time intervals
  2. Filter early:
    • Apply filters in the query rather than post-processing
    • Use Lucene query syntax for efficient filtering
  3. Use appropriate aggregations:
    • Date histograms with reasonable bucket sizes
    • Limit terms aggregations to small cardinality fields
    • Use metrics aggregations instead of document queries when possible
  4. Dashboard optimization:
    • Stagger panel refresh times
    • Use template variables for filtering
    • Consider caching for dashboards with expensive queries
💡
Now, fix production ELK and Grafana log issues instantly—right from your IDE, with AI and Last9 MCP.

Addressing Common Integration Challenges

Several common issues arise when connecting Elasticsearch and Grafana. Here's how to identify and resolve them.

Troubleshooting Data Visibility Issues

When panels show "No data points" or incomplete data:

  1. Check index pattern correctness:
    • Verify the pattern matches your actual indices
    • Ensure indices exist for the selected time range
    • Confirm the Grafana user has read permission on the indices
  2. Verify field mapping:
    • Ensure the time field exists and is properly mapped
    • Check that queried fields exist in the mapping
    • Confirm field types match the query types
  3. Test queries directly:
    • Execute the query in Kibana Dev Tools
    • Check query syntax for errors
    • Verify data exists for the specific time range
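The same checks can be scripted directly against the cluster. This sketch, assuming a local cluster and a logs-* pattern, verifies that matching indices exist, that documents fall inside the time range, and that the time field is mapped as a date:

```python
import requests

ES_URL = "http://localhost:9200"
PATTERN = "logs-*"  # the pattern configured in the Grafana data source

# 1. Do any indices actually match the pattern?
indices = requests.get(
    f"{ES_URL}/_cat/indices/{PATTERN}", params={"format": "json"}
).json()
print("indices:", [i["index"] for i in indices])

# 2. Is there any data inside the dashboard's time range?
count = requests.post(
    f"{ES_URL}/{PATTERN}/_count",
    json={"query": {"range": {"@timestamp": {"gte": "now-6h"}}}},
).json()
print("docs in range:", count["count"])

# 3. Is the time field present and mapped as a date?
mapping = requests.get(
    f"{ES_URL}/{PATTERN}/_mapping/field/@timestamp"
).json()
print(mapping)
```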

How to Resolve Performance Bottlenecks

When experiencing slow dashboards or timeouts:

  1. Identify the slow components:
    • Monitor Elasticsearch query times
    • Check Grafana logs for slow requests
    • Monitor resource usage on all components
  2. Optimize expensive queries:
    • Narrow time ranges
    • Reduce aggregation complexity
    • Limit high-cardinality fields in groupings
  3. Adjust resource allocation:
    • Ensure adequate CPU and memory for Elasticsearch
    • Consider dedicated nodes for query workloads
    • Optimize JVM settings for your data volume

Advanced Integration Patterns

For sophisticated monitoring needs, consider these advanced techniques.

Working with High-Cardinality Data

High-cardinality fields (like user IDs or session IDs) require special handling:

  1. Use sampling techniques:
    • Filter to a representative subset of data
    • Use term aggregations with limited sizes
    • Consider percentile aggregations instead of exact values
  2. Implement field value limits:
    • Set reasonable size limits on terms aggregations
    • Use "order by" to focus on the most significant values
    • Consider composite aggregations for high-cardinality grouping (sketched after this list)
  3. Structural approaches:
    • Use separate indices for high-cardinality data
    • Consider roll-up indices for historical data
    • Implement downsampling for long-term storage
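As an example of the composite aggregation mentioned above, this sketch pages through every value of a hypothetical user_id field instead of holding all buckets in memory at once:

```python
import requests

ES_URL = "http://localhost:9200"

# A terms aggregation keeps only its top-N buckets in memory; a composite
# aggregation pages through all values. The user_id field is hypothetical.
after_key = None
while True:
    composite = {
        "size": 1000,
        "sources": [{"user": {"terms": {"field": "user_id.keyword"}}}],
    }
    if after_key:
        composite["after"] = after_key  # resume from the previous page

    body = {"size": 0, "aggs": {"users": {"composite": composite}}}
    resp = requests.post(f"{ES_URL}/logs-*/_search", json=body)
    agg = resp.json()["aggregations"]["users"]

    for bucket in agg["buckets"]:
        print(bucket["key"]["user"], bucket["doc_count"])

    after_key = agg.get("after_key")  # absent on the final page
    if not after_key:
        break
```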

Correlating Data Across Multiple Sources

One of Grafana's strengths is its ability to correlate data from different sources:

  1. Unified time selection:
    • Ensure consistent time ranges across all panels
    • Use the same timestamp field in all data sources
  2. Shared variables:
    • Create template variables usable across data sources
    • Use consistent naming conventions for common dimensions
  3. Correlation techniques:
    • Add annotations from one source to panels from another
    • Create dashboard links between related views
    • Use row groupings to organize related data
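Annotations are the easiest of these to automate. This sketch posts a deploy marker through Grafana's annotations API; the token, tags, and text are placeholders. Any panel configured to show annotations with those tags will render the marker, whatever its data source:

```python
import time

import requests

GRAFANA_URL = "http://localhost:3000"
HEADERS = {"Authorization": "Bearer <service-account-token>"}  # placeholder

# A global annotation, tagged so related dashboards can display it.
annotation = {
    "time": int(time.time() * 1000),  # Grafana expects epoch milliseconds
    "tags": ["deploy", "checkout"],   # hypothetical tags
    "text": "checkout v2.3.1 rolled out",
}

resp = requests.post(f"{GRAFANA_URL}/api/annotations",
                     json=annotation, headers=HEADERS)
print(resp.status_code, resp.json())
```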

Making Informed Decisions About Observability Tools

When evaluating observability solutions, consider these comparison points:

Comparing ELK and Grafana with Last9

| Feature | ELK + Grafana | Last9 |
|---------|---------------|-------|
| Setup complexity | Medium-high; requires configuration expertise | Low; managed service with guided setup |
| Query capabilities | Powerful but complex query languages | Simplified query interface |
| Visualization options | Extensive visualization types | Standard set of visualizations |
| Data retention control | Complete control, but requires management | Policy-based with reasonable defaults |
| High-cardinality handling | Possible but requires careful design | Purpose-built for high-cardinality data |
| Cost structure | Infrastructure + storage costs | Event-based pricing |
| Scaling complexity | Requires expertise to scale effectively | Managed scaling |

Last9 is designed to handle high cardinality at scale, which can sometimes be challenging with standard ELK configurations.


Conclusion

Integrating ELK Stack with Grafana provides a powerful observability platform by combining Elasticsearch's robust storage and search capabilities with Grafana's advanced visualization and multi-source dashboarding.

While this integration requires careful planning and ongoing optimization, the benefits include comprehensive visibility into your systems, faster troubleshooting through correlated data, and better decision-making based on complete information.

💡
Consider joining our Discord Community to discuss your ELK Grafana setup or share experiences with fellow DevOps and SREs working on similar challenges.

FAQs

How do I handle time zone differences between Elasticsearch and Grafana?

Elasticsearch stores timestamps in UTC. To handle timezone differences:

  • Leave stored timestamps in UTC and let Grafana convert them for display
  • Set the dashboard timezone (or the user preference) to browser time or an explicit zone
  • For query time ranges, remember that filters are interpreted in the configured timezone

Can I migrate visualizations from Kibana to Grafana?

There's no direct migration path, but you can recreate visualizations:

  • Recreate each visualization manually in Grafana
  • Use the Elasticsearch query from Kibana as a starting point
  • Adapt the query syntax to Grafana's structure
  • Consider using Grafana's superior templating to enhance the dashboards

What's the most efficient way to monitor both logs and metrics?

For a comprehensive monitoring approach:

  • Use the ELK Stack for detailed log collection and analysis
  • Use Grafana to create dashboards combining logs and metrics
  • Implement consistent tagging across logs and metrics
  • Create correlation dashboards showing metrics with related log volume
  • Set up alerts based on both logs and metrics for complete coverage

How should I structure my Elasticsearch indices for optimal performance?

For best performance with the ELK and Grafana integration:

  • Use time-based indices with appropriate rollover policies
  • Create separate indices for logs and metrics
  • Consider dedicated indices for high-volume sources
  • Implement index templates with optimized mappings
  • Use ILM policies to manage the index lifecycle
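For instance, a composable index template along these lines applies optimized mappings and an ILM policy to every new log index. The names, mappings, and the logs-default policy are assumptions:

```python
import requests

ES_URL = "http://localhost:9200"

# Composable template for new log indices; names and mappings are
# illustrative. The ILM settings assume a "logs-default" policy (like the
# one sketched earlier) whose hot phase rolls over via the "logs" alias.
template = {
    "index_patterns": ["logs-*"],
    "template": {
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 1,
            "index.lifecycle.name": "logs-default",
            "index.lifecycle.rollover_alias": "logs",
        },
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date"},
                "level": {"type": "keyword"},   # keyword: exact match + aggs
                "service": {"type": "keyword"},
                "message": {"type": "text"},    # text: full-text search
            }
        },
    },
}

resp = requests.put(f"{ES_URL}/_index_template/logs-template", json=template)
print(resp.json())
```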

What retention strategies work best for long-term data storage?

Effective retention strategies include:

  • Hot-warm-cold architecture for tiered storage
  • Rollup indices for long-term metric storage
  • Snapshot and restore for archival purposes
  • Different retention periods based on data importance
  • Sampling strategies for high-volume, lower-priority data
