Skip to content
Last9 named a Gartner Cool Vendor in AI for SRE Observability for 2025! Read more →
Last9

Microsoft SQL Server

Monitor Microsoft SQL Server performance, sessions, I/O operations, and logs with OpenTelemetry on Windows for comprehensive database observability

Use OpenTelemetry to monitor Microsoft SQL Server performance and send telemetry data to Last9. This integration provides comprehensive monitoring of SQL Server performance metrics and error logs on Windows, including database I/O, buffer cache statistics, connections, and system resource utilization.

Prerequisites

Before setting up Microsoft SQL Server monitoring, ensure you have:

  • Microsoft SQL Server: SQL Server 2012 or higher installed and running on Windows
  • Windows Server: Windows Server 2016 or higher, or Windows 10/11 for development
  • Administrative Access: SQL Server administrator privileges and Windows admin access
  • Network Connectivity: Outbound HTTPS access to Last9 endpoints
  • Last9 Account: With OpenTelemetry integration credentials
  1. Verify SQL Server Installation

    Ensure SQL Server is installed and running:

    # Check if SQL Server services are running
    Get-Service | Where-Object {$_.Name -like "*SQL*"}
    # Expected services:
    # - MSSQLSERVER (SQL Server Database Engine)
    # - SQLSERVERAGENT (SQL Server Agent)

    If services are not running, start them:

    Start-Service MSSQLSERVER
    Start-Service SQLSERVERAGENT
  2. Configure SQL Server Network Access

    Enable TCP/IP connections for SQL Server:

    1. Open SQL Server Configuration Manager
    2. Navigate to SQL Server Network ConfigurationProtocols for MSSQLSERVER
    3. Right-click TCP/IPEnable
    4. Right-click TCP/IPPropertiesIP Addresses tab
    5. Set TCP Port = 1433 for IPALL
    6. Click OK and restart SQL Server
  3. Enable Mixed Mode Authentication

    Configure SQL Server to allow SQL Server authentication:

    1. Open SQL Server Management Studio (SSMS)
    2. Connect to your SQL Server instance
    3. Right-click server name → Properties
    4. Go to Security tab
    5. Select “SQL Server and Windows Authentication mode”
    6. Click OK and restart SQL Server
  4. Create SQL Server Monitoring User

    Create a dedicated user for OpenTelemetry monitoring with necessary permissions:

    -- Connect using SQL Server Management Studio as administrator
    -- Create login and user
    USE master;
    CREATE LOGIN otel_monitor WITH PASSWORD = 'SecurePassword123!';
    CREATE USER otel_monitor FOR LOGIN otel_monitor;
    -- Grant essential server-level permissions
    GRANT VIEW SERVER STATE TO otel_monitor;
    GRANT VIEW ANY DEFINITION TO otel_monitor;
    ALTER SERVER ROLE ##MS_ServerStateReader## ADD MEMBER otel_monitor;
    -- Grant permissions for SQL Agent job monitoring
    USE msdb;
    CREATE USER otel_monitor FOR LOGIN otel_monitor;
    GRANT SELECT ON dbo.sysjobs TO otel_monitor;
    GRANT SELECT ON dbo.sysjobhistory TO otel_monitor;
    GRANT SELECT ON dbo.sysjobactivity TO otel_monitor;
    -- Test the connection

    Test the monitoring user connection:

    # Test connection with new user
    sqlcmd -S localhost -U otel_monitor -P SecurePassword123! -Q "SELECT @@VERSION"
  5. Configure Environment Variables for Security

    Store sensitive credentials in environment variables:

    Set system-level environment variables (requires Administrator privileges):

    # Set system environment variable
    [Environment]::SetEnvironmentVariable("MSSQL_PASSWORD", "SecurePassword123!", "Machine")
    [Environment]::SetEnvironmentVariable("LAST9_OTLP_ENDPOINT", "your-last9-endpoint", "Machine")
    [Environment]::SetEnvironmentVariable("LAST9_OTLP_AUTH_HEADER", "your-auth-header", "Machine")
    # Verify variables are set
    [Environment]::GetEnvironmentVariable("MSSQL_PASSWORD", "Machine")
  6. Install OpenTelemetry Collector

    Download and install the OpenTelemetry Collector for Windows:

    # Download the MSI installer
    Invoke-WebRequest -Uri "https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.118.0/otelcol-contrib_0.118.0_windows_amd64.msi" -OutFile "otelcol-contrib_0.118.0_windows_amd64.msi"
    # Install using MSI
    msiexec /i otelcol-contrib_0.118.0_windows_amd64.msi /quiet
    # Or double-click the MSI file to install using the GUI

    Alternative download methods if GitHub is unreliable:

    # Using curl with redirect follow
    curl -L -o otelcol-contrib_0.118.0_windows_amd64.msi "https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.118.0/otelcol-contrib_0.118.0_windows_amd64.msi"
    # With extended timeout
    Invoke-WebRequest -Uri "https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.118.0/otelcol-contrib_0.118.0_windows_amd64.msi" -OutFile "otelcol-contrib_0.118.0_windows_amd64.msi" -UseBasicParsing -TimeoutSec 300
  7. Create OpenTelemetry Collector Configuration

    Create the collector configuration file:

    # Create the configuration file
    New-Item -Path "C:\Program Files\OpenTelemetry Collector\config.yaml" -ItemType File -Force
    notepad "C:\Program Files\OpenTelemetry Collector\config.yaml"

    Add the following comprehensive configuration:

    receivers:
    # SQL Server Error Log Collection
    filelog:
    include:
    - 'C:\Program Files\Microsoft SQL Server\MSSQL*.MSSQLSERVER\MSSQL\Log\ERRORLOG*'
    include_file_path: true
    # SQL Server ERRORLOG uses UTF-16 LE encoding
    encoding: utf-16le
    retry_on_failure:
    enabled: true
    # System Metrics Collection
    hostmetrics:
    collection_interval: 60s
    scrapers:
    cpu:
    metrics:
    system.cpu.logical.count:
    enabled: true
    memory:
    metrics:
    system.memory.utilization:
    enabled: true
    system.memory.limit:
    enabled: true
    disk:
    filesystem:
    metrics:
    system.filesystem.utilization:
    enabled: true
    network:
    paging:
    load:
    process:
    mute_process_user_error: true
    mute_process_io_error: true
    mute_process_exe_error: true
    metrics:
    process.cpu.utilization:
    enabled: true
    process.memory.utilization:
    enabled: true
    # SQL Server Performance Metrics
    sqlserver:
    server: localhost
    port: 1433
    username: otel_monitor
    password: ${env:MSSQL_PASSWORD} # Using environment variable
    collection_interval: 60s
    processors:
    batch:
    timeout: 30s
    send_batch_size: 10000
    send_batch_max_size: 10000
    # System resource detection
    resourcedetection/system:
    detectors: ["system"]
    system:
    hostname_sources: ["os"]
    # Cloud resource detection (AWS, Azure, GCP)
    resourcedetection/cloud:
    detectors: ["env", "azure", "aws", "gcp"]
    timeout: 2s
    override: false
    # Log transformation and enrichment
    transform/logs:
    flatten_data: true
    log_statements:
    - context: log
    statements:
    - set(observed_time, Now())
    - set(time_unix_nano, observed_time_unix_nano) where time_unix_nano == 0
    - set(resource.attributes["service.name"], "mssql-server")
    - set(resource.attributes["deployment.environment"], "production")
    - set(resource.attributes["database.system"], "mssql")
    # Metrics transformation and enrichment
    transform/metrics:
    metric_statements:
    - context: datapoint
    statements:
    - set(resource.attributes["service.name"], "mssql-server")
    - set(resource.attributes["deployment.environment"], "production")
    - set(resource.attributes["database.system"], "mssql")
    # Host metrics enrichment
    transform/hostmetrics:
    metric_statements:
    - context: datapoint
    statements:
    - set(attributes["host.name"], resource.attributes["host.name"])
    - set(attributes["cloud.account.id"], resource.attributes["cloud.account.id"])
    - set(attributes["cloud.availability_zone"], resource.attributes["cloud.availability_zone"])
    - set(attributes["cloud.platform"], resource.attributes["cloud.platform"])
    - set(attributes["cloud.provider"], resource.attributes["cloud.provider"])
    - set(attributes["cloud.region"], resource.attributes["cloud.region"])
    exporters:
    otlp/last9:
    endpoint: "${env:LAST9_OTLP_ENDPOINT}"
    headers:
    "Authorization": "${env:LAST9_OTLP_AUTH_HEADER}"
    debug:
    verbosity: detailed
    service:
    pipelines:
    # Log pipeline for SQL Server error logs
    logs:
    receivers: [filelog]
    processors:
    [
    batch,
    resourcedetection/system,
    resourcedetection/cloud,
    transform/logs,
    ]
    exporters: [otlp/last9]
    # Metrics pipeline for SQL Server performance metrics
    metrics/sqlserver:
    receivers: [sqlserver]
    processors:
    [
    batch,
    resourcedetection/system,
    resourcedetection/cloud,
    transform/metrics,
    ]
    exporters: [otlp/last9]
    # Metrics pipeline for host system metrics
    metrics/host:
    receivers: [hostmetrics]
    processors:
    [
    batch,
    resourcedetection/system,
    resourcedetection/cloud,
    transform/hostmetrics,
    ]
    exporters: [otlp/last9]
  8. Configure OpenTelemetry Collector Service

    Configure the Windows service to use the correct configuration:

    # Stop the service if it's running
    Stop-Service otelcol-contrib -ErrorAction SilentlyContinue
    # Configure the service with correct config file path and feature flags
    sc.exe config otelcol-contrib binPath= "`"C:\Program Files\OpenTelemetry Collector\otelcol-contrib.exe`" --config `"C:\Program Files\OpenTelemetry Collector\config.yaml`" --feature-gates transform.flatten.logs"
    # Set service to start automatically
    Set-Service -Name otelcol-contrib -StartupType Automatic
    # Verify service configuration
    sc.exe qc otelcol-contrib

    Important: Note the space after binPath= - this is required by sc.exe syntax!

  9. Start and Verify OpenTelemetry Service

    Start the service and verify it’s working:

    # Start the service
    Start-Service otelcol-contrib
    # Check service status
    Get-Service otelcol-contrib
    # Expected output: Status should be "Running"

    If the service fails to start, run in foreground mode for debugging:

    # Run collector in foreground to see detailed errors
    & "C:\Program Files\OpenTelemetry Collector\otelcol-contrib.exe" --config "C:\Program Files\OpenTelemetry Collector\config.yaml" --feature-gates transform.flatten.logs

Understanding SQL Server Monitoring

The integration collects comprehensive SQL Server telemetry:

SQL Server Performance Metrics

  • Database I/O Statistics: Read/write operations, latency, and throughput per database
  • Buffer Cache Statistics: Buffer cache hit ratio, page life expectancy, memory pressure
  • Batch Requests: Number of batch requests per second (key performance indicator)
  • Active Connections: Current user connections and session counts
  • Lock Statistics: Lock waits, deadlocks, and lock wait time
  • Wait Statistics: Types of waits and their duration (identifying performance bottlenecks)
  • Transaction Metrics: Transactions per second, transaction log usage

SQL Server Error Logs

  • Startup and Shutdown Events: Service lifecycle events
  • Error Messages: Application errors, I/O errors, and system messages
  • Authentication Events: Failed login attempts and security-related events
  • Backup and Restore Operations: Job completion and failure notifications
  • Configuration Changes: Dynamic configuration updates

Host System Metrics

  • CPU Utilization: Processor usage and load from Windows host
  • Memory Usage: Available memory, used memory, and memory pressure
  • Disk I/O: Read/write operations, latency, and throughput
  • Filesystem Metrics: Disk space utilization and available capacity
  • Network Throughput: Bytes sent/received and network errors

Verification and Testing

  1. Check Service Status

    Verify all services are running properly:

    # Check OpenTelemetry service status
    Get-Service otelcol-contrib
    # Check SQL Server service status
    Get-Service MSSQLSERVER, SQLSERVERAGENT
    # All services should show "Running" status
  2. Test SQL Server Connectivity

    Verify the monitoring user can connect to SQL Server:

    # Test connection with monitoring user
    sqlcmd -S localhost -U otel_monitor -P SecurePassword123! -C -Q "SELECT @@VERSION"
    # Should return SQL Server version information without errors
  3. View OpenTelemetry Collector Logs

    Check the collector logs for any issues:

    # View recent logs in Event Viewer
    Get-EventLog -LogName Application -Source otelcol-contrib -Newest 20
    # Or open Event Viewer GUI
    eventvwr.msc
    # Navigate to: Windows Logs > Application > Filter by Source: otelcol-contrib

    What to look for:

    • ✅ “Everything is ready. Begin running and processing data.”
    • ✅ No SQL Server connection errors
    • ✅ No authentication failures
    • ✅ Successful data export to Last9 endpoint
  4. Generate Test Data

    Create test activity to verify monitoring:

    -- Generate a test error log entry
    RAISERROR('OTEL TEST: Verification test entry for Last9', 16, 1) WITH LOG;
  5. Verify Data in Last9

    Log into your Last9 account and verify data is being received:

    1. Navigate to Last9 Dashboard
    2. Check Metrics section for:
      • sqlserver.* metrics (e.g., sqlserver.batch.request.rate)
      • system.* metrics (e.g., system.cpu.utilization)
    3. Check Logs section for:
      • SQL Server ERRORLOG entries
      • Service name: “mssql-server”

    Expected timeline: Data should appear within 1-2 minutes of starting the service.

Key Metrics to Monitor

Critical Performance Indicators

MetricDescriptionAlert Threshold
sqlserver.batch.request.rateBatch requests per secondBaseline ±50%
sqlserver.user.connection.countActive user connections> 80% of max connections
sqlserver.page.buffer_cache.hit_ratioBuffer cache hit percentage< 90%
sqlserver.page.life_expectancyPage life expectancy in seconds< 300 seconds
sqlserver.lock.wait_timeLock wait time in milliseconds> 500ms average

Capacity Planning

MetricDescriptionMonitoring Focus
system.cpu.utilizationCPU usage percentageSustained > 80%
system.memory.utilizationMemory usage percentage> 80% utilization
system.filesystem.utilizationDisk space usage> 85% full
sqlserver.database.io.read_latencyDatabase read latency> 20ms average
sqlserver.database.io.write_latencyDatabase write latency> 10ms average

Health Monitoring

MetricDescriptionAlert Condition
sqlserver.database.countNumber of databasesUnexpected changes
sqlserver.deadlock.countDeadlock occurrences> 0 per hour
sqlserver.error.countSQL Server errorsIncreasing trend
system.disk.io.read_errorsDisk read errors> 0 errors

Troubleshooting

Service Issues

OpenTelemetry Service Won’t Start:

# Check Event Viewer for specific errors
Get-EventLog -LogName Application -Source otelcol-contrib -Newest 10
# Test configuration in foreground mode
& "C:\Program Files\OpenTelemetry Collector\otelcol-contrib.exe" --config "C:\Program Files\OpenTelemetry Collector\config.yaml" --feature-gates transform.flatten.logs
# Verify config file syntax and path
Test-Path "C:\Program Files\OpenTelemetry Collector\config.yaml"

SQL Server Connection Issues

Cannot Connect to SQL Server:

# Verify SQL Server is running
Get-Service MSSQLSERVER
# Check if TCP/IP is enabled and port 1433 is listening
netstat -an | findstr 1433
# Test connection manually
sqlcmd -S localhost -U otel_monitor -P SecurePassword123! -C -Q "SELECT @@VERSION"

Authentication Failures:

-- Verify Mixed Mode authentication is enabled
SELECT SERVERPROPERTY('IsIntegratedSecurityOnly') AS 'Windows Authentication Only';
-- Should return 0 for Mixed Mode
-- Check if monitoring user exists
SELECT name, is_disabled FROM sys.server_principals WHERE name = 'otel_monitor';
-- Verify user permissions
SELECT
p.state_desc,
p.permission_name,
s.name
FROM sys.server_permissions p
LEFT JOIN sys.server_principals s ON p.grantee_principal_id = s.principal_id
WHERE s.name = 'otel_monitor';

Log Collection Issues

SQL Server Logs Not Being Collected:

# Verify ERRORLOG file path and permissions
Get-ChildItem "C:\Program Files\Microsoft SQL Server\MSSQL*.MSSQLSERVER\MSSQL\Log\ERRORLOG*"
# Check if service has read access
icacls "C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\MSSQL\Log"
# Generate test log entry
sqlcmd -S localhost -U otel_monitor -P SecurePassword123! -C -Q "RAISERROR('Test log entry', 16, 1) WITH LOG;"

Network and Firewall Issues

No Data Reaching Last9:

# Test outbound connectivity to Last9
Test-NetConnection -ComputerName your-last9-endpoint-hostname -Port 443
# Check Windows Firewall rules
Get-NetFirewallRule -DisplayName "*SQL*" | Select-Object DisplayName, Enabled, Direction
# Verify environment variables are set
Get-ChildItem Env: | Where-Object {$_.Name -like "*LAST9*"}

Environment Variables Issues

Environment Variables Not Working:

# Verify environment variables are set at Machine level
[Environment]::GetEnvironmentVariable("MSSQL_PASSWORD", "Machine")
[Environment]::GetEnvironmentVariable("LAST9_OTLP_ENDPOINT", "Machine")
# Restart service to reload environment variables
Restart-Service otelcol-contrib
# Test with hardcoded values temporarily (remove after testing)

Advanced Configuration

Multi-Instance Monitoring

Monitor multiple SQL Server instances:

receivers:
sqlserver/instance1:
server: localhost
port: 1433
instance: MSSQLSERVER
username: otel_monitor
password: ${env:MSSQL_PASSWORD}
sqlserver/instance2:
server: localhost
port: 1434
instance: SQL2019
username: otel_monitor
password: ${env:MSSQL_PASSWORD}
service:
pipelines:
metrics/sql1:
receivers: [sqlserver/instance1]
processors: [batch, transform/metrics]
exporters: [otlp/last9]
metrics/sql2:
receivers: [sqlserver/instance2]
processors: [batch, transform/metrics]
exporters: [otlp/last9]

Custom Log Parsing

Parse SQL Server log formats:

processors:
transform/parse_sql_logs:
log_statements:
- context: log
conditions:
- body matches ".*Error.*"
statements:
- set(attributes["log.level"], "ERROR")
- set(attributes["sql.error"], "true")
- context: log
conditions:
- body matches ".*Login failed.*"
statements:
- set(attributes["log.level"], "WARN")
- set(attributes["sql.login_failed"], "true")

Performance Optimization

Optimize for high-volume environments:

receivers:
sqlserver:
collection_interval: 30s # More frequent collection
processors:
batch:
timeout: 10s
send_batch_size: 5000
send_batch_max_size: 8000
memory_limiter:
limit_mib: 512
spike_limit_mib: 128

Best Practices

Security

  • Environment Variables: Store all sensitive credentials in environment variables
  • Least Privilege: Grant monitoring user only necessary permissions
  • Network Security: Use Windows Firewall to restrict SQL Server access
  • Audit Access: Monitor monitoring user access through SQL Server audit logs

Performance

  • Collection Intervals: Balance monitoring frequency with performance impact
  • Resource Limits: Set appropriate memory limits for the collector
  • Batch Processing: Optimize batch sizes for efficient data transmission
  • Log Rotation: Monitor SQL Server ERRORLOG rotation and retention

Monitoring Strategy

  • Baseline Metrics: Establish performance baselines for alerting
  • Multi-Layer Monitoring: Monitor application, database, and system layers
  • Proactive Alerts: Set up alerts for critical metrics before problems occur
  • Capacity Planning: Use trends for proactive capacity planning

Windows Administration

  • Service Management: Configure services for automatic startup and recovery
  • Event Log Monitoring: Regular review of Windows Event Logs
  • System Updates: Keep Windows and SQL Server updated with security patches
  • Backup Strategy: Ensure monitoring doesn’t interfere with backup operations

Need Help?

If you encounter any issues or have questions: