Microsoft SQL Server
Monitor Microsoft SQL Server performance, sessions, I/O operations, and logs with OpenTelemetry on Windows for comprehensive database observability
Use OpenTelemetry to monitor Microsoft SQL Server performance and send telemetry data to Last9. This integration provides comprehensive monitoring of SQL Server performance metrics and error logs on Windows, including database I/O, buffer cache statistics, connections, and system resource utilization.
Prerequisites
Before setting up Microsoft SQL Server monitoring, ensure you have:
- Microsoft SQL Server: SQL Server 2012 or higher installed and running on Windows
- Windows Server: Windows Server 2016 or higher, or Windows 10/11 for development
- Administrative Access: SQL Server administrator privileges and Windows admin access
- Network Connectivity: Outbound HTTPS access to Last9 endpoints
- Last9 Account: With OpenTelemetry integration credentials
-
Verify SQL Server Installation
Ensure SQL Server is installed and running:
# Check if SQL Server services are runningGet-Service | Where-Object {$_.Name -like "*SQL*"}# Expected services:# - MSSQLSERVER (SQL Server Database Engine)# - SQLSERVERAGENT (SQL Server Agent)If services are not running, start them:
Start-Service MSSQLSERVERStart-Service SQLSERVERAGENT -
Configure SQL Server Network Access
Enable TCP/IP connections for SQL Server:
- Open SQL Server Configuration Manager
- Navigate to SQL Server Network Configuration → Protocols for MSSQLSERVER
- Right-click TCP/IP → Enable
- Right-click TCP/IP → Properties → IP Addresses tab
- Set TCP Port = 1433 for IPALL
- Click OK and restart SQL Server
Configure Windows Firewall and verify network settings:
# Add Windows Firewall rule for SQL Servernetsh advfirewall firewall add rule name="SQL Server" dir=in action=allow protocol=TCP localport=1433# Restart SQL Server servicesRestart-Service MSSQLSERVER# Verify port is listeningnetstat -an | findstr 1433# Should show: TCP 0.0.0.0:1433 0.0.0.0:0 LISTENING -
Enable Mixed Mode Authentication
Configure SQL Server to allow SQL Server authentication:
- Open SQL Server Management Studio (SSMS)
- Connect to your SQL Server instance
- Right-click server name → Properties
- Go to Security tab
- Select “SQL Server and Windows Authentication mode”
- Click OK and restart SQL Server
-- Enable mixed mode authenticationEXEC xp_instance_regwrite N'HKEY_LOCAL_MACHINE',N'Software\Microsoft\MSSQLServer\MSSQLServer',N'LoginMode', REG_DWORD, 2;-- Restart required for changes to take effectAfter running this command, restart SQL Server:
Restart-Service MSSQLSERVER -
Create SQL Server Monitoring User
Create a dedicated user for OpenTelemetry monitoring with necessary permissions:
-- Connect using SQL Server Management Studio as administrator-- Create login and userUSE master;CREATE LOGIN otel_monitor WITH PASSWORD = 'SecurePassword123!';CREATE USER otel_monitor FOR LOGIN otel_monitor;-- Grant essential server-level permissionsGRANT VIEW SERVER STATE TO otel_monitor;GRANT VIEW ANY DEFINITION TO otel_monitor;ALTER SERVER ROLE ##MS_ServerStateReader## ADD MEMBER otel_monitor;-- Grant permissions for SQL Agent job monitoringUSE msdb;CREATE USER otel_monitor FOR LOGIN otel_monitor;GRANT SELECT ON dbo.sysjobs TO otel_monitor;GRANT SELECT ON dbo.sysjobhistory TO otel_monitor;GRANT SELECT ON dbo.sysjobactivity TO otel_monitor;-- Test the connectionTest the monitoring user connection:
# Test connection with new usersqlcmd -S localhost -U otel_monitor -P SecurePassword123! -Q "SELECT @@VERSION" -
Configure Environment Variables for Security
Store sensitive credentials in environment variables:
Set system-level environment variables (requires Administrator privileges):
# Set system environment variable[Environment]::SetEnvironmentVariable("MSSQL_PASSWORD", "SecurePassword123!", "Machine")[Environment]::SetEnvironmentVariable("LAST9_OTLP_ENDPOINT", "your-last9-endpoint", "Machine")[Environment]::SetEnvironmentVariable("LAST9_OTLP_AUTH_HEADER", "your-auth-header", "Machine")# Verify variables are set[Environment]::GetEnvironmentVariable("MSSQL_PASSWORD", "Machine")Set user-level environment variables:
# Set for current user[Environment]::SetEnvironmentVariable("MSSQL_PASSWORD", "SecurePassword123!", "User")[Environment]::SetEnvironmentVariable("LAST9_OTLP_ENDPOINT", "your-last9-endpoint", "User")[Environment]::SetEnvironmentVariable("LAST9_OTLP_AUTH_HEADER", "your-auth-header", "User") -
Install OpenTelemetry Collector
Download and install the OpenTelemetry Collector for Windows:
# Download the MSI installerInvoke-WebRequest -Uri "https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.118.0/otelcol-contrib_0.118.0_windows_amd64.msi" -OutFile "otelcol-contrib_0.118.0_windows_amd64.msi"# Install using MSImsiexec /i otelcol-contrib_0.118.0_windows_amd64.msi /quiet# Or double-click the MSI file to install using the GUIAlternative download methods if GitHub is unreliable:
# Using curl with redirect followcurl -L -o otelcol-contrib_0.118.0_windows_amd64.msi "https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.118.0/otelcol-contrib_0.118.0_windows_amd64.msi"# With extended timeoutInvoke-WebRequest -Uri "https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.118.0/otelcol-contrib_0.118.0_windows_amd64.msi" -OutFile "otelcol-contrib_0.118.0_windows_amd64.msi" -UseBasicParsing -TimeoutSec 300 -
Create OpenTelemetry Collector Configuration
Create the collector configuration file:
# Create the configuration fileNew-Item -Path "C:\Program Files\OpenTelemetry Collector\config.yaml" -ItemType File -Forcenotepad "C:\Program Files\OpenTelemetry Collector\config.yaml"Add the following comprehensive configuration:
receivers:# SQL Server Error Log Collectionfilelog:include:- 'C:\Program Files\Microsoft SQL Server\MSSQL*.MSSQLSERVER\MSSQL\Log\ERRORLOG*'include_file_path: true# SQL Server ERRORLOG uses UTF-16 LE encodingencoding: utf-16leretry_on_failure:enabled: true# System Metrics Collectionhostmetrics:collection_interval: 60sscrapers:cpu:metrics:system.cpu.logical.count:enabled: truememory:metrics:system.memory.utilization:enabled: truesystem.memory.limit:enabled: truedisk:filesystem:metrics:system.filesystem.utilization:enabled: truenetwork:paging:load:process:mute_process_user_error: truemute_process_io_error: truemute_process_exe_error: truemetrics:process.cpu.utilization:enabled: trueprocess.memory.utilization:enabled: true# SQL Server Performance Metricssqlserver:server: localhostport: 1433username: otel_monitorpassword: ${env:MSSQL_PASSWORD} # Using environment variablecollection_interval: 60sprocessors:batch:timeout: 30ssend_batch_size: 10000send_batch_max_size: 10000# System resource detectionresourcedetection/system:detectors: ["system"]system:hostname_sources: ["os"]# Cloud resource detection (AWS, Azure, GCP)resourcedetection/cloud:detectors: ["env", "azure", "aws", "gcp"]timeout: 2soverride: false# Log transformation and enrichmenttransform/logs:flatten_data: truelog_statements:- context: logstatements:- set(observed_time, Now())- set(time_unix_nano, observed_time_unix_nano) where time_unix_nano == 0- set(resource.attributes["service.name"], "mssql-server")- set(resource.attributes["deployment.environment"], "production")- set(resource.attributes["database.system"], "mssql")# Metrics transformation and enrichmenttransform/metrics:metric_statements:- context: datapointstatements:- set(resource.attributes["service.name"], "mssql-server")- set(resource.attributes["deployment.environment"], "production")- set(resource.attributes["database.system"], "mssql")# Host metrics enrichmenttransform/hostmetrics:metric_statements:- context: datapointstatements:- set(attributes["host.name"], resource.attributes["host.name"])- set(attributes["cloud.account.id"], resource.attributes["cloud.account.id"])- set(attributes["cloud.availability_zone"], resource.attributes["cloud.availability_zone"])- set(attributes["cloud.platform"], resource.attributes["cloud.platform"])- set(attributes["cloud.provider"], resource.attributes["cloud.provider"])- set(attributes["cloud.region"], resource.attributes["cloud.region"])exporters:otlp/last9:endpoint: "${env:LAST9_OTLP_ENDPOINT}"headers:"Authorization": "${env:LAST9_OTLP_AUTH_HEADER}"debug:verbosity: detailedservice:pipelines:# Log pipeline for SQL Server error logslogs:receivers: [filelog]processors:[batch,resourcedetection/system,resourcedetection/cloud,transform/logs,]exporters: [otlp/last9]# Metrics pipeline for SQL Server performance metricsmetrics/sqlserver:receivers: [sqlserver]processors:[batch,resourcedetection/system,resourcedetection/cloud,transform/metrics,]exporters: [otlp/last9]# Metrics pipeline for host system metricsmetrics/host:receivers: [hostmetrics]processors:[batch,resourcedetection/system,resourcedetection/cloud,transform/hostmetrics,]exporters: [otlp/last9] -
Configure OpenTelemetry Collector Service
Configure the Windows service to use the correct configuration:
# Stop the service if it's runningStop-Service otelcol-contrib -ErrorAction SilentlyContinue# Configure the service with correct config file path and feature flagssc.exe config otelcol-contrib binPath= "`"C:\Program Files\OpenTelemetry Collector\otelcol-contrib.exe`" --config `"C:\Program Files\OpenTelemetry Collector\config.yaml`" --feature-gates transform.flatten.logs"# Set service to start automaticallySet-Service -Name otelcol-contrib -StartupType Automatic# Verify service configurationsc.exe qc otelcol-contribImportant: Note the space after
binPath=- this is required bysc.exesyntax! -
Start and Verify OpenTelemetry Service
Start the service and verify it’s working:
# Start the serviceStart-Service otelcol-contrib# Check service statusGet-Service otelcol-contrib# Expected output: Status should be "Running"If the service fails to start, run in foreground mode for debugging:
# Run collector in foreground to see detailed errors& "C:\Program Files\OpenTelemetry Collector\otelcol-contrib.exe" --config "C:\Program Files\OpenTelemetry Collector\config.yaml" --feature-gates transform.flatten.logs
Understanding SQL Server Monitoring
The integration collects comprehensive SQL Server telemetry:
SQL Server Performance Metrics
- Database I/O Statistics: Read/write operations, latency, and throughput per database
- Buffer Cache Statistics: Buffer cache hit ratio, page life expectancy, memory pressure
- Batch Requests: Number of batch requests per second (key performance indicator)
- Active Connections: Current user connections and session counts
- Lock Statistics: Lock waits, deadlocks, and lock wait time
- Wait Statistics: Types of waits and their duration (identifying performance bottlenecks)
- Transaction Metrics: Transactions per second, transaction log usage
SQL Server Error Logs
- Startup and Shutdown Events: Service lifecycle events
- Error Messages: Application errors, I/O errors, and system messages
- Authentication Events: Failed login attempts and security-related events
- Backup and Restore Operations: Job completion and failure notifications
- Configuration Changes: Dynamic configuration updates
Host System Metrics
- CPU Utilization: Processor usage and load from Windows host
- Memory Usage: Available memory, used memory, and memory pressure
- Disk I/O: Read/write operations, latency, and throughput
- Filesystem Metrics: Disk space utilization and available capacity
- Network Throughput: Bytes sent/received and network errors
Verification and Testing
-
Check Service Status
Verify all services are running properly:
# Check OpenTelemetry service statusGet-Service otelcol-contrib# Check SQL Server service statusGet-Service MSSQLSERVER, SQLSERVERAGENT# All services should show "Running" status -
Test SQL Server Connectivity
Verify the monitoring user can connect to SQL Server:
# Test connection with monitoring usersqlcmd -S localhost -U otel_monitor -P SecurePassword123! -C -Q "SELECT @@VERSION"# Should return SQL Server version information without errors -
View OpenTelemetry Collector Logs
Check the collector logs for any issues:
# View recent logs in Event ViewerGet-EventLog -LogName Application -Source otelcol-contrib -Newest 20# Or open Event Viewer GUIeventvwr.msc# Navigate to: Windows Logs > Application > Filter by Source: otelcol-contribWhat to look for:
- ✅ “Everything is ready. Begin running and processing data.”
- ✅ No SQL Server connection errors
- ✅ No authentication failures
- ✅ Successful data export to Last9 endpoint
-
Generate Test Data
Create test activity to verify monitoring:
-- Generate a test error log entryRAISERROR('OTEL TEST: Verification test entry for Last9', 16, 1) WITH LOG;-- Create test database activitySELECT COUNT(*) FROM sys.tables;SELECT COUNT(*) FROM sys.databases;SELECT COUNT(*) FROM sys.dm_exec_sessions;-- Check current connectionsSELECTDB_NAME(database_id) as database_name,COUNT(*) as connection_countFROM sys.dm_exec_sessionsWHERE database_id > 0GROUP BY database_id;# Generate failed login event (creates security log entries)sqlcmd -S localhost -U wrong_user -P WrongPassword! -C -
Verify Data in Last9
Log into your Last9 account and verify data is being received:
- Navigate to Last9 Dashboard
- Check Metrics section for:
sqlserver.*metrics (e.g.,sqlserver.batch.request.rate)system.*metrics (e.g.,system.cpu.utilization)
- Check Logs section for:
- SQL Server ERRORLOG entries
- Service name: “mssql-server”
Expected timeline: Data should appear within 1-2 minutes of starting the service.
Key Metrics to Monitor
Critical Performance Indicators
| Metric | Description | Alert Threshold |
|---|---|---|
sqlserver.batch.request.rate | Batch requests per second | Baseline ±50% |
sqlserver.user.connection.count | Active user connections | > 80% of max connections |
sqlserver.page.buffer_cache.hit_ratio | Buffer cache hit percentage | < 90% |
sqlserver.page.life_expectancy | Page life expectancy in seconds | < 300 seconds |
sqlserver.lock.wait_time | Lock wait time in milliseconds | > 500ms average |
Capacity Planning
| Metric | Description | Monitoring Focus |
|---|---|---|
system.cpu.utilization | CPU usage percentage | Sustained > 80% |
system.memory.utilization | Memory usage percentage | > 80% utilization |
system.filesystem.utilization | Disk space usage | > 85% full |
sqlserver.database.io.read_latency | Database read latency | > 20ms average |
sqlserver.database.io.write_latency | Database write latency | > 10ms average |
Health Monitoring
| Metric | Description | Alert Condition |
|---|---|---|
sqlserver.database.count | Number of databases | Unexpected changes |
sqlserver.deadlock.count | Deadlock occurrences | > 0 per hour |
sqlserver.error.count | SQL Server errors | Increasing trend |
system.disk.io.read_errors | Disk read errors | > 0 errors |
Troubleshooting
Service Issues
OpenTelemetry Service Won’t Start:
# Check Event Viewer for specific errorsGet-EventLog -LogName Application -Source otelcol-contrib -Newest 10
# Test configuration in foreground mode& "C:\Program Files\OpenTelemetry Collector\otelcol-contrib.exe" --config "C:\Program Files\OpenTelemetry Collector\config.yaml" --feature-gates transform.flatten.logs
# Verify config file syntax and pathTest-Path "C:\Program Files\OpenTelemetry Collector\config.yaml"SQL Server Connection Issues
Cannot Connect to SQL Server:
# Verify SQL Server is runningGet-Service MSSQLSERVER
# Check if TCP/IP is enabled and port 1433 is listeningnetstat -an | findstr 1433
# Test connection manuallysqlcmd -S localhost -U otel_monitor -P SecurePassword123! -C -Q "SELECT @@VERSION"Authentication Failures:
-- Verify Mixed Mode authentication is enabledSELECT SERVERPROPERTY('IsIntegratedSecurityOnly') AS 'Windows Authentication Only';-- Should return 0 for Mixed Mode
-- Check if monitoring user existsSELECT name, is_disabled FROM sys.server_principals WHERE name = 'otel_monitor';
-- Verify user permissionsSELECT p.state_desc, p.permission_name, s.nameFROM sys.server_permissions p LEFT JOIN sys.server_principals s ON p.grantee_principal_id = s.principal_idWHERE s.name = 'otel_monitor';Log Collection Issues
SQL Server Logs Not Being Collected:
# Verify ERRORLOG file path and permissionsGet-ChildItem "C:\Program Files\Microsoft SQL Server\MSSQL*.MSSQLSERVER\MSSQL\Log\ERRORLOG*"
# Check if service has read accessicacls "C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\MSSQL\Log"
# Generate test log entrysqlcmd -S localhost -U otel_monitor -P SecurePassword123! -C -Q "RAISERROR('Test log entry', 16, 1) WITH LOG;"Network and Firewall Issues
No Data Reaching Last9:
# Test outbound connectivity to Last9Test-NetConnection -ComputerName your-last9-endpoint-hostname -Port 443
# Check Windows Firewall rulesGet-NetFirewallRule -DisplayName "*SQL*" | Select-Object DisplayName, Enabled, Direction
# Verify environment variables are setGet-ChildItem Env: | Where-Object {$_.Name -like "*LAST9*"}Environment Variables Issues
Environment Variables Not Working:
# Verify environment variables are set at Machine level[Environment]::GetEnvironmentVariable("MSSQL_PASSWORD", "Machine")[Environment]::GetEnvironmentVariable("LAST9_OTLP_ENDPOINT", "Machine")
# Restart service to reload environment variablesRestart-Service otelcol-contrib
# Test with hardcoded values temporarily (remove after testing)Advanced Configuration
Multi-Instance Monitoring
Monitor multiple SQL Server instances:
receivers: sqlserver/instance1: server: localhost port: 1433 instance: MSSQLSERVER username: otel_monitor password: ${env:MSSQL_PASSWORD}
sqlserver/instance2: server: localhost port: 1434 instance: SQL2019 username: otel_monitor password: ${env:MSSQL_PASSWORD}
service: pipelines: metrics/sql1: receivers: [sqlserver/instance1] processors: [batch, transform/metrics] exporters: [otlp/last9] metrics/sql2: receivers: [sqlserver/instance2] processors: [batch, transform/metrics] exporters: [otlp/last9]Custom Log Parsing
Parse SQL Server log formats:
processors: transform/parse_sql_logs: log_statements: - context: log conditions: - body matches ".*Error.*" statements: - set(attributes["log.level"], "ERROR") - set(attributes["sql.error"], "true") - context: log conditions: - body matches ".*Login failed.*" statements: - set(attributes["log.level"], "WARN") - set(attributes["sql.login_failed"], "true")Performance Optimization
Optimize for high-volume environments:
receivers: sqlserver: collection_interval: 30s # More frequent collection
processors: batch: timeout: 10s send_batch_size: 5000 send_batch_max_size: 8000
memory_limiter: limit_mib: 512 spike_limit_mib: 128Best Practices
Security
- Environment Variables: Store all sensitive credentials in environment variables
- Least Privilege: Grant monitoring user only necessary permissions
- Network Security: Use Windows Firewall to restrict SQL Server access
- Audit Access: Monitor monitoring user access through SQL Server audit logs
Performance
- Collection Intervals: Balance monitoring frequency with performance impact
- Resource Limits: Set appropriate memory limits for the collector
- Batch Processing: Optimize batch sizes for efficient data transmission
- Log Rotation: Monitor SQL Server ERRORLOG rotation and retention
Monitoring Strategy
- Baseline Metrics: Establish performance baselines for alerting
- Multi-Layer Monitoring: Monitor application, database, and system layers
- Proactive Alerts: Set up alerts for critical metrics before problems occur
- Capacity Planning: Use trends for proactive capacity planning
Windows Administration
- Service Management: Configure services for automatic startup and recovery
- Event Log Monitoring: Regular review of Windows Event Logs
- System Updates: Keep Windows and SQL Server updated with security patches
- Backup Strategy: Ensure monitoring doesn’t interfere with backup operations
Need Help?
If you encounter any issues or have questions:
- Join our Discord community for real-time support
- Contact our support team at support@last9.io