You're running a WooCommerce store. Checkout latency spiked from 200ms to 800ms after deploying a new payment plugin. Your logs say nothing useful. You have no idea if it's the plugin, a slow database query, or a third-party API call — because WordPress doesn't tell you.
This is the default state of WordPress observability: debug.log, Query Monitor, and guesswork.
OpenTelemetry changes that. A standards-based plugin can instrument every layer of the WordPress stack — PHP execution, database queries, outbound HTTP calls, browser performance, and error events — and export the data as OTLP traces to any compatible backend. This post is a deep-dive into how that plugin works, the engineering decisions behind it, and what it takes to run it in production.
GitHub Repository: https://github.com/last9/last9-wordpress-plugin
Problem Statement
WordPress applications face a specific set of observability challenges that generic APM solutions don't handle well:
- Monolithic execution model: WordPress core, themes, and plugins all run in a single PHP process. Isolating which plugin is responsible for a slowdown requires per-plugin span attribution — something logs can't give you
- Hook system volume: A typical WordPress page fires hundreds of actions and filters. Understanding the performance impact of specific hooks requires tracing, not profiling
- Database query volume: A single page load can execute dozens to hundreds of SQL queries against wpdb. You need per-query timing and statement capture
- Server-to-browser correlation: Modern WordPress sites are JavaScript-heavy. A slow server response and a janky UI are different problems, but they need to be correlated to understand the full user experience
- Security signal: Login failures, wp_die() calls, and auth events are meaningful signals that belong in your telemetry pipeline, not just in a syslog
Traditional logging falls short on all of these. It lacks distributed trace context, browser-side correlation, and the structure needed to query across request boundaries.
Architecture Overview
The plugin follows a modular architecture with five core components:
┌─────────────────────────────────────────────────────────────┐
│                      WordPress Request                      │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                  Last9_OTel (Main Plugin)                   │
│  ┌───────────────────────────────────────────────────────┐  │
│  │               Settings Management Layer               │  │
│  │   (wp-config.php constants → WP options → defaults)   │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
        │              │              │              │
        ▼              ▼              ▼              ▼
   ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐
   │  Tracer  │   │  Error   │   │ Browser  │   │   User   │
   │          │   │ Handler  │   │   RUM    │   │ Context  │
   └──────────┘   └──────────┘   └──────────┘   └──────────┘
        │              │              │              │
        ▼              ▼              ▼              ▼
┌─────────────────────────────────────────────────────────────┐
│                OTLP Exporter (HTTP/Protobuf)                │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
                     ┌──────────────────┐
                     │   OTLP Backend   │
                     └──────────────────┘
- Settings Management (class-last9-otel-settings.php): Hierarchical configuration with three-tier priority
- PHP Tracer (class-last9-otel-tracer.php): Server-side distributed tracing using the OpenTelemetry PHP SDK
- Error Handler (class-last9-otel-error-handler.php): PHP error and exception capture with full context
- Browser Integration (class-last9-otel-browser.php): Client-side RUM using the OpenTelemetry JavaScript SDK
- User Context (class-last9-otel-user-context.php): User identification and authentication event tracking
Core Implementation Deep-Dive
1. Hierarchical Configuration System
One of the more elegant aspects of this plugin is its three-tier configuration hierarchy.
Priority order: wp-config.php constants → WordPress options (database) → defaults
// From class-last9-otel-settings.php
public static function get(string $key, $default = null) {
    // 1. Check wp-config.php constants (highest priority - for secrets)
    if (isset(self::CONSTANT_MAP[$key])) {
        $constant = self::CONSTANT_MAP[$key];
        if (defined($constant)) {
            return constant($constant);
        }
    }
    // 2. Check WordPress options (for user-configured settings)
    $settings = self::get_all();
    if (isset($settings[$key])) {
        return $settings[$key];
    }
    // 3. Return default
    return $default ?? (self::DEFAULTS[$key] ?? null);
}
Why this matters:
- Security: Credentials in wp-config.php never touch the database and aren't exposed via the admin UI
- Environment-specific config: Different OTLP endpoints for dev, staging, and prod without touching the database
- Flexibility: Non-sensitive settings can still be configured via the WordPress admin UI
The plugin also parses authorization headers in multiple formats — Header-Name: Value, Basic <token>, Bearer <token>, or bare base64 — reducing configuration friction.
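To make the format detection concrete, here is a minimal sketch of how the four accepted shapes could be told apart. The function name `example_parse_auth_header` is hypothetical, and treating a bare token as Basic credentials is an assumption for illustration; the plugin's real parser may differ.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not the plugin's code): normalize an auth setting
 * into a [header name => header value] array.
 */
function example_parse_auth_header(string $raw): array
{
    $raw = trim($raw);
    if ($raw === '') {
        return [];
    }

    // "Basic <token>" or "Bearer <token>": send as an Authorization header.
    if (preg_match('/^(Basic|Bearer)\s+(\S+)$/i', $raw, $m) === 1) {
        return ['Authorization' => ucfirst(strtolower($m[1])) . ' ' . $m[2]];
    }

    // "Header-Name: Value": use the name and value as given.
    if (preg_match('/^([A-Za-z0-9-]+):\s*(.+)$/', $raw, $m) === 1) {
        return [$m[1] => $m[2]];
    }

    // Bare token: ASSUMPTION that it is base64 Basic credentials.
    return ['Authorization' => 'Basic ' . $raw];
}
```

The scheme check runs first so that `Basic abc` is never mistaken for a header name. Base64 never contains a colon, so a bare token cannot match the `Header-Name: Value` pattern.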
2. PHP Distributed Tracing
Root Span Creation
Every HTTP request gets a root span that represents the entire request lifecycle:
// From class-last9-otel-tracer.php:start_root_span()
private function start_root_span(): void {
    $request_method = $_SERVER['REQUEST_METHOD'] ?? 'GET';
    $request_uri = $_SERVER['REQUEST_URI'] ?? '/';
    $span_name = $request_method . ' ' . $this->get_route_name($request_uri);
    $this->root_span = $this->tracer->spanBuilder($span_name)
        ->setSpanKind(SpanKind::KIND_SERVER)
        ->setStartTimestamp((int) ($this->request_start * 1_000_000_000))
        ->startSpan();
    $this->root_span->setAttributes([
        TraceAttributes::HTTP_REQUEST_METHOD => $request_method,
        TraceAttributes::URL_PATH => parse_url($request_uri, PHP_URL_PATH),
        TraceAttributes::URL_SCHEME => is_ssl() ? 'https' : 'http',
        TraceAttributes::SERVER_ADDRESS => $_SERVER['SERVER_NAME'] ?? '',
        TraceAttributes::USER_AGENT_ORIGINAL => $_SERVER['HTTP_USER_AGENT'] ?? '',
        'wordpress.site_url' => get_site_url(),
    ]);
    Context::storage()->attach($this->root_span->storeInContext(Context::getCurrent()));
}
Three engineering decisions worth calling out:
- Timestamp precision: Uses REQUEST_TIME_FLOAT to capture the actual request start time before WordPress loads, giving accurate total request duration
- Route normalization: get_route_name() normalizes URLs to prevent cardinality explosion. For example, /wp-admin/post.php?post=123 becomes /wp-admin/*, and REST API paths are truncated to their base
- Semantic conventions: Follows OpenTelemetry HTTP semantic conventions for attribute names, making traces compatible with any OTel-aware tool
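The route normalization idea can be sketched in a few lines. This is not the plugin's get_route_name(); the function name `example_route_name`, the choice of namespace plus version as the REST "base", and the `{id}` placeholder for numeric segments are all assumptions made for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: collapse high-cardinality URLs into a small set of
 * route names that are safe to use in span names.
 */
function example_route_name(string $request_uri): string
{
    $path = parse_url($request_uri, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        return '/';
    }

    // Every admin screen shares one route name.
    if (strpos($path, '/wp-admin/') === 0) {
        return '/wp-admin/*';
    }

    // REST API: keep /wp-json/<namespace>/<version>, wildcard the rest (assumed base).
    if (strpos($path, '/wp-json/') === 0) {
        $segments = explode('/', trim($path, '/'));
        $base = implode('/', array_slice($segments, 0, 3));
        return '/' . $base . (count($segments) > 3 ? '/*' : '');
    }

    // Elsewhere, replace purely numeric path segments with a placeholder.
    return preg_replace('#/\d+(?=/|$)#', '/{id}', $path) ?? $path;
}
```

The query string is dropped by parse_url() before any rule runs, which removes the largest source of unique span names on its own.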
Database Query Instrumentation
The plugin instruments queries through WordPress's built-in query logging hooks, which is non-invasive and requires no wpdb monkey-patching:
// Start span when query begins
public function trace_query(string $query): string {
    $span = $this->tracer->spanBuilder('db.query')
        ->setSpanKind(SpanKind::KIND_CLIENT)
        ->startSpan();
    // First token of the statement, e.g. SELECT, INSERT, UPDATE
    $query_type = strtoupper(strtok(trim($query), ' '));
    $span->setAttributes([
        TraceAttributes::DB_SYSTEM => 'mysql',
        TraceAttributes::DB_OPERATION_NAME => $query_type,
        'db.statement' => $this->sanitize_query($query),
    ]);
    $this->span_stack['query_' . md5($query)] = $span;
    return $query;
}
// End span when query completes
public function trace_query_result($query_data, $query, $query_time, ...) {
    $key = 'query_' . md5($query);
    if (isset($this->span_stack[$key])) {
        $span = $this->span_stack[$key];
        // $query_time is in seconds; store it as milliseconds
        $span->setAttribute('db.query_time_ms', $query_time * 1000);
        $span->end();
        unset($this->span_stack[$key]);
    }
    return $query_data;
}
WordPress passes the measured execution time in as $query_time, so the db.query_time_ms attribute carries the duration WordPress timed around the query, independent of when the two hook callbacks happen to run. The plugin sanitizes queries before export to strip sensitive values.
One known limitation: the MD5 hash approach can fail if the same query executes concurrently. In WordPress's single-threaded PHP model this is extremely rare, but a unique request-scoped ID would be more robust.
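The sanitization step mentioned above is not shown in the excerpt. As a rough sketch of what stripping sensitive values can look like, the function below replaces string and numeric literals with `?`. The name `example_sanitize_query` and the exact rules are assumptions, not the plugin's sanitize_query().

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: replace literal values in a SQL statement with "?"
 * so user data is not exported in db.statement.
 */
function example_sanitize_query(string $query): string
{
    // Single-quoted string literals, allowing backslash-escaped characters.
    $query = preg_replace("/'(?:[^'\\\\]|\\\\.)*'/s", '?', $query) ?? $query;

    // Double-quoted string literals, same escaping rules.
    $query = preg_replace('/"(?:[^"\\\\]|\\\\.)*"/s', '?', $query) ?? $query;

    // Standalone numbers. \b keeps digits inside identifiers such as wp_2_posts.
    $query = preg_replace('/\b\d+(?:\.\d+)?\b/', '?', $query) ?? $query;

    return $query;
}

echo example_sanitize_query(
    "SELECT * FROM wp_users WHERE user_login = 'admin' AND ID = 42"
), PHP_EOL;
// prints: SELECT * FROM wp_users WHERE user_login = ? AND ID = ?
```

A side effect of this kind of rewrite is that queries differing only in their values collapse to the same text, which also reduces the cardinality of the attribute.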
HTTP Client Tracing
WordPress makes outbound HTTP calls for plugin updates, REST API calls, and external integrations. The plugin instruments these using the pre_http_request filter and the http_api_debug action, wrapping the entire call in a span with status code capture and WP_Error detection.
WordPress Hook Tracing (Optional)
Hook tracing is disabled by default. Enabling it creates instantaneous spans for every hook execution, which is useful for debugging but creates significant cardinality. Worth noting: the spans are instantaneous because WordPress doesn't provide a "hook callback completed" event — a more intrusive approach would wrap each callback individually.
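For a sense of what wrapping each callback individually would involve, here is a minimal sketch of a timing wrapper around a single callable. The helper name `example_timed_callback` and the `$record` callback are hypothetical. Swapping the wrapped callables into WordPress's hook registry is the intrusive part and is not shown.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: wrap a hook callback so its duration is reported.
 * $record is called with (string $label, float $milliseconds).
 */
function example_timed_callback(string $label, callable $callback, callable $record): callable
{
    return function (...$args) use ($label, $callback, $record) {
        $start = hrtime(true); // monotonic clock, in nanoseconds
        try {
            return $callback(...$args);
        } finally {
            // Runs even if the callback throws, so the timing is never lost.
            $record($label, (hrtime(true) - $start) / 1e6);
        }
    };
}

// Usage: wrap a filter-style callback and collect timings in an array.
$timings = [];
$wrapped = example_timed_callback(
    'the_title:uppercase',
    static fn (string $title): string => strtoupper($title),
    function (string $label, float $ms) use (&$timings): void {
        $timings[] = [$label, $ms];
    }
);
$result = $wrapped('hello'); // $result is 'HELLO'
```

With a wrapper like this, each hook callback could get a span with a real duration instead of an instantaneous one, at the cost of an extra closure call per callback.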
3. Error Tracking with Full Context
The error handler registers PHP error, exception, and shutdown handlers:
public function init(): void {
    $this->previous_error_handler = set_error_handler([$this, 'handle_error']);
    $this->previous_exception_handler = set_exception_handler([$this, 'handle_exception']);
    register_shutdown_function([$this, 'handle_shutdown']);
    add_action('wp_die_handler', [$this, 'get_wp_die_handler']);
}
Fatal Error Capture
Fatal errors terminate the PHP process — but the shutdown handler runs regardless:
public function handle_shutdown(): void {
    $error = error_get_last();
    if ($error === null) return;
    // Only these error types end the request
    $fatal_types = [E_ERROR, E_PARSE, E_CORE_ERROR, E_COMPILE_ERROR];
    if (!in_array($error['type'], $fatal_types, true)) return;
    $fatal = [
        'type' => self::ERROR_NAMES[$error['type']] ?? 'E_UNKNOWN',
        'message' => $error['message'],
        'file' => $error['file'],
        'line' => $error['line'],
        'severity' => 'fatal',
    ];
    $this->record_error($fatal, true);
}
One production caveat: BatchSpanProcessor may not flush before the process exits on a fatal error. Consider using SimpleSpanProcessor for error spans, or force-flushing in the shutdown handler.
WordPress Context on Every Error
Each error span includes:
- user.id and user.roles (if logged in)
- wordpress.is_admin and wordpress.admin_page (for admin errors)
- wordpress.current_hook (what hook was executing when the error fired)
- Request method and URI
This is the context that makes errors actionable rather than just logged.
wp_die() Integration
WordPress uses wp_die() for permission failures, AJAX errors, and plugin fatal errors. The plugin intercepts the handler to capture these as error spans — including the HTTP response code and error title.
4. Browser RUM with the OpenTelemetry JavaScript SDK
The browser integration uses the official OpenTelemetry JavaScript SDK loaded from CDN in the correct dependency order. This is a deliberate choice: rather than a custom implementation, it leverages the full OTel JS ecosystem — document load instrumentation, user interaction capture, fetch/XHR wrapping — with no bespoke code to maintain.
See Getting Started with OpenTelemetry for Browser Monitoring for how this compares to other browser observability approaches.
Configuration Injection
Configuration is injected via a JSON <script> tag before SDK initialization:
public function inject_config(): void {
    $config = $this->get_browser_config();
    $config_json = wp_json_encode($config,
        JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP);
    echo '<script id="last9-otel-config" type="application/json">'
        . $config_json . '</script>';
}
The JSON_HEX_* flags escape special characters to prevent XSS if configuration contains untrusted input.
Client-Side Initialization
var config = JSON.parse(
    document.getElementById('last9-otel-config').textContent
);
// Client-side sampling before SDK init.
// Math.random() is in [0, 1), so >= guarantees a sampleRate of 0 never samples.
if (Math.random() >= config.sampleRate) return;
var exporter = new OTLPTraceExporter({
    url: config.endpoint,
    headers: config.headers || {}
});
var provider = new WebTracerProvider({ resource: resource });
provider.addSpanProcessor(new BatchSpanProcessor(exporter, {
    maxQueueSize: 100,
    maxExportBatchSize: 50,
    scheduledDelayMillis: 500,
    exportTimeoutMillis: 30000
}));
provider.register();
Sampling happens before SDK initialization, so non-sampled sessions don't pay the JS execution cost. The BatchSpanProcessor settings are conservative: 500ms delay, 50-span batches, 30s export timeout.
Trace Propagation to Backend
The fetch and XHR instrumentations use propagateTraceHeaderCorsUrls: [/.*/] to inject W3C traceparent headers into all outbound requests. When the WordPress backend is also instrumented, browser spans and server spans share a trace ID — giving you a complete picture from click to database query.
For a deeper look at how this context propagation works across service boundaries, the W3C Trace Context spec defines the exact header format.
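The header itself is small enough to build and parse by hand. The sketch below follows the version 00 layout from the W3C Trace Context spec: version, 32-hex trace ID, 16-hex parent span ID, and 2-hex flags, joined by hyphens. The function names are hypothetical, and an instrumented backend would normally rely on the OpenTelemetry SDK's propagator rather than code like this.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: build a version 00 traceparent header value. */
function example_build_traceparent(bool $sampled = true): string
{
    $trace_id  = bin2hex(random_bytes(16)); // 32 lowercase hex characters
    $parent_id = bin2hex(random_bytes(8));  // 16 lowercase hex characters
    return sprintf('00-%s-%s-%s', $trace_id, $parent_id, $sampled ? '01' : '00');
}

/** Hypothetical helper: parse a version 00 traceparent, or return null if invalid. */
function example_parse_traceparent(string $header): ?array
{
    $pattern = '/^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/';
    if (preg_match($pattern, trim($header), $m) !== 1) {
        return null;
    }
    // All-zero IDs are defined as invalid by the spec.
    if ($m[1] === str_repeat('0', 32) || $m[2] === str_repeat('0', 16)) {
        return null;
    }
    return [
        'trace_id'  => $m[1],
        'parent_id' => $m[2],
        'sampled'   => (hexdec($m[3]) & 1) === 1, // lowest flag bit = sampled
    ];
}

$parsed = example_parse_traceparent(
    '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01'
);
// $parsed['trace_id'] is '4bf92f3577b34da6a3ce929d0e0e4736', $parsed['sampled'] is true
```

When the browser and the PHP tracer both honor this header, the server's root span takes the browser span's trace_id and records parent_id as its parent, which is what joins the two halves of the trace.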
Web Vitals Collection
The plugin captures all three Core Web Vitals as OTel spans. The CLS implementation is worth looking at — it accumulates across the page lifetime and reports on pagehide:
var clsValue = 0;
new PerformanceObserver(function (list) {
    list.getEntries().forEach(function (entry) {
        // Shifts that follow recent user input don't count toward CLS
        if (!entry.hadRecentInput) {
            clsValue += entry.value;
        }
    });
}).observe({ type: 'layout-shift', buffered: true });
window.addEventListener('pagehide', function () {
    tracer.startSpan('browser.web_vital.cls', {
        attributes: { 'web_vital.name': 'CLS', 'web_vital.value': clsValue }
    }).end();
});
pagehide fires when the user navigates away or closes the tab, which makes it a dependable point to report the final accumulated CLS value. For more on what to track and why, see RUM Metrics Explained.
The plugin also captures detailed navigation timing — DNS lookup, TCP connect, TTFB, DOM interactive, First Paint, FCP — as attributes on a browser.performance span.
5. User Context and Security Events
User Attribute Enrichment
Each span gets enriched with user context:
public function get_user_attributes(): array {
    if (!is_user_logged_in()) {
        return ['user.authenticated' => false];
    }
    $user = wp_get_current_user();
    return [
        'user.authenticated' => true,
        'user.id' => (string) $user->ID,
        'user.roles' => implode(',', $user->roles),
        'user.capabilities_count' => count(array_filter($user->allcaps)),
    ];
}
GDPR note: user.id and user.name are PII. Use the provided filter to hash or remove them:
add_filter('last9_otel_user_attributes', function($attrs) {
    $attrs['user.id'] = hash('sha256', $attrs['user.id']);
    unset($attrs['user.name']);
    return $attrs;
});
Security Event Tracking
Login failures are tracked with the client IP, username attempted, and error codes:
public function track_login_failed(string $username, \WP_Error $error): void {
    $attributes = [
        'user.attempted_username' => $username,
        'event.outcome' => 'failure',
        'error.code' => implode(',', $error->get_error_codes()),
        'client.address' => $this->get_client_ip(),
    ];
    $this->record_event('user.login_failed', $attributes);
}
Query for high volumes of user.login_failed from the same client.address to detect brute force. Correlate multiple failed usernames from the same IP to detect credential stuffing.
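In practice this analysis would run as a query in your tracing backend. To show the logic itself, here is a small sketch that applies both rules to an in-memory list of events carrying the attributes above. The function name and both thresholds are illustrative assumptions, not recommended values.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: classify client IPs from user.login_failed events.
 * Each event is an array with 'client.address' and 'user.attempted_username'.
 * Returns [ip => 'credential_stuffing' | 'brute_force'].
 */
function example_flag_suspicious_ips(array $events, int $max_failures = 5, int $max_usernames = 3): array
{
    $by_ip = [];
    foreach ($events as $event) {
        $ip = $event['client.address'];
        $by_ip[$ip]['failures'] = ($by_ip[$ip]['failures'] ?? 0) + 1;
        // Array keys give us a distinct set of attempted usernames.
        $by_ip[$ip]['usernames'][$event['user.attempted_username']] = true;
    }

    $flagged = [];
    foreach ($by_ip as $ip => $stats) {
        if (count($stats['usernames']) >= $max_usernames) {
            $flagged[$ip] = 'credential_stuffing'; // many usernames, one source
        } elseif ($stats['failures'] >= $max_failures) {
            $flagged[$ip] = 'brute_force';         // one target, many attempts
        }
    }
    return $flagged;
}
```

A real detection rule would also bound the time window, for example failures per IP over the last five minutes, which the backend's query language typically handles.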
Production Considerations
Performance Impact
Test setup: WordPress 6.4.3, PHP 8.2, 50 database queries per page, 2 HTTP requests per page (1000-request average).
| Configuration | Latency | Memory | Overhead (latency / memory) |
|---|---|---|---|
| No plugin | 245ms | 45MB | baseline |
| Plugin (no sampling) | 253ms | 47MB | +3.3% / +4.4% |
| Plugin (10% sampling) | 246ms | 45MB | +0.4% / +0% |
| Plugin (hooks enabled) | 312ms | 52MB | +27% / +15% |
With sampling enabled, overhead is negligible. Hook tracing should stay disabled in production.
Sampling Strategies
// wp-config.php
define('LAST9_OTEL_TRACE_SAMPLE_RATE', 0.1); // 10% of PHP requests
define('LAST9_OTEL_BROWSER_SAMPLE_RATE', 0.05); // 5% of page views
Start at 100% sampling for low-traffic sites (< 1000 req/min). For high-traffic sites:
- PHP tracing: 10-20%
- Browser RUM: 5-10%
- Errors: 100% (implement separate error sampling only if volume is extreme)
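The decision behind a sample rate is a single comparison. The sketch below shows one way to express it, including the "errors at 100%" rule from the list above. Both function names are hypothetical, and the optional `$roll` parameter exists only so the behavior can be tested deterministically.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: decide whether a request is sampled at the given rate.
 * $roll can be injected for testing; otherwise a uniform value in [0, 1) is drawn.
 */
function example_should_sample(float $rate, ?float $roll = null): bool
{
    $rate = max(0.0, min(1.0, $rate)); // clamp misconfigured values
    $roll = $roll ?? (float) (mt_rand() / (mt_getrandmax() + 1));
    return $roll < $rate;
}

/** Errors bypass sampling entirely; everything else follows the rate. */
function example_should_export(float $rate, bool $has_error, ?float $roll = null): bool
{
    return $has_error || example_should_sample($rate, $roll);
}
```

With a rate of 0.1, a roll of 0.05 is kept and a roll of 0.5 is dropped. A rate of 0.0 never samples and a rate of 1.0 always does, because the roll is strictly less than 1.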
Security Hardening
Always use wp-config.php constants for credentials — never store them in the WordPress options table. Add custom sanitization for domain-specific query patterns:
add_filter('last9_otel_sanitize_query', function($query) {
    return preg_replace("/secret='[^']*'/", "secret='***'", $query);
});
Graceful Degradation
The plugin wraps all OTel operations in try-catch and sets $sdk_available = false on initialization failure. Observability should never take down the site.
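The shape of that guard can be sketched as follows. The class name `ExampleTelemetryGuard` and its methods are hypothetical; only the `$sdk_available` flag and the try-catch behavior come from the description above.

```php
<?php
declare(strict_types=1);

/** Hypothetical guard: telemetry failures are logged, never rethrown. */
final class ExampleTelemetryGuard
{
    private bool $sdk_available = false;

    /** Run the SDK bootstrap once; any failure disables telemetry. */
    public function init(callable $bootstrap): void
    {
        try {
            $bootstrap();
            $this->sdk_available = true;
        } catch (\Throwable $e) {
            $this->sdk_available = false;
            error_log('OTel init failed: ' . $e->getMessage());
        }
    }

    /** Run a telemetry operation, skipping it if the SDK never came up. */
    public function run(callable $operation): void
    {
        if (!$this->sdk_available) {
            return;
        }
        try {
            $operation();
        } catch (\Throwable $e) {
            error_log('OTel operation failed: ' . $e->getMessage());
        }
    }

    public function isAvailable(): bool
    {
        return $this->sdk_available;
    }
}
```

Catching \Throwable rather than \Exception matters here, because a missing class or a type error inside the SDK raises an \Error, which a plain \Exception catch would let through to the page.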
Implementation Patterns
Span Lifecycle via Stack
WordPress's hook system doesn't provide before/after pairs for all operations. The span_stack pattern solves this — start a span in one hook, end it in another, clean up any orphaned spans on shutdown:
// Start: store span by key
$this->span_stack['query_' . md5($query)] = $span;
// End: retrieve and close
if (isset($this->span_stack[$key])) {
    $this->span_stack[$key]->end();
    unset($this->span_stack[$key]);
}
// Shutdown: close any spans that never completed
public function shutdown(): void {
    foreach ($this->span_stack as $span) {
        if ($span instanceof SpanInterface) {
            $span->end();
        }
    }
}
Filter-Based Extensibility
Every significant decision point exposes a WordPress filter:
// Modify resource attributes
add_filter('last9_otel_resource_attributes', function($attributes) {
    $attributes['custom.version'] = '2.0.0';
    return $attributes;
});
// Suppress errors from known-noisy plugins
add_filter('last9_otel_ignore_error', function($ignore, $message, $file) {
    if (strpos($file, 'legacy-plugin') !== false) {
        return true;
    }
    return $ignore;
}, 10, 3);
This follows WordPress's own extensibility philosophy: customize behavior without forking the plugin.
Known Limitations and Future Work
- Outbound trace context injection: The plugin doesn't automatically inject traceparent headers into WordPress HTTP API calls. A filter on http_request_args would enable end-to-end distributed tracing to external services
- Plugin attribution: Current spans don't identify which WordPress plugin triggered them. A backtrace-based approach could add wordpress.plugin as a span attribute, which would help isolate slow plugins
- Query cardinality: Dynamic queries (search, filtered queries) can have high cardinality. Consider parameterizing query text or using fingerprinting
- Metrics support: The plugin exports traces only. Adding OTel metrics (request rate, error rate, latency percentiles) would complete the picture
- Log export: OpenTelemetry's log signal would allow routing WordPress error_log() output as structured OTLP logs, closing the three-pillar loop
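As a sketch of the backtrace-based attribution idea from the list above: walk the call stack and take the first frame whose file lives under a plugin directory. The function names are hypothetical, and the code assumes the default wp-content/plugins layout and the last9-otel directory name used in the deployment steps. Neither is guaranteed on every install.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: extract a plugin slug from a file path, or null. */
function example_plugin_from_path(string $file): ?string
{
    $normalized = str_replace('\\', '/', $file); // tolerate Windows paths
    if (preg_match('#/wp-content/plugins/([^/]+)/#', $normalized, $m) === 1) {
        return $m[1];
    }
    return null;
}

/**
 * Hypothetical helper: find the first plugin in a backtrace, skipping the
 * tracing plugin's own frames. $frames has the shape debug_backtrace() returns.
 */
function example_plugin_from_backtrace(array $frames, string $own_slug = 'last9-otel'): ?string
{
    foreach ($frames as $frame) {
        $slug = isset($frame['file']) ? example_plugin_from_path($frame['file']) : null;
        if ($slug !== null && $slug !== $own_slug) {
            return $slug;
        }
    }
    return null;
}
```

The result could be attached to a span as wordpress.plugin. Calling debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS) for every span has a cost, so this is the kind of attribute worth gating behind a setting.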
Deployment
cd wp-content/plugins/
git clone https://github.com/last9/last9-wordpress-plugin.git last9-otel
cd last9-otel
composer install --no-dev --optimize-autoloader
wp plugin activate last9-otel
Minimal wp-config.php configuration:
define('LAST9_OTEL_ENDPOINT', 'https://otlp.last9.io');
define('LAST9_OTEL_AUTH_HEADER', 'Basic YOUR_CREDENTIALS');
define('LAST9_OTEL_SERVICE_NAME', 'wordpress-production');
define('LAST9_OTEL_ENVIRONMENT', 'production');
define('LAST9_OTEL_TRACE_SAMPLE_RATE', 0.1);
define('LAST9_OTEL_BROWSER_SAMPLE_RATE', 0.05);
define('LAST9_OTEL_ENABLE_HOOK_TRACING', false);
Roll out gradually: staging at 100% → production at 1% → production at 10% → enable browser RUM. Monitor wp-content/debug.log for OTel errors at each stage.
Get Full Observability for Your WordPress Stack
If you're running WordPress at scale, the patterns here give you the instrumentation layer. For the backend, Last9 accepts OTLP directly — no Collector required for basic setups — and handles the cardinality that comes with per-user, per-plugin, and per-query span attributes.
The plugin source is at github.com/last9/last9-wordpress-plugin.
